TWI706676B - Multi-camera monitoring and tracking system and method thereof - Google Patents


Info

Publication number
TWI706676B
TWI706676B
Authority
TW
Taiwan
Prior art keywords
tracking
value
video
humanoid
image
Prior art date
Application number
TW108123170A
Other languages
Chinese (zh)
Other versions
TW202103488A (en)
Inventor
李育翰
Original Assignee
微星科技股份有限公司
大陸商恩斯邁電子(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 微星科技股份有限公司 and 大陸商恩斯邁電子(深圳)有限公司
Priority to TW108123170A
Application granted
Publication of TWI706676B
Publication of TW202103488A

Landscapes

  • Image Analysis (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

A multi-camera monitoring and tracking system and method are provided, comprising: capturing a plurality of target pictures of a target person and obtaining a plurality of target feature values from the target pictures; capturing a video with each of a plurality of image input devices; when a humanoid image is determined to exist in a video, extracting an identification value from the humanoid image; obtaining a trust coefficient by comparing the identification value with the plurality of target feature values; determining whether the trust coefficient falls within a confidence interval and, if it does, selecting the video with the smallest trust coefficient among the videos as the designated video; and integrating the designated videos corresponding to a plurality of different time intervals into a monitoring and tracking video.

Description

Multi-camera monitoring and tracking system and method

This disclosure relates to the field of video surveillance, and in particular to a multi-camera monitoring and tracking system and method.

Video surveillance is now widespread in public places. Because a public place usually comprises many different areas and the passages connecting them, a surveillance system with many cameras is often needed to monitor the different spaces and passages.

Such a surveillance system, however, typically displays the footage from each camera either side by side on a single screen or individually on separate screens. Either way, presenting so many views at once makes it hard for the user to stay focused on the target person, especially when the target person appears in several views at the same time.

In addition, to help users monitor the movement of people in public places, some surveillance systems arrange the cameras so that adjacent monitored areas overlap, ensuring there are no coverage gaps. In practice, however, such a design tends to require an excessive number of cameras and still struggles to eliminate blind spots, making the surveillance system inconvenient to deploy and use.

In view of this, a monitoring system and method are proposed herein.

According to some embodiments, a multi-camera monitoring and tracking method includes: capturing a plurality of target pictures of a target person and obtaining a plurality of target feature values from the target pictures; capturing a video with each of a plurality of image input devices; when a humanoid image is determined to exist in a video, extracting an identification value from the humanoid image; comparing the identification value with the target feature values to obtain a trust coefficient; determining whether the trust coefficient falls within a confidence interval and, if it does, selecting the video with the smallest trust coefficient among the videos as the designated video; and integrating the designated videos corresponding to a plurality of different time intervals into a monitoring and tracking video.

According to some embodiments, a multi-camera monitoring and tracking system includes a target-feature-value capturing device, a plurality of image input devices, a controller, and an image output device. The controller includes an identification-value capturing module, a trust-coefficient calculation module, a video designation module, a monitoring-and-tracking-video integration module, and a continuous person-tracking module. The target-feature-value capturing device captures a plurality of target pictures of the target person and obtains a plurality of target feature values from them. The image input devices capture videos. The controller receives the target feature values and the videos. The identification-value capturing module detects humanoid images in the videos according to an identification-value capturing procedure and extracts identification values. The trust-coefficient calculation module compares an identification value with the target feature values to obtain a trust coefficient. The video designation module determines whether the trust coefficient falls within a confidence interval: if it does, the module selects the video with the smallest trust coefficient as the designated video; if it does not, the module selects a preset video among the videos as the designated video. The monitoring-and-tracking-video integration module integrates the designated videos corresponding to different time intervals into a monitoring and tracking video, and the image output device outputs the monitoring and tracking video.

In summary, the multi-camera monitoring and tracking system and method of some embodiments can monitor a target person who may appear in any of the videos by comparing the target person's target feature values against the identification values of the humanoid images in each video. In some embodiments, the system and method can also integrate the designated videos corresponding to different time intervals into a single monitoring and tracking video, allowing the user to continuously monitor the target person on a single screen.

FIG. 1 is a schematic diagram of a multi-camera monitoring and tracking system 10 according to some embodiments. Referring to FIG. 1, in some embodiments, the multi-camera monitoring and tracking system 10 includes a plurality of image input devices 100, a controller 200, an image output device 300, a target-feature-value capturing device 400, and a storage device (not shown). The image input devices 100, the image output device 300, the target-feature-value capturing device 400, and the storage device are coupled to the controller 200. In some embodiments, the image input devices 100, the image output device 300, and the target-feature-value capturing device 400 are also coupled to the storage device.

It should be noted that, within the multi-camera monitoring and tracking system 10, signal transmission among the image input devices 100, the controller 200, the image output device 300, the feature capturing device 400, and the storage device is not limited to wired electrical connections; wireless cloud transmission may also be used. Likewise, file access among these devices is not limited to going through the storage device; each device's built-in storage may also be accessed directly. In some embodiments, an image input device 100 is, for example but not limited to, a fixed camera, a mobile camera with real-time video transmission, or a drone camera. The image output device 300 is, for example but not limited to, a computer screen, a mobile phone screen, an in-car screen, or any other screen with a display function on an electronic device.

In some embodiments, the controller 200 includes an identification-value capturing module 210, a trust-coefficient calculation module 220, a video designation module 230, a monitoring-and-tracking-video integration module 240, and a continuous person-tracking module 250. These modules may, for example but not limited to, each be implemented on its own chip, be combined onto chips in any grouping, or all be integrated by the controller 200 into a single chip; this disclosure is not limited in this respect.

FIG. 2 is a schematic diagram of a multi-camera monitoring and tracking method according to some embodiments. Please refer to FIG. 1 and FIG. 2 together. In some embodiments, the multi-camera monitoring and tracking method includes the following steps: a target-feature-value capturing step (step S210); a video capturing step (step S220); an identification-value capturing step (step S230); a trust-coefficient calculation step (step S240); a video designation step (step S250); a monitoring-and-tracking-video integration step (step S260); and a continuous person-tracking step (step S270).

The target-feature-value capturing step (step S210) includes capturing a plurality of target pictures of the target person and obtaining a plurality of target feature values from those pictures. Specifically, the target person is the target the multi-camera monitoring and tracking method is to monitor and track. Multiple target pictures can be obtained by photographing the target person with the target-feature-value capturing device 400. The target pictures are usually images of the target person from different orientations, so as to highlight the person's features at various angles, such as front, side, and back views. By analyzing these pictures, the device 400 obtains the target feature value corresponding to each picture, that is, the color features and texture features the target person presents at the corresponding angle (the detailed mechanism is described in later paragraphs). According to some embodiments, the device 400 performs circular photography of the target person to capture the target pictures. In practice, circular photography can be realized by distributing the capture angles of the target pictures evenly over 360 degrees. For example, if 8 target pictures are to be taken, the target person can be photographed from eight directions (0, 45, 90, 135, 180, 225, 270, and 315 degrees, with adjacent directions 45 degrees apart) to obtain feature images in those eight directions. In some embodiments, within the multi-camera monitoring and tracking system 10, the device 400 outputs the target feature values, and the controller 200 receives them.
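The even angular spacing described above can be sketched in a few lines; the function name and list output are illustrative assumptions, not part of the patent.

```python
# Hypothetical sketch of the evenly spaced capture directions used for
# circular photography; the function name and output format are assumptions.

def capture_directions(n_pictures):
    """Return n_pictures capture angles (degrees) evenly spaced over 360 degrees."""
    step = 360.0 / n_pictures
    return [i * step for i in range(n_pictures)]

# Eight target pictures give the eight directions named in the text,
# with adjacent directions 45 degrees apart.
angles = capture_directions(8)  # [0.0, 45.0, 90.0, 135.0, 180.0, 225.0, 270.0, 315.0]
```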

The video capturing step (step S220) includes capturing a video with each of the image input devices 100. Specifically, each image input device 100 usually includes at least one lens, and a video is the moving image the device captures of the area within its lens's field of view. The monitoring areas of the individual image input devices 100 may or may not overlap; this disclosure is not limited in this respect. For example, there may be three image input devices 100: a first camera, a second camera, and a third camera, corresponding to a first, second, and third shooting area respectively. The first shooting area may overlap neither the second nor the third, while the second and third shooting areas overlap each other. In some embodiments, within the multi-camera monitoring and tracking system 10, the image input devices 100 output the videos, and the controller 200 receives them.

The identification-value capturing step (step S230) includes: when a humanoid image is determined to exist in a video, extracting an identification value from the humanoid image. Specifically, when the identification-value capturing module 210 detects a humanoid image in a video, it obtains an identification value from that image according to the identification-value capturing procedure. That is, whenever any video contains a humanoid image, the module 210 obtains the identification value corresponding to that image; the step does not require every video to contain a humanoid image before it operates. According to some embodiments, the module 210 determines whether a humanoid image exists in a video by using the OpenPose algorithm. For ease of explanation, this description assumes at most one humanoid image per video; in practice, multiple humanoid images may appear in a video simultaneously, in which case this step (step S230) is simply repeated to obtain the identification value of each humanoid image in the same video.

Next, the detailed steps of extracting an identification value from a humanoid image are explained. FIG. 3 is a flowchart of the identification-value capturing procedure of some embodiments, FIG. 4 is a schematic diagram of the body position points of some embodiments, and FIGS. 5 to 8 are schematic diagrams of the scoring regions of some embodiments. Please refer to FIGS. 1 to 8 together. In some embodiments, the identification-value capturing procedure includes the following steps: detecting the humanoid image to obtain skeleton information (step S310); obtaining a plurality of body position points from the skeleton information (step S320); extracting the humanoid image as a recognized humanoid picture according to the body position points (step S330); dividing the recognized humanoid picture into a plurality of scoring regions (step S340); and obtaining the identification value of the humanoid image from the color feature value and the texture feature value of each scoring region (step S350).

Detect the humanoid image to obtain skeleton information (step S310). Specifically, in some embodiments, the identification-value capturing module 210 obtains the skeleton information from the humanoid image by using the OpenPose algorithm.

Obtain a plurality of body position points from the skeleton information (step S320). Specifically, in some embodiments, the skeleton information obtained by the module 210 through the OpenPose algorithm includes multiple body position points of the humanoid image. As shown in FIG. 4, in some embodiments the body position points include the head vertex K0, the neck joint K1, the shoulder joints K2 and K5, the elbow joints K3 and K6, the wrist joints K4 and K7, the hip joints K8 and K11, the knee joints K9 and K12, and the ankle joints K10 and K13.

Extract the humanoid image as the recognized humanoid picture 900 according to the body position points (step S330). Specifically, based on the information of these body position points, the module 210 can locate the humanoid image at any moment in the video. It can therefore extract the humanoid image at the corresponding time point as the recognized humanoid picture 900, that is, a still humanoid image at that time point, which is then used for further feature analysis.

Divide the recognized humanoid picture into a plurality of scoring regions (step S340). Specifically, the module 210 partitions parts of the recognized humanoid picture 900 into scoring regions according to the body position points. As shown in FIGS. 5 to 8, in some embodiments there are four scoring regions: a first scoring region A1 (FIG. 5), a second scoring region A2 (FIG. 6), a third scoring region A3 (FIG. 7), and a fourth scoring region A4 (FIG. 8). The first scoring region A1 is a quadrilateral whose height is the distance from the head vertex K0 to the neck joint K1, whose width is half that distance, and whose lower edge has the neck joint K1 as its midpoint. The second scoring region A2 is the quadrilateral enclosed by the shoulder joints K2 and K5 and the elbow joints K3 and K6. The third scoring region A3 is the quadrilateral enclosed by the shoulder joints K2 and K5 and the hip joints K8 and K11. The fourth scoring region A4 is the quadrilateral enclosed by the hip joints K8 and K11 and the knee joints K9 and K12.
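As a rough illustration of step S340, the four regions can be derived from the keypoints as corner-point lists. This is a sketch under simplifying assumptions (an axis-aligned A1, keypoints given as a plain dict keyed by the K0-K13 labels); it is not the patent's implementation.

```python
# Sketch of the four scoring regions A1-A4, assuming keypoints are given as a
# dict keyed "K0".."K13" with (x, y) coordinates. A1 is drawn axis-aligned
# here for simplicity; the patent only fixes its height, width, and anchor.

def scoring_regions(kp):
    (x0, y0), (x1, y1) = kp["K0"], kp["K1"]         # head vertex, neck joint
    h = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5    # K0-to-K1 distance (height)
    w = h / 2.0                                     # width is half the height
    # A1: quadrilateral with the neck joint K1 as midpoint of its lower edge.
    a1 = [(x1 - w / 2, y1 - h), (x1 + w / 2, y1 - h),
          (x1 + w / 2, y1), (x1 - w / 2, y1)]
    a2 = [kp["K2"], kp["K5"], kp["K6"], kp["K3"]]   # shoulders and elbows
    a3 = [kp["K2"], kp["K5"], kp["K11"], kp["K8"]]  # shoulders and hips
    a4 = [kp["K8"], kp["K11"], kp["K12"], kp["K9"]] # hips and knees
    return {"A1": a1, "A2": a2, "A3": a3, "A4": a4}
```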

Obtain the identification value of the humanoid image from the color feature value and the texture feature value of each scoring region (step S350). Specifically, in some embodiments, the module 210 computes the color feature values of the first scoring region A1, the second scoring region A2, the third scoring region A3, and the fourth scoring region A4 using color moments: for each region, the first-order moment (mean), second-order moment (standard deviation), and third-order moment (skewness) are computed under the HSV color model (hue H, saturation S, value V), yielding nine values (mean_h, mean_s, mean_v, std_h, std_s, std_v, skewness_h, skewness_s, skewness_v). The color feature value of each scoring region is thus the set of these nine values, though this disclosure is not limited to this. According to some embodiments, the texture feature values of the regions A1 through A4 are computed by the module 210 using the well-known Gabor filter formula, though again this disclosure is not limited to this. Since the identification value of the humanoid image is the set of the color feature values and texture feature values of all the scoring regions, the module 210 obtains the identification value from the color feature value and texture feature value of each scoring region.
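The nine color-moment values can be sketched as follows; the input format (a flat list of HSV pixel tuples for one scoring region) is an assumption, and the RGB-to-HSV conversion and region masking are omitted.

```python
import math

# Minimal color-moment sketch: mean, standard deviation, and a sign-preserving
# skewness per HSV channel, ordered as in the text
# (mean_h, mean_s, mean_v, std_h, std_s, std_v, skew_h, skew_s, skew_v).
# The input is assumed to be the region's pixels as (h, s, v) tuples.

def color_moments(hsv_pixels):
    n = len(hsv_pixels)
    means, stds, skews = [], [], []
    for c in range(3):                              # 0 = H, 1 = S, 2 = V
        vals = [p[c] for p in hsv_pixels]
        mean = sum(vals) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in vals) / n)
        third = sum((v - mean) ** 3 for v in vals) / n
        skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
        means.append(mean); stds.append(std); skews.append(skew)
    return means + stds + skews                     # nine values per region
```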

It should be noted that, in some embodiments, the identification value of the humanoid image does not include the color feature value of the first scoring region A1. Because A1 covers the head, which is usually not covered by clothing and therefore shows skin color, its color features are not distinctive, so the identification value does not use the color feature value of A1. Accordingly, the identification value of a humanoid image in any video at any time point can be regarded as the set of "the texture feature value of A1, the color and texture feature values of A2, the color and texture feature values of A3, and the color and texture feature values of A4", and can be denoted {V_TA, V_CB, V_TB, V_CC, V_TC, V_CD, V_TD}.

Similarly, the target feature values of the target person can be obtained by the target-feature-value capturing device 400 analyzing the target person in each target picture according to the identification-value capturing procedure: the scoring regions of the target picture for each direction are segmented first, and then the color and texture feature values of each region are computed, yielding the target feature value, that is, the set of "the color and texture feature values of the scoring regions". According to some embodiments, the target feature value likewise excludes the color feature value of the first scoring region A1, so a target feature value can be denoted {v_x1, v_x2, v_x3, v_x4, v_x5, v_x6, v_x7}, whose elements correspond in order to the elements of the identification value {V_TA, V_CB, V_TB, V_CC, V_TC, V_CD, V_TD}. In some embodiments, since each target feature value is obtained from an image taken in one of 8 directions, there are 8 target feature values in total.

Please continue to refer to FIG. 1 and FIG. 2. The trust-coefficient calculation step (step S240) includes: comparing the identification value with the target feature values to obtain a trust coefficient. Specifically, the trust-coefficient calculation module 220 compares the identification value corresponding to each video with each of the target feature values, and the difference between the identification value and each target feature value is expressed as an estimate. The module 220 then compares these estimates and selects the smallest of them as the trust coefficient. According to some embodiments, the difference between the identification value {V_TA, V_CB, V_TB, V_CC, V_TC, V_CD, V_TD} and a target feature value {v_x1, v_x2, v_x3, v_x4, v_x5, v_x6, v_x7} is obtained from the single-direction estimation formula:

E_x = 0.3*ED(v_x1, V_TA) + 0.4*ED({v_x2, v_x4, v_x6}, {V_CB, V_CC, V_CD}) + 0.3*ED({v_x3, v_x5, v_x7}, {V_TB, V_TC, V_TD})

where E_x is the estimate for the x-th direction, the x-th direction being the shooting direction of the corresponding target picture; ED(v_x1, V_TA) is the Euclidean distance between v_x1 and V_TA; ED({v_x2, v_x4, v_x6}, {V_CB, V_CC, V_CD}) is the Euclidean distance between {v_x2, v_x4, v_x6} and {V_CB, V_CC, V_CD}; and ED({v_x3, v_x5, v_x7}, {V_TB, V_TC, V_TD}) is the Euclidean distance between {v_x3, v_x5, v_x7} and {V_TB, V_TC, V_TD}.
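The estimation formula above, together with the minimum-selection rule T = Min(E_x) that follows, can be sketched as below; representing each feature value as a flat list of numbers and concatenating lists before taking the Euclidean distance are implementation assumptions, not something the patent specifies.

```python
import math

# Sketch of the single-direction estimate E_x and the trust coefficient
# T = Min(E_x). Feature values are assumed to be flat lists of numbers;
# concatenating them before the distance is an implementation choice.

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def direction_estimate(target, ident):
    """E_x for one direction: target holds v_x1..v_x7, ident holds
    V_TA, V_CB, V_TB, V_CC, V_TC, V_CD, V_TD."""
    return (0.3 * euclidean(target["v1"], ident["TA"])
            + 0.4 * euclidean(target["v2"] + target["v4"] + target["v6"],
                              ident["CB"] + ident["CC"] + ident["CD"])
            + 0.3 * euclidean(target["v3"] + target["v5"] + target["v7"],
                              ident["TB"] + ident["TC"] + ident["TD"]))

def trust_coefficient(targets, ident):
    """T = Min(E_x) over all capture directions (one target dict per direction)."""
    return min(direction_estimate(t, ident) for t in targets)
```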

Accordingly, the trust coefficient is obtained from T = Min(E_x), x = 1 to n, where T is the trust coefficient; that is, the trust coefficient is the minimum of the estimates for the first through n-th directions. Thus, in the example where the target feature values are obtained from target pictures taken in 8 directions, T = Min(E_x), x = 1 to 8: the trust coefficient is the minimum of the estimates for the first through eighth directions.

In some embodiments, when the fourth scoring region A4 does not exist in the humanoid image, the single-direction estimation formula omits the differences in the color and texture feature values of A4, becoming:

E_x = 0.3*ED(v_x1, V_TA) + 0.4*ED({v_x2, v_x4}, {V_CB, V_CC}) + 0.3*ED({v_x3, v_x5}, {V_TB, V_TC})

where ED({v_x2, v_x4}, {V_CB, V_CC}) is the Euclidean distance between {v_x2, v_x4} and {V_CB, V_CC}, and ED({v_x3, v_x5}, {V_TB, V_TC}) is the Euclidean distance between {v_x3, v_x5} and {V_TB, V_TC}.

The video designation step (step S250) includes: determining whether each reliability coefficient falls within the confidence interval. If any reliability coefficient falls within the interval, the video with the smallest reliability coefficient among the videos is selected as the designated video; otherwise, a preset video among the videos is selected as the designated video. Specifically, the video designation module 230 determines whether each reliability coefficient falls within the confidence interval. If any coefficient falls within the interval, the video designation module 230 picks the smallest reliability coefficient among those in the interval and selects the corresponding video as the designated video, because a smaller reliability coefficient means a smaller difference between the video's identification value and the target feature values; that is, the target person is more likely to appear in that video. The reliability coefficient of the designated video must also fall within the confidence interval.
Conversely, if none of the reliability coefficients fall within the interval, the video designation module 230 selects a preset video among the videos as the designated video. That is, if no video's current reliability coefficient lies in the interval, the module selects the video captured by a preset image input device 100 (for example, the first camera mentioned above) as the designated video. According to some embodiments, the confidence interval is 0.814 to 1.65. When every reliability coefficient is greater than 1.65, the target person is absent from all videos, so the video designation module 230 selects the preset image input device's video as the designated video. Conversely, a reliability coefficient below 0.814 would mean the difference between the video's identification value and the target feature values is extremely small, i.e. the target person is present in the video; because such cases are relatively rare, they are excluded from the confidence interval.
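A minimal sketch of the selection rule above; the interval bounds 0.814 and 1.65 are the example values from the text, while the camera indexing and the fallback index are assumptions:

```python
def select_designated_video(coeffs, low=0.814, high=1.65, preset=0):
    # coeffs: per-camera reliability coefficients for one time slice.
    # Keep only coefficients inside the confidence interval; among those,
    # the smallest wins (smallest difference from the target features).
    in_interval = {cam: c for cam, c in enumerate(coeffs) if low <= c <= high}
    if in_interval:
        return min(in_interval, key=in_interval.get)
    return preset  # no video qualifies: fall back to the preset camera
```

For example, with coefficients [2.0, 1.2, 0.9] the third camera is chosen, while [2.0, 1.8] falls back to the preset camera.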

The monitoring and tracking video integration step (step S260) includes: integrating the designated videos corresponding to a plurality of different time intervals into a monitoring and tracking video. Specifically, although there is only one designated video in each time interval, the designated video is not necessarily the same across intervals. The monitoring and tracking video integration module 240 therefore splices clips from multiple videos (i.e., the designated videos corresponding to the different time intervals) into a single monitoring and tracking video. In some embodiments, the monitoring and tracking video integration module 240 detects the current designated video in real time and dumps it into the monitoring and tracking video as one of its segments.
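The splicing of per-interval designated videos could be sketched as follows; the (interval, camera) segment representation is hypothetical:

```python
def build_tracking_video(designated):
    # designated: list of ((start, end), camera_id) in time order.
    # Consecutive intervals from the same camera collapse into a single
    # segment, yielding the cut list for the monitoring/tracking video.
    segments = []
    for (start, end), cam in designated:
        if segments and segments[-1][2] == cam and segments[-1][1] == start:
            prev = segments.pop()
            segments.append((prev[0], end, cam))  # extend the running segment
        else:
            segments.append((start, end, cam))
    return segments
```

The resulting segment list says which camera's footage fills each stretch of the single output video.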

It should be noted that, in some embodiments, when the reliability coefficients of two videos at the same time point differ only slightly (that is, both videos qualify as the designated video), and the background information of the two videos (the portion of each frame other than the humanoid image) overlaps by more than a set value, the monitoring and tracking video integration module 240 can also stitch the frames of the two videos together to serve as the monitoring and tracking video.
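A sketch of the gating condition for stitching; both thresholds, `eps` for "differ only slightly" and `min_overlap` for the set value, are assumed example numbers, not values from the text:

```python
def should_stitch(coeff_a, coeff_b, background_overlap,
                  eps=0.05, min_overlap=0.6):
    # Two same-instant videos are stitched only when their reliability
    # coefficients are nearly equal (both qualify as the designated video)
    # and their backgrounds overlap by more than the set value.
    return abs(coeff_a - coeff_b) <= eps and background_overlap > min_overlap
```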

The person continuous tracking step (step S270) includes: obtaining a first tracking value from a first humanoid picture at a first time point in the designated video; obtaining a second tracking value from a second humanoid picture at a second time point in the designated video, the second time point being later than the first; comparing the first and second tracking values to obtain a tracking comparison value, where if the tracking comparison value is greater than or equal to a tracking threshold, the first humanoid picture is deemed to match the second humanoid picture, and conversely, if it is less than the threshold, the pictures are deemed not to match and the designated video is deselected; and combining the first and second tracking values to set an updated tracking value. Because the person continuous tracking step (step S270) of the multi-camera monitoring and tracking method corresponds to the person continuous tracking procedure of the multi-camera monitoring and tracking system 10, see the following paragraphs for details.

FIG. 9 is a flowchart of the person continuous tracking procedure according to some embodiments. Please refer to FIG. 2 and FIG. 9 together. In some embodiments, the person continuous tracking module 250 detects the humanoid image in the designated video according to the person continuous tracking procedure to obtain the tracking comparison value, and further determines whether the tracking comparison value is greater than or equal to the tracking threshold to decide whether the first humanoid picture matches the second humanoid picture.
In some embodiments, the person continuous tracking procedure includes the following steps: obtaining a first tracking value from a first humanoid picture at a first time point in the designated video (step S910); obtaining a second tracking value from a second humanoid picture at a second time point in the designated video, the second time point being later than the first (step S920); comparing the first and second tracking values to obtain a tracking comparison value (step S930); combining the first and second tracking values to set an updated tracking value (step S940); determining whether the tracking comparison value is greater than or equal to the tracking threshold (step S950); if so, deeming the first humanoid picture to match the second humanoid picture (step S960); otherwise, deeming the pictures not to match and deselecting the designated video (step S970).

A first tracking value is obtained from the first humanoid picture at the first time point in the designated video (step S910), and a second tracking value is obtained from the second humanoid picture at the second, later time point (step S920). Specifically, through the aforementioned steps S310 to S330, the person continuous tracking module 250 obtains the first humanoid picture at the first time point and the second humanoid picture at the second time point in the designated video. In some embodiments, the OpenPose algorithm computes skeleton information from the humanoid image in the designated video; the skeleton information includes a plurality of body position points of the humanoid image, from which the humanoid image can be located at any moment of the video. The person continuous tracking module 250 can therefore capture the recognized humanoid picture 900 at the corresponding time point, that is, the first humanoid picture at the first time point and the second humanoid picture at the second.
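Cutting the humanoid picture out of a frame from the body position points might look like the following bounding-box sketch; the margin and the clipping to the frame are assumptions, and OpenPose itself is assumed to supply the (x, y) keypoints:

```python
def humanoid_box(keypoints, frame_w, frame_h, margin=10):
    # Bounding box (x0, y0, x1, y1) around the body position points,
    # expanded by `margin` pixels and clipped to the frame borders.
    xs = [p[0] for p in keypoints]
    ys = [p[1] for p in keypoints]
    return (max(min(xs) - margin, 0),
            max(min(ys) - margin, 0),
            min(max(xs) + margin, frame_w),
            min(max(ys) + margin, frame_h))
```

The returned box is what gets cropped as the recognized humanoid picture for feature extraction.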

Continuing from the above, in some embodiments the person continuous tracking module 250 obtains the first tracking value from the first humanoid picture and the second tracking value from the second humanoid picture using the ORB algorithm, where the first tracking value corresponds to the features of the first humanoid picture and the second tracking value to those of the second. Note that, in some embodiments, the first time point is when the person continuous tracking module 250 executes step S910 and the second time point is when it executes step S920, so the second time point is later than the first. According to some embodiments, the first time point is the starting time point of this segment of the designated video, and the second time point is one unit of time later.

The first and second tracking values are compared to obtain the tracking comparison value (step S930). Specifically, according to some embodiments, the person continuous tracking module 250 performs a Hamming match between the first and second tracking values to obtain the tracking comparison value. The tracking comparison value reflects the difference between the first and second tracking values, that is, the difference between the features of the first humanoid picture and those of the second.
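A self-contained sketch of Hamming matching between two sets of binary descriptors such as ORB produces; the cross-check and the `max_dist` cutoff are assumptions, since the text only states that a Hamming match yields the comparison value:

```python
def hamming(d1, d2):
    # Hamming distance between two equal-length byte strings
    return sum(bin(a ^ b).count("1") for a, b in zip(d1, d2))

def count_matches(desc_a, desc_b, max_dist=30):
    # Count cross-checked nearest-neighbour matches: a pair counts only
    # if each descriptor is the other's nearest neighbour and their
    # Hamming distance is at most max_dist (an assumed cutoff).
    def nearest(d, pool):
        return min(range(len(pool)), key=lambda i: hamming(d, pool[i]))
    matches = 0
    for i, da in enumerate(desc_a):
        j = nearest(da, desc_b)
        if nearest(desc_b[j], desc_a) == i and hamming(da, desc_b[j]) <= max_dist:
            matches += 1
    return matches
```

The match count plays the role of the tracking comparison value: more matched features means the two humanoid pictures are closer.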

The first and second tracking values are combined to set the updated tracking value (step S940). Specifically, the updated tracking value is the first tracking value plus the second tracking value, so it represents the designated video at both the first and second time points and can further be compared against the next time point (for example, a third time point). In some embodiments, the person continuous tracking module 250 further performs a Hamming match between the updated tracking value and a third tracking value to obtain another tracking comparison value, where the third tracking value corresponds to a third humanoid picture at a third time point one unit of time after the second. This comparison value therefore simultaneously represents the difference between the third tracking value and the first, and between the third tracking value and the second; that is, the differences between the features of the third humanoid picture and those of the first and second humanoid pictures. By continually updating the tracking value in this way, the person continuous tracking module 250 can keep tracking the target person in the designated video.

Whether the tracking comparison value is greater than or equal to the tracking threshold is determined (step S950). If it is, the first humanoid picture is deemed to match the second (step S960); otherwise, the pictures are deemed not to match and the designated video is deselected (step S970). Specifically, the person continuous tracking module 250 determines whether the tracking comparison value reaches the tracking threshold. The tracking threshold corresponds to the number of matching features between the first and second humanoid pictures: the more matching features, the closer the features of the two pictures. When the tracking comparison value is greater than or equal to the tracking threshold, the person continuous tracking module 250 deems the first humanoid picture to match the second; that is, the target person is still present in the designated video and differs only slightly from the target person at the segment's starting time point.
When the tracking comparison value is less than the tracking threshold, the person continuous tracking module 250 deems the first humanoid picture not to match the second; that is, the target person in the designated video now differs markedly from the target person at the segment's starting time point. The person continuous tracking module 250 therefore deselects the designated video, and the multi-camera monitoring and tracking system 10 must select another designated video to monitor the target person.

According to some embodiments, the tracking threshold is 15, meaning the tracking comparison value is judged against a boundary of 15 matching features. If the tracking comparison value is less than 15, the target person is deemed unrecognizable or gone from the current designated video, so the person continuous tracking module 250 stops executing the person continuous tracking procedure and the multi-camera monitoring and tracking system 10 must search for the target person anew. Conversely, if the tracking comparison value is 15 or greater, the target person is deemed still present in the current designated video, and the person continuous tracking module 250 continues the procedure.
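Pulling steps S940 through S970 together, one iteration of the tracking loop might be sketched as follows. The pooled-descriptor reading of the updated tracking value and the generic `match_fn` are assumptions; 15 is the example threshold from the text:

```python
def track_step(pool, new_desc, match_fn, threshold=15):
    # pool: descriptors accumulated from earlier time points (the updated
    # tracking value); new_desc: descriptors from the current frame.
    # Returns (still_tracking, new_pool). If the match count reaches the
    # threshold, the target is still present and the pool grows; otherwise
    # the designated video should be deselected and chosen anew.
    if match_fn(pool, new_desc) >= threshold:
        return True, list(pool) + list(new_desc)
    return False, list(pool)
```

Any match counter can be plugged in as `match_fn`, for example a Hamming-based descriptor matcher.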

In some embodiments, when the controller 200 executes the person continuous tracking step (step S270), only the designated video needs to be analyzed rather than every video, so the computational burden on the multi-camera monitoring and tracking system 10 is low. Conversely, when the controller 200 executes the video capture step (step S220) through the video designation step (step S250), it must analyze every video to obtain its identification value, so the computational burden is higher. According to some embodiments, the controller 200 includes a central processing unit (CPU) and a graphics processing unit (GPU). While executing steps S220 through S250, the controller 200 offloads part of the CPU's workload to the GPU to lower CPU utilization. In contrast, while executing the person continuous tracking step (step S270), the controller 200 computes with the CPU alone, because the computational complexity is low at that point. The multi-camera monitoring and tracking system 10 therefore has the advantage of low CPU utilization.

In summary, the multi-camera monitoring and tracking system and method of some embodiments can monitor the target person who may appear in each video by comparing the target person's target feature values with the identification values of the humanoid images in the videos. In some embodiments, the system and method can also integrate the designated videos corresponding to different time intervals into a single monitoring and tracking video, letting the user continuously monitor the target person on a single screen. In some embodiments, the system and method can detect the tracking comparison value of the humanoid image in the designated video to decide whether to reselect the designated video, and thus also has the advantage of low computational load.

10: multi-camera monitoring and tracking system
100: image input device
200: controller
210: identification value acquisition module
220: reliability coefficient calculation module
230: video designation module
240: monitoring and tracking video integration module
250: person continuous tracking module
300: image output device
400: target feature value acquisition device
900: recognized humanoid picture
K0: head vertex
K1: neck joint point
K2, K5: shoulder joint points
K3, K6: elbow joint points
K4, K7: wrist joint points
K8, K11: hip joint points
K9, K12: knee joint points
K10, K13: ankle joint points
A1: first scoring area
A2: second scoring area
A3: third scoring area
A4: fourth scoring area
S210-S270: steps
S310-S350: steps
S910-S970: steps

FIG. 1 is a schematic diagram of the multi-camera monitoring and tracking system according to some embodiments.
FIG. 2 is a schematic diagram of the multi-camera monitoring and tracking method according to some embodiments.
FIG. 3 is a flowchart of the identification value acquisition procedure according to some embodiments.
FIG. 4 is a schematic diagram of the body position points according to some embodiments.
FIG. 5 is a schematic diagram of the first scoring area according to some embodiments.
FIG. 6 is a schematic diagram of the second scoring area according to some embodiments.
FIG. 7 is a schematic diagram of the third scoring area according to some embodiments.
FIG. 8 is a schematic diagram of the fourth scoring area according to some embodiments.
FIG. 9 is a flowchart of the person continuous tracking procedure according to some embodiments.


Claims (10)

1. A multi-camera monitoring and tracking method, comprising:
a target feature value acquisition step: capturing a plurality of target pictures of a target person, and obtaining a plurality of target feature values from the target pictures;
a video capture step: capturing a video with each of a plurality of image input devices;
an identification value acquisition step: when a humanoid image is determined to be present in each video, acquiring an identification value from the humanoid image;
a reliability coefficient calculation step: comparing the identification value with the target feature values to obtain a reliability coefficient;
a video designation step: determining whether the reliability coefficient falls within a confidence interval, and if the reliability coefficient falls within the confidence interval, selecting the video with the smallest reliability coefficient among the videos as a designated video; and
a monitoring and tracking video integration step: integrating the designated videos corresponding to a plurality of different time intervals into a monitoring and tracking video.

2. The multi-camera monitoring and tracking method of claim 1, wherein the reliability coefficient calculation step comprises: if the reliability coefficient does not fall within the confidence interval, selecting a preset video among the videos as the designated video.

3. The multi-camera monitoring and tracking method of claim 2, further comprising a person continuous tracking step, comprising:
obtaining a first tracking value from a first humanoid picture at a first time point in the designated video;
obtaining a second tracking value from a second humanoid picture at a second time point in the designated video, the second time point being later than the first time point;
comparing the first tracking value and the second tracking value to obtain a tracking comparison value, and when the tracking comparison value is greater than or equal to a tracking threshold, deeming the first humanoid picture to match the second humanoid picture; and
combining the first tracking value and the second tracking value to set an updated tracking value.

4. The multi-camera monitoring and tracking method of claim 3, wherein the person continuous tracking step comprises: when the tracking comparison value is less than the tracking threshold, deeming the first humanoid picture not to match the second humanoid picture, and deselecting the designated video.

5. The multi-camera monitoring and tracking method of claim 1, wherein the identification value acquisition step comprises:
detecting the humanoid image to obtain skeleton information;
obtaining a plurality of body position points from the skeleton information;
capturing the humanoid image as a recognized humanoid picture according to the body position points;
segmenting the recognized humanoid picture into a plurality of scoring areas; and
obtaining the identification value of the humanoid image from a color feature value and a texture feature value corresponding to each scoring area.

6. A multi-camera monitoring and tracking system, comprising:
a target feature value acquisition device for capturing a plurality of target pictures of a target person and obtaining a plurality of target feature values from the target pictures;
a plurality of image input devices, each for capturing a video;
a controller for receiving the target feature values and the videos, the controller comprising:
an identification value acquisition module for detecting a humanoid image in each video according to an identification value acquisition procedure to acquire an identification value;
a reliability coefficient calculation module for comparing the identification value with the target feature values to obtain a reliability coefficient;
a video designation module for determining whether the reliability coefficient falls within a confidence interval, wherein if the reliability coefficient falls within the confidence interval, the video designation module selects the video with the smallest reliability coefficient among the videos as a designated video, and otherwise selects a preset video among the videos as the designated video; and
a monitoring and tracking video integration module for integrating the designated videos corresponding to a plurality of different time intervals into a monitoring and tracking video; and
an image output device for outputting the monitoring and tracking video.

7. The multi-camera monitoring and tracking system of claim 6, wherein the controller further comprises a person continuous tracking module that detects the humanoid image in the designated video according to a person continuous tracking procedure to obtain a tracking comparison value, and determines whether the tracking comparison value is greater than or equal to a tracking threshold to decide whether a first humanoid picture matches a second humanoid picture, wherein if the tracking comparison value is greater than or equal to the tracking threshold, the person continuous tracking module deems the first humanoid picture to match the second humanoid picture, and otherwise deems the first humanoid picture not to match the second humanoid picture and deselects the designated video.

8. The multi-camera monitoring and tracking system of claim 7, wherein the person continuous tracking procedure comprises:
obtaining a first tracking value from a first humanoid picture at a first time point in the designated video;
obtaining a second tracking value from a second humanoid picture at a second time point in the designated video, the second time point being later than the first time point;
comparing the first tracking value and the second tracking value to obtain the tracking comparison value; and
combining the first tracking value and the second tracking value to set an updated tracking value.

9. The multi-camera monitoring and tracking system of claim 6, wherein the identification value acquisition step comprises:
detecting the humanoid image to obtain skeleton information;
obtaining a plurality of body position points from the skeleton information;
capturing the humanoid image as a recognized humanoid picture according to the body position points;
segmenting the recognized humanoid picture into a plurality of scoring areas; and
obtaining the identification value of the humanoid image from a color feature value and a texture feature value corresponding to each scoring area.

10. The multi-camera monitoring and tracking system of claim 6, wherein the target feature value acquisition device captures the target pictures by photographing the target person in a surrounding circle (surround photography).
TW108123170A 2019-07-01 2019-07-01 Multi- camera monitoring and tracking system and method thereof TWI706676B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW108123170A TWI706676B (en) 2019-07-01 2019-07-01 Multi- camera monitoring and tracking system and method thereof


Publications (2)

Publication Number Publication Date
TWI706676B true TWI706676B (en) 2020-10-01
TW202103488A TW202103488A (en) 2021-01-16

Family

ID=74091643



Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130163819A1 (en) * 2010-07-23 2013-06-27 Nederlandse Organisatie Voor Toegepast-Natuurwetenschappelijk Onderzoek Tno System and method for indentifying image locations showing the same person in different images
US20130188070A1 (en) * 2012-01-19 2013-07-25 Electronics And Telecommunications Research Institute Apparatus and method for acquiring face image using multiple cameras so as to identify human located at remote site
US20150160327A1 (en) * 2013-12-06 2015-06-11 Tata Consultancy Services Limited Monitoring motion using skeleton recording devices
US20150269427A1 (en) * 2014-03-19 2015-09-24 GM Global Technology Operations LLC Multi-view human detection using semi-exhaustive search


Also Published As

Publication number Publication date
TW202103488A (en) 2021-01-16

Similar Documents

Publication Publication Date Title
JP5567853B2 (en) Image recognition apparatus and method
WO2018177153A1 (en) Method for tracking pedestrian and electronic device
WO2022040886A1 (en) Photographing method, apparatus and device, and computer-readable storage medium
US11004214B2 (en) Image processing apparatus, image processing method, and storage medium
JP2016100696A (en) Image processing device, image processing method, and image processing system
US8233094B2 (en) Methods, systems and apparatuses for motion detection using auto-focus statistics
JP6579950B2 (en) Image analysis apparatus, program, and method for detecting person appearing in captured image of camera
US11012603B2 (en) Methods and apparatus for capturing media using plurality of cameras in electronic device
JP2014128002A (en) Subject area tracking device and method therefor and program
JP6349448B1 (en) Information processing apparatus, information processing program, and information processing method
JP2014186505A (en) Visual line detection device and imaging device
CN111182208A (en) Photographing method and device, storage medium and electronic equipment
WO2019129041A1 (en) Brightness adjustment method, apparatus, terminal, and computer readable storage medium
JP6798609B2 (en) Video analysis device, video analysis method and program
TWI706676B (en) 2020-10-01 Multi-camera monitoring and tracking system and method thereof
TW202001783A (en) Image analysis method, electronic system and non-transitory computer-readable recording medium
JP2017157043A (en) Image processing device, imaging device, and image processing method
US12002279B2 (en) Image processing apparatus and method, and image capturing apparatus
EP3366037B1 (en) Electronic apparatus for providing panorama image and control method thereof
CN112219389B (en) Method and apparatus for capturing media using multiple cameras in an electronic device
JP4305996B2 (en) Image processing apparatus and method, and recording medium
JP7130375B2 (en) Image processing device, imaging device, image processing method, and program
CN112966575A (en) Target face recognition method and device applied to smart community
CN111464740A (en) Image shooting method and device, storage medium and electronic equipment
WO2023106103A1 (en) Image processing device and control method for same