TW202022632A - Customizable method for detecting specific style objects

Info

Publication number: TW202022632A
Application number: TW107143927A
Authority: TW
Inventors: 林多常; 鄭維恆; 簡大為; 張慶年
Original assignee: 中華電信股份有限公司
Priority date: 2018-12-06
Filing date: 2018-12-06
Publication date: 2020-06-16
Also published as: TWI670610B

Abstract

A specific-style object image sample is synthesized by a computer program, generated from video screenshots, or selected from a previously prepared image database of specific-style objects, and is used to train a streamlined, dedicated CNN image classifier so that the classifier can classify and identify objects of a specific style. An already-trained CNN image object detector then detects the image regions that contain objects, and each cropped region is sent to the previously trained dedicated CNN image classifier, which determines whether it shows a specific-style object, thereby achieving detection of specific-style objects. In particular, the specific-style training samples of each object in the database can be collected cumulatively and continuously, so that increasingly suitable training samples are available for immediate selection when required and the training of the dedicated CNN image classifier can be completed more quickly, which suits the detection and identification of customized specific-style objects.

Description

Customizable method for detecting specific-style objects

The present invention relates to a customizable method for detecting objects of a specific style, and in particular to a method that trains a convolutional neural network (CNN) image classifier and uses it together with an existing, already-trained CNN image object detector to detect objects of a specific style.

Existing single large-scale CNN (Convolutional Neural Network) image object detectors, such as Faster RCNN (Region-Convolutional Neural Network), the Single Shot MultiBox Detector, and Yolo (You only look once), detect and recognize the general objects they were trained on reasonably well. However, adding further specific-style object classes on top of the trained general objects requires more training time and a repeat of the training, may degrade the ability to detect the previously trained general objects, and makes it particularly difficult to manually obtain and label the large number of specific-style object images needed for training.

With the existing single large-scale CNN image object detection methods, detecting a specific style of an object to meet a customization request requires re-selecting a large number of training samples of all kinds of objects and performing object detection training on full-size surveillance images. This affects the accuracy with which general objects were previously detected and recognized, and obtaining a large number of specific-style samples for detection and recognition training within a short time is hard to achieve.

The existing methods therefore cannot meet customization requests by completing further detection and recognition training for a specific style of object within a short time; they are not a good design and are in urgent need of improvement.

To achieve the above objective, the present invention provides a customizable method for detecting objects of a specific style, comprising a training step, a detection step, and an optimization step. The training step includes: building a training sample database, which includes image training samples that contain a specific-style object and image training samples that do not; providing the training sample database to a CNN image classifier for training, to generate trained classification model parameters; and providing the trained classification model parameters to a specific-style object CNN pattern classifier. The detection step includes: providing a surveillance image to be inspected; extracting, through an existing CNN image object detector, the region images that contain objects from that surveillance image; and providing each object-containing region image to the specific-style object CNN pattern classifier to identify specific-style objects and output the detection result. The optimization step includes: reviewing the specific-style objects identified by the specific-style object CNN pattern classifier, so that object-containing region images it identified incorrectly are remade into correctly labeled training samples; and entering the correctly labeled training samples into the training sample database and providing them to the CNN image classifier for training, to generate optimized classification model parameters.
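
The detection step described above can be sketched as a two-stage pipeline: a general detector proposes object regions, and a small classifier decides for each cropped region whether it shows the specific style. This is only an illustrative sketch. The names `Detection`, `crop`, `detect_specific_style`, `toy_detector`, and `toy_classifier` are hypothetical, and the two toy functions merely stand in for the trained networks, which the source does not specify at code level.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]            # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]    # (x, y, width, height)


@dataclass
class Detection:
    box: Box
    is_specific_style: bool


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of image."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def detect_specific_style(
    image: Image,
    detector: Callable[[Image], Sequence[Box]],
    classifier: Callable[[Image], bool],
) -> List[Detection]:
    """Stage 1 proposes object regions; stage 2 classifies each crop."""
    return [Detection(box, classifier(crop(image, box)))
            for box in detector(image)]


# Hypothetical stand-ins for the two trained networks (assumptions only).
def toy_detector(image: Image) -> List[Box]:
    return [(0, 0, 2, 2), (2, 2, 2, 2)]


def toy_classifier(region: Image) -> bool:
    # Pretend that "specific style" simply means a bright region.
    pixels = [p for row in region for p in row]
    return sum(pixels) / len(pixels) > 128
```

Because the detector is only called and never retrained, adding a new style in this arrangement touches the classifier alone.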

In the aforementioned method, the step of building the training sample database further includes: searching for object images and feature images through an object image database or a web image search engine; and using a computer to composite the object images and the feature images onto background images at different sizes, positions, angles, and deformations, producing the image training samples that contain a specific-style object, which are then entered into the training sample database.
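
As a rough illustration of the compositing just described, the sketch below places a feature patch on an object image, then pastes the result onto a background at a random size, angle, and position. It is a simplification under stated assumptions: images are plain nested lists, rotation is limited to multiples of 90 degrees, deformation is omitted, and every function name is hypothetical rather than taken from the source.

```python
import random
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values


def resize_nearest(img: Image, new_w: int, new_h: int) -> Image:
    """Nearest-neighbour resize."""
    h, w = len(img), len(img[0])
    return [[img[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]


def rotate90(img: Image, turns: int) -> Image:
    """Rotate clockwise by turns * 90 degrees."""
    for _ in range(turns % 4):
        img = [list(row) for row in zip(*img[::-1])]
    return img


def paste(base: Image, patch: Image, x: int, y: int) -> Image:
    """Return a copy of base with patch written at column x, row y."""
    out = [row[:] for row in base]
    for r, row in enumerate(patch):
        out[y + r][x:x + len(row)] = row
    return out


def synthesize(background: Image, obj: Image, feature: Image,
               rng: random.Random) -> Image:
    """Compose one positive sample with random size, angle and position."""
    # Place the feature somewhere inside the object image first.
    fx = rng.randint(0, len(obj[0]) - len(feature[0]))
    fy = rng.randint(0, len(obj) - len(feature))
    styled = paste(obj, feature, fx, fy)
    # Then vary the size and angle of the styled object.
    bg_h, bg_w = len(background), len(background[0])
    side = rng.randint(2, min(bg_h, bg_w))
    patch = rotate90(resize_nearest(styled, side, side), rng.randrange(4))
    # Finally drop it at a random position on the background.
    x = rng.randint(0, bg_w - side)
    y = rng.randint(0, bg_h - side)
    return paste(background, patch, x, y)
```

Calling `synthesize` repeatedly with different seeds and backgrounds yields many variants from a handful of source images, which is the point of the synthesis route.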

The aforementioned method further includes: compositing other images that do not contain the feature image with the background images to produce the image training samples that do not contain a specific-style object, which are then entered into the training sample database.
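
One way to picture these negative samples is as twins of the positives: the same object and the same paste location, but with a filler patch in place of the feature, so that the pasting itself carries no class information. The sketch below assumes the filler patch has the same dimensions as the feature patch; `paste` and `make_pair` are hypothetical names.

```python
import random
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values


def paste(base: Image, patch: Image, x: int, y: int) -> Image:
    """Return a copy of base with patch written at column x, row y."""
    out = [row[:] for row in base]
    for r, row in enumerate(patch):
        out[y + r][x:x + len(row)] = row
    return out


def make_pair(obj: Image, feature: Image, filler: Image,
              rng: random.Random) -> Tuple[Image, Image]:
    """Return (positive, negative) samples that differ only in patch content.

    Assumes filler has the same width and height as feature.
    """
    x = rng.randint(0, len(obj[0]) - len(feature[0]))
    y = rng.randint(0, len(obj) - len(feature))
    positive = paste(obj, feature, x, y)   # contains the specific feature
    negative = paste(obj, filler, x, y)    # same spot, feature erased
    return positive, negative
```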

In the aforementioned method, the step of building the training sample database further includes: inputting surveillance video images to the existing CNN image object detector to extract the object-containing region images from them; and reviewing each object-containing region image to determine whether the object is the specific-style object. If it is, the region image is classified as an image training sample that contains a specific-style object; if it is not, the region image is classified as an image training sample that does not contain a specific-style object.
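
The sorting of extracted regions into the two sample groups can be sketched as follows. The `reviewer` callable is an assumption standing in for whatever makes the judgment (in the source, a review of each region); `sort_regions` and the two dictionary keys are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List

Region = List[List[int]]  # cropped region as rows of pixel values


def sort_regions(
    regions: Iterable[Region],
    reviewer: Callable[[Region], bool],
) -> Dict[str, List[Region]]:
    """Split detector crops into positive and negative training samples."""
    samples: Dict[str, List[Region]] = {"specific_style": [], "general": []}
    for region in regions:
        label = "specific_style" if reviewer(region) else "general"
        samples[label].append(region)
    return samples
```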

In the aforementioned method, the step of building the training sample database further includes: providing video images of the specific-style object recorded from multiple angles. If the specific-style object fills the entire video image, the video image is classified as an image training sample that contains a specific-style object; if it does not, the video image is input to the existing CNN image object detector, which extracts the object-containing region image, and that region image is classified as an image training sample that contains a specific-style object.

In the aforementioned method, the step of building the training sample database further includes: inputting surveillance video images that contain objects other than the specific-style object to the existing CNN image object detector, which extracts the object-containing region images, and classifying those region images as image training samples that do not contain a specific-style object.

In the aforementioned method, the specific-style object CNN pattern classifier is a CNN image classifier with two convolutional layers, two pooling layers, two fully connected layers, and thirty-six classification outputs.

In the aforementioned method, the existing CNN image object detector is a Yolo (You only look once) CNN image object detector.

The customizable method of the present invention for detecting objects of a specific style uses a streamlined, dedicated CNN image classifier to further classify whether each object region image detected by an existing single large CNN image object detector shows the specific style of the object, thereby achieving detection and identification of newly added specific-style objects. Training image samples of the specific style are synthesized in large numbers by a computer program according to a design, captured from recorded video, or selected from previously prepared image databases of specific-style objects, so that the streamlined dedicated CNN image classifier can be trained quickly and provides the required ability to classify and recognize the specific style. Moreover, the specific-style training samples of each object in the training sample database can continue to be collected and produced as needed, so that future training of the CNN image classifier's model parameters becomes more diverse and more accurate over time, which makes the method suitable for detecting and identifying customized specific-style objects.

101‧‧‧Object image database

102‧‧‧Web image search engine

103‧‧‧Object image

104‧‧‧Feature image

105‧‧‧Background image

106‧‧‧Computer program image synthesis

107‧‧‧Specific-style object image training sample

108‧‧‧Synthetic feature-erased image training sample

109‧‧‧Training sample database

110‧‧‧CNN image classifier

111‧‧‧Surveillance video image

112‧‧‧Video image of a specific-style object

114‧‧‧Existing CNN image object detector

117‧‧‧Determination of whether the object is a specific-style object

118‧‧‧General object image training sample

201‧‧‧Surveillance image

210‧‧‧Specific-style object CNN pattern classifier

214‧‧‧Existing CNN image object detector that has completed object detection training

215‧‧‧Correctly labeled training sample

217‧‧‧Review and selection of misclassifications

219‧‧‧Specific-style object detection result

Please refer to the detailed description of the present invention and the accompanying drawings for a further understanding of its technical content, objectives, and effects.

Figure 1 is a schematic flowchart of the method of the first embodiment of the present invention.

Figure 2 is a schematic flowchart of the method of the second embodiment of the present invention.

Figure 3 is a schematic flowchart of the method of the third embodiment of the present invention.

Figure 4 is a schematic flowchart of the method of the fourth embodiment of the present invention.

In the customizable method of the present invention for detecting objects of a specific style, specific-style object image samples are synthesized by a computer program, generated from video captures, or taken from the previously prepared training sample database 109 of specific-style object images, and are provided to the streamlined, dedicated CNN (Convolutional Neural Network) image classifier 110 for training, which generates model parameters capable of classifying and recognizing the specific-style objects. The existing CNN image object detector 214 that has completed object detection training (for example, a Yolo CNN image object detector) then detects the image regions that contain objects, and each region is input to the previously trained specific-style object CNN pattern classifier 210, which determines whether it shows a specific-style object, thereby achieving detection of specific-style objects.

For the specific-style object CNN pattern classifier 210, a streamlined convolutional neural network (CNN) pattern classifier can be chosen, for example a CNN image classifier whose network contains only two convolutional layers, two pooling layers, and two fully connected layers, with thirty-six classification outputs and a 40x40-pixel input. Besides providing 36 classes, it has sufficient classification and recognition capability for specific-style objects that are generally similar in size ratio to the detected object. For the existing CNN image object detector 214 that has completed object detection training, the Yolo (You only look once) CNN image object detector, which currently performs well, can be chosen.
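
To get a feel for how small such a network is, the following sketch traces the tensor sizes through two convolution/pooling pairs and two fully connected layers for a 40x40 input and 36 outputs. Only the layer counts, the input size, and the number of outputs come from the text; the 5x5 kernels, stride 1 without padding, 2x2 pooling, channel counts (16, 32), and hidden width 128 are assumptions chosen purely for illustration.

```python
from typing import List, Sequence, Tuple


def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output width of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size: int, window: int) -> int:
    """Output width of non-overlapping pooling along one dimension."""
    return size // window


def trace_shapes(
    input_size: int = 40,                 # from the text: 40x40 input
    kernel: int = 5,                      # assumption
    pool: int = 2,                        # assumption
    channels: Sequence[int] = (16, 32),   # assumption
    hidden: int = 128,                    # assumption
    classes: int = 36,                    # from the text: 36 outputs
) -> List[Tuple]:
    """List the output shape of each layer in the assumed network."""
    shapes: List[Tuple] = []
    size = input_size
    for ch in channels:
        size = conv_out(size, kernel)
        shapes.append(("conv", size, size, ch))
        size = pool_out(size, pool)
        shapes.append(("pool", size, size, ch))
    flat = size * size * channels[-1]
    shapes.append(("fc1", flat, hidden))
    shapes.append(("fc2", hidden, classes))
    return shapes
```

Under these assumed settings the spatial size shrinks 40, 36, 18, 14, 7, so the first fully connected layer sees 7 x 7 x 32 = 1568 values.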

The specific implementation of the customizable method of the present invention for detecting objects of a specific style can be divided mainly into a training step, a detection step, and an optimization step, which are illustrated by the following four embodiments.

First embodiment of the present invention:

As shown in Figure 1, in the training step, when the specific-style object image is an image of a certain kind of object that contains a certain specific feature, images of that kind of object (object images 103) in various forms and images of the specific feature (feature images 104) in various forms are found in the object image database 101 or through the web image search engine 102, and various suitable background images 105 are obtained in advance. The computer program image synthesis 106 composites the object images 103 and their feature images 104 at different sizes, positions, angles, and deformations onto the background images 105, automatically producing a large number of possible specific-style object image training samples 107. These samples are entered into the training sample database 109 as image training samples that contain a specific-style object, for use in training the CNN image classifier 110.

In addition, the computer program image synthesis 106 may introduce unnatural synthesis artifacts, and after training the CNN image classifier 110 could learn these artifacts as features of the specific-style object in the training samples 107. To avoid this, the feature image 104 is replaced with any other background image that does not contain the specific feature, and synthetic feature-erased image training samples 108 are produced in the same way as in the computer program image synthesis 106. These are placed in the training sample database 109 as training samples that do not contain a specific-style object and are provided together to the CNN image classifier 110 for training, so as to eliminate the influence of synthesis artifacts on classification.

In the detection step, after the CNN image classifier 110 has been trained and the best classification model parameters obtained, the parameters are passed to the specific-style object CNN pattern classifier 210. The existing CNN image object detector 214 that has completed object detection training detects and extracts from the surveillance image 201 the region images that contain that kind of object and hands them to the specific-style object CNN pattern classifier 210 for classification. When a specific-style object is recognized, the specific-style object detection result 219 is output, showing the time in the video at which the specific-style object was detected and its detection bounding box.

In the optimization step, to keep improving classification accuracy over time, a review and selection of misclassifications 217 can be added after the specific-style object CNN pattern classifier 210. When a classification by the specific-style object CNN pattern classifier 210 is judged to be incorrect, the region image previously detected and extracted by the existing CNN image object detector 214 is manually remade into a correctly labeled training sample 215 and entered into the training sample database 109. It is then provided together with the other samples to the CNN image classifier 110 for training, to compute better classification model parameters and make the specific-style object CNN pattern classifier 210 more accurate in the future.
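
The optimization step amounts to a feedback loop: compare the classifier's outputs with reviewed labels, keep only the disagreements, and add them to the sample database with the corrected label before the next training run. A minimal sketch follows, assuming samples are tracked by string identifiers and labels are booleans; both functions and that representation are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def collect_corrections(
    predicted: Dict[str, bool],
    reviewed: Dict[str, bool],
) -> List[Tuple[str, bool]]:
    """Return (region_id, correct_label) for each prediction a reviewer overturned."""
    return [(rid, truth) for rid, truth in reviewed.items()
            if rid in predicted and predicted[rid] != truth]


def add_to_database(
    database: Dict[str, bool],
    corrections: Iterable[Tuple[str, bool]],
) -> Dict[str, bool]:
    """Return a new database that also holds the corrected samples."""
    updated = dict(database)
    updated.update(corrections)
    return updated
```

Only the overturned predictions are added here; whether correctly classified regions should also be kept is a design choice the source leaves open.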

Second embodiment of the present invention:

As shown in Figure 2, in the training step, when a specific object (such as a certain person or vehicle) appears frequently in surveillance video images but the images contain too much background to be input directly to the CNN image classifier 110 for training, the existing CNN image object detector 114 detects and extracts from the surveillance video images 111 the region images that contain that kind of object (person or vehicle). Step 117 determines whether each is a specific-style object, so that the extracted region image is labeled either as a specific-style object image training sample 107 or as a general object image training sample 118 that does not contain a specific-style object. The samples are then entered into the training sample database 109 and provided together to the CNN image classifier 110 for training.

In the detection step, the model parameters obtained through training give the specific-style object CNN pattern classifier 210 the ability to classify and recognize the specific object (such as the person or vehicle). As in the first embodiment, the existing CNN image object detector 214 that has completed object detection training detects and extracts from the surveillance image 201 the region images that contain that kind of object and hands them to the specific-style object CNN pattern classifier 210 for classification. When a specific-style object is recognized, the specific-style object detection result 219 is output, showing the time in the video at which the specific-style object was detected and its detection bounding box.

In the optimization step, to keep improving classification accuracy over time, a review and selection of misclassifications 217 can be added after the specific-style object CNN pattern classifier 210. When a classification by the specific-style object CNN pattern classifier 210 is judged to be incorrect, the region image previously detected and extracted by the existing CNN image object detector 214 is manually remade into a correctly labeled training sample 215 and entered into the training sample database 109. It is then provided together with the other samples to the CNN image classifier 110 for training, to compute better classification model parameters and make the specific-style object CNN pattern classifier 210 more accurate in the future.

Third embodiment of the present invention:

As shown in Figure 3, in the training step, when a physical specific-style object is available, it can be recorded directly from multiple angles to obtain video images 112 of the specific-style object at each angle. If the specific-style object just fills the entire video image, the image can be directly labeled and saved as a specific-style object image training sample 107 (an image training sample that contains a specific-style object) and entered into the training sample database 109. If the specific-style object does not fill the video image, the existing CNN image object detector 114 extracts the region image that contains the physical specific-style object, which is labeled as a specific-style object image training sample 107 and entered into the training sample database 109.

To prevent images of general objects that resemble the physical specific-style object from being misjudged as the specific-style object, and so improve recognition accuracy, the existing CNN image object detector 114 detects and extracts the object-containing region images from surveillance video images 111 that show objects similar to (but not) the physical specific-style object, that is, images that contain an object other than the specific-style object. These region images are labeled as general object image training samples 118 that do not contain a specific-style object and placed in the training sample database 109. After the CNN image classifier 110 is trained on them, the resulting model parameters give the specific-style object CNN pattern classifier 210 the ability to reduce such misidentifications. From the training sample database 109 built in this way, the samples of the classes needed are selected for training the CNN image classifier 110.

In the detection step, the classification model parameters obtained through training give the specific-style object CNN pattern classifier 210 the ability to classify and recognize the physical specific object. As in the first embodiment, the existing CNN image object detector 214 that has completed object detection training detects and extracts from the surveillance image 201 the region images that contain that kind of object and hands them to the specific-style object CNN pattern classifier 210 for classification. When a specific-style object is recognized, the specific-style object detection result 219 is output, showing the time in the video at which the specific-style object was detected and its detection bounding box.

In the optimization step, to keep improving classification accuracy over time, a review and selection of misclassifications 217 can be added after the specific-style object CNN pattern classifier 210. When a classification by the specific-style object CNN pattern classifier 210 is judged to be incorrect, the region image previously detected and extracted by the existing CNN image object detector 214 is manually remade into a correctly labeled training sample 215 and entered into the training sample database 109. It is then provided together with the other samples to the CNN image classifier 110 for training, to compute better classification model parameters and make the specific-style object CNN pattern classifier 210 more accurate in the future.

Fourth embodiment of the present invention:

As shown in Figure 4, in the training step, when image samples of the specific-style object already exist in a previously built training sample database 109, the required training samples (including specific-style object image training samples and general object image training samples) are selected from the training sample database 109 and provided to the CNN image classifier 110 for training. The classification model parameters obtained are then input to the specific-style object CNN pattern classifier 210 so that it can classify and recognize the required specific-style objects. Whenever new classification samples are added, cumulative classification training can be repeated for each specific-style object, continually increasing the number of classes the CNN image classifier 110 recognizes and its accuracy.

In the detection step, as in the first embodiment, the existing CNN image object detector 214 that has completed object detection training detects and extracts from the surveillance image 201 the region images that contain that kind of object and hands them to the specific-style object CNN pattern classifier 210 for classification. When a specific-style object is recognized, the specific-style object detection result 219 is output, showing the time in the video at which the specific-style object was detected and its detection bounding box.

In the optimization step, to keep improving classification accuracy over time, a review and selection of misclassifications 217 can be added after the specific-style object CNN pattern classifier 210. When a classification by the specific-style object CNN pattern classifier 210 is judged to be incorrect, the region image previously detected and extracted by the existing CNN image object detector 214 is manually remade into a correctly labeled training sample 215 and entered into the training sample database 109. It is then provided together with the other samples to the CNN image classifier 110 for training, to compute better classification model parameters and make the specific-style object CNN pattern classifier 210 more accurate in the future.
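
Selecting a training set from an accumulated database, as this embodiment describes, could look like the sketch below. The record layout `(sample_id, label)`, the label string `"general"`, and the function name are assumptions, and the style names used when calling it (for example `"helmet"`) would be made-up examples rather than classes named in the source.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[str, str]  # (sample_id, label)


def select_training_set(
    database: Sequence[Sample],
    wanted: Sequence[str],
    general_label: str = "general",
) -> List[Sample]:
    """Keep samples of the requested styles plus the general (negative) samples."""
    keep = set(wanted) | {general_label}
    return [sample for sample in database if sample[1] in keep]
```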

本發明所提供之客製化偵測特定樣式物件之方法，與其他習用技術相互比較時，更具備下列優點： When compared with other conventional techniques, the customized method for detecting objects with specific styles provided by the present invention has the following advantages:

1.本發明僅就所偵測出之物件區域影像，是否具有特定樣式物件進一步分類辨識，可僅選用一精簡專用CNN影像分類器提供所需物件特定樣式之分類辨識能力，分類訓練學習將更為迅速單純。 1. The present invention only further classifies and recognizes the detected object area image, whether the object has a specific style, and can only select a simplified dedicated CNN image classifier to provide the required object specific style classification and identification capability, and the classification training and learning will be improved For speed and simplicity.

2. Adding further specific style object categories for training does not degrade the object detection capability of the existing, already trained CNN image object detector.

3. The training samples for each specific style object in the training sample database can be collected and produced continuously as needed, providing an increasingly varied and suitable set of training samples that can be selected immediately when required. Training of the dedicated CNN image classifier can thus be completed better and faster, which suits customized detection of specific style objects.
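To give a rough sense of how small a streamlined classifier of this kind can be, the following sketch computes the feature map sizes and the trainable parameter count for a network with two convolutional layers, two pooling layers, two fully connected layers, and thirty-six outputs, which is the configuration recited in the claims. Everything else is an assumption made for illustration and does not come from the source: the 32x32 RGB input, the 5x5 kernels, the 16 and 32 channels, the 2x2 pooling, and the 128 hidden units, as well as the function names.

```python
def conv_output_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def classifier_summary(
    input_size: int = 32,        # assumed square input, in pixels
    in_channels: int = 3,        # assumed RGB input
    conv_channels=(16, 32),      # assumed channels of the two conv layers
    kernel: int = 5,             # assumed conv kernel size
    pool: int = 2,               # assumed non-overlapping pooling window
    hidden_units: int = 128,     # assumed width of the first fully connected layer
    num_classes: int = 36,       # thirty-six classification outputs
):
    """Return (flattened feature count, total trainable parameters)."""
    size, channels, params = input_size, in_channels, 0
    for out_channels in conv_channels:
        # Weights plus one bias per output channel.
        params += kernel * kernel * channels * out_channels + out_channels
        size = conv_output_size(size, kernel)             # convolution, no padding
        size = conv_output_size(size, pool, stride=pool)  # pooling layer
        channels = out_channels
    flattened = size * size * channels
    params += flattened * hidden_units + hidden_units     # first fully connected layer
    params += hidden_units * num_classes + num_classes    # output layer
    return flattened, params


flattened, total = classifier_summary()
print(flattened, total)  # prints: 800 121220
```

Under these assumed sizes the classifier has roughly 120 thousand parameters, which is small compared with typical general-purpose object detectors and is consistent with the faster training described in advantage 1.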

The above detailed description specifically describes feasible embodiments of the present invention. The embodiments are not intended to limit the patent scope of the present invention; any equivalent implementation or modification that does not depart from the technical spirit of the present invention shall fall within the patent scope of this application.

In summary, this application is innovative in its technical concept and provides the above improvements over conventional articles. It fully satisfies the statutory requirements of novelty and inventive step for an invention patent. The application is therefore filed in accordance with the law, and the Office is respectfully requested to approve this invention patent application in order to encourage invention.

101‧‧‧Object image database

102‧‧‧Web image search engine

103‧‧‧Object image

104‧‧‧Feature image

105‧‧‧Background image

109‧‧‧Training sample database

110‧‧‧CNN image classifier

201‧‧‧Surveillance image

Claims

A customizable method for detecting specific style objects, comprising: a training step, including: building a training sample database, wherein the training sample database includes image training samples that contain the specific style object and image training samples that do not contain the specific style object; providing the training sample database to a CNN image classifier for training to generate trained classification model parameters; and providing the trained classification model parameters to a specific style object CNN pattern classifier; a detection step, including: providing a surveillance image to be examined; capturing, through an existing CNN image object detector, a region image containing an object in the surveillance image; and providing the region image containing the object to the specific style object CNN pattern classifier to identify the specific style object, and then outputting a detection result for the specific style object; and an optimization step, including: reviewing the specific style objects identified by the specific style object CNN pattern classifier, and re-creating correctly labeled training samples from the region images containing objects that the specific style object CNN pattern classifier identified incorrectly; and inputting the re-created correctly labeled training samples into the training sample database, so that they are provided to the CNN image classifier for training to generate optimized classification model parameters.

The method of claim 1, wherein the step of building the training sample database further includes: searching for object images and feature images through an object image database or a web image search engine; and compositing, by computer, the object images and the feature images onto background images using different sizes, positions, angles, and deformations to synthesize the image training samples that contain the specific style object, and inputting those image training samples into the training sample database.

The method of claim 2, further including: synthesizing other images that do not contain the feature image with the background images into the image training samples that do not contain the specific style object, and inputting those image training samples into the training sample database.

The method of claim 1, wherein the step of building the training sample database further includes: inputting surveillance video images to the existing CNN image object detector to capture region images containing objects in the surveillance video images; and reviewing the region images containing objects to determine whether each object is the specific style object; wherein, if the object is the specific style object, the region image containing the object is classified as an image training sample that contains the specific style object, and if the object is not the specific style object, the region image containing the object is classified as an image training sample that does not contain the specific style object.

The method of claim 1, wherein the step of building the training sample database further includes: providing video images of the specific style object recorded from multiple angles; if the specific style object occupies the entire recorded image, classifying the recorded image as an image training sample that contains the specific style object; and if the specific style object does not occupy the entire recorded image, inputting the recorded image to the existing CNN image object detector to capture the region image containing the object in the recorded image and classifying that region image as an image training sample that contains the specific style object.

The method of claim 1, wherein the step of building the training sample database further includes: inputting surveillance video images that contain objects, none of which is the specific style object, to the existing CNN image object detector to capture region images containing objects in the surveillance video images, and classifying those region images as image training samples that do not contain the specific style object.

The method of claim 1, wherein the specific style object CNN pattern classifier is a CNN image classifier having two convolutional layers, two pooling layers, two fully connected layers, and thirty-six classification outputs.

The method of claim 1, wherein the existing CNN image object detector is a YOLO (You Only Look Once) CNN image object detector.