TWI780409B - Method and system for training object detection model - Google Patents
Method and system for training object detection model
- Publication number
- TWI780409B
- Authority
- TW
- Taiwan
- Prior art keywords
- image
- target object
- training
- specific
- detection model
- Prior art date
Landscapes
- Radar Systems Or Details Thereof (AREA)
- Image Analysis (AREA)
Abstract
Description
The present invention relates to model training techniques, and in particular to a method and system for training an object detection model.
With the rapid development of cameras, network technology, and artificial intelligence, applications of intelligent video surveillance have grown substantially. In surveillance systems, detecting objects of specific types (such as people and vehicles) is a core function. Existing methods for detecting specific types of objects in images, including those based on deep neural networks, learn the features of the target object type by training on large amounts of labeled data.
In general, the labeled data contains only positive samples (the object types to be detected), while negative samples are obtained automatically from the backgrounds of the training images. However, the diversity of the negative samples is limited by the training data, so for highly variable images the background is often misjudged as an object of the target type, resulting in false detections.
In addition, to handle extremely diverse negative samples, models that recognize well tend to be large and complex, and they require more computing resources. In practice, the camera device cannot bear the computational load of such a detection model, so the image data must be transmitted over the network to a server for detection, which burdens both network bandwidth and server computation.
Broadly speaking, part of the difficulty in using existing techniques stems from the requirement that the object detection model be general-purpose, that is, that the same model be usable across diverse images and scenes. In video surveillance, each camera is responsible for its own scene, and different scenes are prone to different false detections. For a camera responsible for a single scene (usually with a single viewing angle), however, the false detections produced in its images are highly repetitive.
Therefore, if each camera has its own detection model, and that model has been trained for the camera's scene to avoid the false detections of that scene, a simpler model can achieve the required detection performance. This also makes it more feasible to place the object detection function on the camera side rather than the server side (the concept of edge computing), reducing the demands on server computing power and network bandwidth.
In view of this, the present invention provides a method and system for training an object detection model, which can be used to solve the above technical problems.
The present invention provides a method for training an object detection model, comprising: obtaining a target object image and a labeling result, wherein the target object image includes at least one target object and the labeling result includes a label corresponding to each target object; obtaining at least one first specific image of a first specific scene captured by a first fixed camera; obtaining, according to the at least one first specific image, a first background image corresponding to the at least one first specific image; synthesizing the at least one target object and the first background image into a first training image; and training a first target object detection model dedicated to the first fixed camera based on the first training image and the labeling result corresponding to each target object.
The present invention provides a system for training an object detection model, which includes a data subsystem and a training subsystem. The data subsystem is configured to: obtain a target object image, wherein the target object image includes at least one target object and a labeling result corresponding to each target object; obtain at least one first specific image of a first specific scene captured by a first fixed camera; obtain, according to the at least one first specific image, a first background image corresponding to the at least one first specific image; and synthesize the at least one target object and the first background image into a first training image. The training subsystem trains a first target object detection model dedicated to the first fixed camera based on the first training image and the labeling result corresponding to each target object.
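Please refer to FIG. 1, which is a schematic diagram of a system for training an object detection model according to an embodiment of the present invention. As shown in FIG. 1, the system 10 includes a detection subsystem 200, a training subsystem 300, and a data subsystem 400. In different embodiments, the detection subsystem 200, the training subsystem 300, and the data subsystem 400 may be implemented as separate devices (for example, personal computers, servers, or workstations) or integrated into a single device, but the present invention is not limited thereto.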
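In embodiments of the present invention, the detection subsystem 200, the training subsystem 300, and the data subsystem 400 may cooperate to implement the proposed method for training an object detection model. In brief, the method may generate new training data from general-purpose positive data and from negative data provided by a particular fixed camera, and use this training data to train a target object detection model dedicated to that fixed camera. The details are described below.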
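For ease of description, the first fixed camera 100 in FIG. 1 is used as an example below. In embodiments of the present invention, the first fixed camera 100 is, for example, a camera that is fixedly installed at a specific location and configured to capture a first specific scene 199 within a fixed imaging range.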
Please refer to FIG. 2, which is a flowchart of a method for training an object detection model according to an embodiment of the present invention. The method of this embodiment can be executed by the system 10 of FIG. 1, and the details of each step in FIG. 2 are described below with reference to the components shown in FIG. 1.
First, in step S210, the data subsystem 400 may obtain a target object image and a labeling result. For ease of understanding, FIG. 3 is used as an illustration below.
Please refer to FIG. 3, which is a schematic diagram of a target object image and a labeling result according to an embodiment of the present invention. In this embodiment, the target object image 310 may, for example, include only the target objects 310a~310e (for example, people) and no other image components (for example, background). In addition, the labeling result 320 corresponding to the target object image 310 may include labels 320a~320e corresponding to the target objects 310a~310e.
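In embodiments of the present invention, the target objects 310a~310e in the target object image 310 can be understood as the positive data mentioned earlier, but the present invention is not limited thereto. In other embodiments, besides being combined with negative data corresponding to the first fixed camera 100 to train the first target object detection model M1 dedicated to the first fixed camera 100, the target object image 310 (and the labeling result 320) can also be used to train target object detection models dedicated to other fixed cameras. In other words, the target object image 310 (and the labeling result 320) can be used broadly to train the target object detection models of multiple fixed cameras. The details are described later.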
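Next, in step S220, the data subsystem 400 may obtain a first specific image of the first specific scene 199 captured by the first fixed camera 100. Then, in step S230, the data subsystem 400 may obtain, according to the first specific image, a first background image corresponding to the first specific image. For ease of understanding, FIG. 4 is used as an illustration below.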
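Please refer to FIG. 4, which is a schematic diagram of a first specific image and its corresponding first background image according to an embodiment of the present invention. In this embodiment, the first specific image 410 is, for example, an image of the first specific scene 199 captured by the first fixed camera 100. In different embodiments, the first fixed camera 100 may send captured images to the data subsystem 400 as first specific images 410 periodically or aperiodically, so as to reduce the associated bandwidth requirement, but the present invention is not limited thereto.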
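After obtaining multiple first specific images 410 of the same scene (that is, the first specific scene 199), the data subsystem 400 may obtain the first background image 420 corresponding to the one or more first specific images 410 through, for example, image averaging, background separation, or an image processing technique with a similar effect. As shown in FIG. 4, the obtained first background image 420 includes only the background objects in the first specific scene 199 (for example, objects 420a~420c) and no foreground objects and/or target objects (for example, people), but the present invention is not limited thereto. In embodiments of the present invention, the first background image 420 can be understood as the negative data mentioned earlier, but the present invention is not limited thereto.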
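The background extraction step can be illustrated with a minimal sketch. It assumes grayscale frames stored as nested lists and uses a per-pixel temporal median, which is only one possible "technique with a similar effect" to plain averaging; the function name `estimate_background` and the image format are hypothetical and do not come from the patent.

```python
from statistics import median
from typing import List

# Assumed image format: grayscale image as rows of pixel intensities.
Image = List[List[int]]


def estimate_background(frames: List[Image]) -> Image:
    """Estimate a static background as the per-pixel temporal median.

    All frames are assumed to come from the same fixed camera view,
    so a pixel is covered by a moving object in only a few frames.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [int(median(frame[y][x] for frame in frames)) for x in range(width)]
        for y in range(height)
    ]


# Three 2x3 frames; a bright moving object (255) covers a different pixel in each.
frames = [
    [[10, 255, 10], [20, 20, 20]],
    [[10, 10, 255], [20, 20, 20]],
    [[10, 10, 10], [255, 20, 20]],
]
background = estimate_background(frames)
```

The median is chosen here because a moving foreground object occupies any given pixel in only a minority of frames, so it is suppressed without blurring into the result as it would with a mean.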
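Next, in step S240, the data subsystem 400 may synthesize the target objects 310a~310e and the first background image 420 into a first training image, which can be understood as including both positive data and negative data. Then, in step S250, the training subsystem 300 may train the first target object detection model M1 dedicated to the first fixed camera 100 based on the first training image and the labeling result 320 corresponding to each of the target objects 310a~310e. For ease of understanding, FIG. 5 is used as an illustration below.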
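Please refer to FIG. 5, which is a schematic diagram of the first training image and the corresponding labeling result, based on FIG. 3 and FIG. 4. In this embodiment, the first training image 510 is obtained, for example, by the data subsystem 400 inserting the target objects 310a~310e into the first background image 420, but the present invention is not limited thereto. In some embodiments, when the data subsystem 400 synthesizes the target objects 310a~310e with the first background image 420, it may also apply data augmentation to the target objects 310a~310e, such as scaling, rotation, translation, color adjustment, or partial cropping.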
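As a rough illustration of the synthesis step, the sketch below pastes masked object patches onto a background at random positions (translation only, as the simplest of the augmentations listed) and emits a bounding-box label per object. The names `paste_object` and `synthesize`, the nested-list image format, and the `(x, y, width, height)` label layout are all assumptions made for this example, not details from the patent.

```python
import random
from typing import List, Tuple

Image = List[List[int]]           # assumed grayscale format
Mask = List[List[bool]]           # True where a pixel belongs to the object
Box = Tuple[int, int, int, int]   # hypothetical label: (x, y, width, height)


def paste_object(background: Image, patch: Image, mask: Mask,
                 x: int, y: int) -> Tuple[Image, Box]:
    """Return a copy of the background with the patch pasted at (x, y)."""
    ph, pw = len(patch), len(patch[0])
    bh, bw = len(background), len(background[0])
    if x < 0 or y < 0 or x + pw > bw or y + ph > bh:
        raise ValueError("patch does not fit inside the background")
    out = [row[:] for row in background]  # leave the input untouched
    for dy in range(ph):
        for dx in range(pw):
            if mask[dy][dx]:
                out[y + dy][x + dx] = patch[dy][dx]
    return out, (x, y, pw, ph)


def synthesize(background: Image,
               objects: List[Tuple[Image, Mask, str]],
               rng: random.Random) -> Tuple[Image, List[Tuple[str, Box]]]:
    """Paste every object at a random position and collect its label."""
    image, labels = background, []
    for patch, mask, name in objects:
        x = rng.randint(0, len(background[0]) - len(patch[0]))
        y = rng.randint(0, len(background) - len(patch))
        image, box = paste_object(image, patch, mask, x, y)
        labels.append((name, box))
    return image, labels


background = [[0] * 6 for _ in range(4)]
person = ([[9, 9], [9, 9]], [[True, True], [True, False]], "person")
train_image, train_labels = synthesize(background, [person], random.Random(0))
```

Because the label is derived from the paste position, no manual annotation of the synthesized image is needed; other augmentations (scaling, rotation, color adjustment, partial cropping) would have to update the patch, mask, and box consistently.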
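In this way, the data subsystem 400 can generate training data dedicated to the first fixed camera 100, and the training subsystem 300 can use it together with the corresponding labeling result 320 to train the first target object detection model M1 dedicated to the first fixed camera 100.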
Furthermore, unlike conventional general-purpose object detection models, the first target object detection model M1 trained by the method of the present invention is dedicated to detecting target objects (for example, people) appearing in the first specific scene 199, which can effectively improve detection performance and reduce the probability of false detections.
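In one embodiment, to reduce the bandwidth requirement, the detection subsystem 200 may also be deployed in the first fixed camera 100, but the present invention is not limited thereto. After the training subsystem 300 finishes training the first target object detection model M1, it may transmit the first target object detection model M1 to the detection subsystem 200. The detection subsystem 200 may then obtain a first image IM1 of the first specific scene 199 captured by the first fixed camera 100 and use the first target object detection model M1 to detect target objects (for example, people) appearing in the first image IM1. In some embodiments, after detecting the target objects in the first image IM1, the detection subsystem 200 may generate a corresponding detection result, which may include information such as the category, size, position, contour, and confidence value of each target object, but the present invention is not limited thereto.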
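One way to picture such a detection result is as a small record per detected object. The sketch below is only illustrative: the `Detection` class, its field names and types, and the confidence-threshold filter are assumptions for this example and are not specified by the patent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Detection:
    """One detected target object (field names are hypothetical)."""
    category: str                    # e.g. "person"
    position: Tuple[int, int]        # top-left corner (x, y) in pixels
    size: Tuple[int, int]            # (width, height) in pixels
    confidence: float                # score in [0, 1]
    contour: List[Tuple[int, int]] = field(default_factory=list)


def keep_confident(detections: List[Detection],
                   threshold: float = 0.5) -> List[Detection]:
    """Drop detections below a confidence threshold (an assumed post-processing step)."""
    return [d for d in detections if d.confidence >= threshold]


results = [
    Detection("person", (12, 30), (8, 20), 0.91),
    Detection("person", (40, 5), (6, 15), 0.32),
]
confident = keep_confident(results)
```

Keeping the result compact in this way matters when detection runs on the camera itself, since only these records, rather than full images, need to leave the device.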
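In other embodiments, the first image IM1 may also serve as one of the first specific images 410 mentioned earlier, so as to further refine the first background image 420. The data subsystem 400 may then synthesize the refined first background image 420 with target objects from other target object images into additional training images, which the training subsystem 300 can use to further train the first target object detection model M1 and obtain better detection performance, but the present invention is not limited thereto.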
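The patent does not say how the background is refined with a new image. One simple possibility, shown here purely as an assumption, is an exponential running average in which each new frame nudges the stored background; `update_background` and the blending weight `alpha` are hypothetical names.

```python
from typing import List

Image = List[List[float]]  # assumed grayscale format with float intensities


def update_background(background: Image, new_frame: Image,
                      alpha: float = 0.1) -> Image:
    """Blend a new frame into the background: (1 - alpha) * old + alpha * new."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, new_frame)
    ]


background = [[10.0, 20.0], [30.0, 40.0]]
frame_im1 = [[20.0, 20.0], [30.0, 80.0]]
refined = update_background(background, frame_im1, alpha=0.5)
```

A small `alpha` lets the background follow slow changes such as lighting while limiting the influence of any target object that happens to be present in a single frame.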
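In addition, as mentioned earlier, the system 10 of the present invention can also use the target object image 310 to train target object detection models dedicated to other fixed cameras. For example, the data subsystem 400 may be configured to: obtain a second specific image of a second specific scene captured by a second fixed camera; obtain, according to the second specific image, a second background image corresponding to the second specific image; and synthesize the target objects and the second background image into a second training image. The training subsystem 300 may then train a second target object detection model dedicated to the second fixed camera based on the second training image and the labeling result corresponding to each target object. For the details of these technical means, refer to the descriptions in the previous embodiments, which are not repeated here.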
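The reuse of one set of labeled target objects across several cameras can be sketched as a loop over per-camera backgrounds. Everything here is a hypothetical outline: `train_per_camera` takes the synthesis and training steps as callables, and the toy stand-ins below exist only so the example runs; they are not real image or model code.

```python
from typing import Callable, Dict, List


def train_per_camera(shared_objects: List[str],
                     backgrounds: Dict[str, str],
                     synthesize: Callable,
                     train: Callable) -> Dict[str, object]:
    """Train one dedicated model per camera from the same shared objects."""
    models = {}
    for camera_id, background in backgrounds.items():
        # Same positive data for every camera, different negative data.
        image, labels = synthesize(background, shared_objects)
        models[camera_id] = train(image, labels)
    return models


# Toy stand-ins (assumptions) for the synthesis and training steps.
def toy_synthesize(background, objects):
    return f"{background}+{len(objects)} objects", list(objects)


def toy_train(image, labels):
    return {"trained_on": image, "num_labels": len(labels)}


models = train_per_camera(
    ["person_a", "person_b"],
    {"camera_1": "lobby", "camera_2": "car park"},
    toy_synthesize,
    toy_train,
)
```

The point of the structure is that the labeled positive data is prepared once, while each camera contributes only its own unlabeled background.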
In summary, the system and method proposed by the present invention can build a corresponding detection model for each scene without manual data labeling for the different scenes. They are easy to use, can effectively improve detection performance, and can reduce the probability of false detections.
Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the present invention. Anyone with ordinary knowledge in the technical field may make minor changes and refinements without departing from the spirit and scope of the present invention. The scope of protection of the present invention is therefore defined by the appended claims.
100: first fixed camera
199: first specific scene
200: detection subsystem
300: training subsystem
310: target object image
310a~310e: target objects
320: labeling result
320a~320e: labels
400: data subsystem
410: first specific image
420: first background image
420a~420c: objects
510: first training image
IM1: first image
M1: first target object detection model
S210~S250: steps
FIG. 1 is a schematic diagram of a system for training an object detection model according to an embodiment of the present invention.
FIG. 2 is a flowchart of a method for training an object detection model according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a target object image and a labeling result according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of a first specific image and its corresponding first background image according to an embodiment of the present invention.
FIG. 5 is a schematic diagram of the first training image and the corresponding labeling result, based on FIG. 3 and FIG. 4.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW109105277A TWI780409B (en) | 2020-02-19 | 2020-02-19 | Method and system for training object detection model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW109105277A TWI780409B (en) | 2020-02-19 | 2020-02-19 | Method and system for training object detection model |
Publications (2)
Publication Number | Publication Date |
---|---|
TW202133037A TW202133037A (en) | 2021-09-01 |
TWI780409B true TWI780409B (en) | 2022-10-11 |
Family
ID=78777658
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW109105277A TWI780409B (en) | 2020-02-19 | 2020-02-19 | Method and system for training object detection model |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI780409B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106991684A (en) * | 2017-03-15 | 2017-07-28 | 上海信昊信息科技有限公司 | Foreground extracting method and device |
CN108460414A (en) * | 2018-02-27 | 2018-08-28 | 北京三快在线科技有限公司 | Generation method, device and the electronic equipment of training sample image |
US20190251401A1 (en) * | 2018-02-15 | 2019-08-15 | Adobe Inc. | Image composites using a generative adversarial neural network |
CN110163285A (en) * | 2019-05-23 | 2019-08-23 | 阳光保险集团股份有限公司 | Ticket recognition training sample synthetic method and computer storage medium |
CN110472544A (en) * | 2019-08-05 | 2019-11-19 | 上海英迈吉东影图像设备有限公司 | A kind of training method and system of article identification model |
-
2020
- 2020-02-19 TW TW109105277A patent/TWI780409B/en active
Also Published As
Publication number | Publication date |
---|---|
TW202133037A (en) | 2021-09-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
GD4A | Issue of patent certificate for granted invention patent |