TWI672639B

TWI672639B - Object recognition system and method using simulated object images

Info

Publication number: TWI672639B
Application number: TW107141572A
Authority: TW
Inventors: 陳昱達; 梁峰銘; 鄭景鴻
Original assignee: 台達電子工業股份有限公司
Priority date: 2018-11-22
Filing date: 2018-11-22
Publication date: 2019-09-21
Also published as: TW202020736A

Abstract

The present invention provides an object recognition method using simulated object images. The method comprises: (A) obtaining an object image set including one or more object images and a background image set including a plurality of background images; (B) generating, from the object image set and the background image set, a simulated object image set including a plurality of simulated object images; (C) training a target object recognition model on the simulated object image set; and (D) inputting an image under test, obtained from a scene under test, into the target object recognition model to obtain an object recognition result.

Description

Object recognition system using simulated object image and method thereof

The present invention relates to object recognition, and more particularly to an object recognition system using simulated object images and a method thereof.

Training a recognition model relies on a large amount of labeled data, and both the quantity and the quality of that data affect the recognition rate of the trained model. For some tasks or fields, such data can be accumulated over a long period of collection and then used to solve problems in that field. Time must therefore be spent collecting, classifying, and labeling data before a model can be trained.

For a recognition system, the recognition rate depends on whether enough data samples are available; the more diverse the samples, the better the system copes with the problems encountered at each site. Building a good recognition model therefore consumes a great deal of time in collecting and labeling data. When the recognition rate at a particular site falls short of the required standard, data from that site can be collected and used for targeted training and tuning to raise the recognition rate there, but this lengthens the overall deployment time and raises the initial deployment cost. In regions with stricter personal data protection, large amounts of data are hard to obtain, so even more resources must be spent on data collection.

The present invention provides an object recognition system using simulated object images, and a method thereof, to solve the above problems of conventional recognition systems.

The present invention further provides an object recognition system using simulated object images, comprising: a non-volatile memory for storing an object recognition program; and a computing unit for executing the object recognition program to perform the following steps: obtaining an object image set including a plurality of object images and a background image set including a plurality of background images; generating, from the object image set and the background image set, a simulated object image set including a plurality of simulated object images; training a target object recognition model on the simulated object image set; and inputting an image under test, obtained from a scene under test, into the target object recognition model to obtain an object recognition result.

The following description presents several embodiments of the invention. It introduces the basic concepts of the invention and is not intended to limit its content. The actual scope of the invention is defined by the claims.

FIG. 1 is a block diagram of an object recognition system according to an embodiment of the present invention.

In an embodiment, the object recognition system 100 may be implemented in an electronic device such as a personal computer, a server, or a portable device. The object recognition system 100 includes a computing unit 110, an image capture device 120, a storage unit 130, and a display 150.

The computing unit 110 may be implemented in various ways, for example as dedicated hardware circuitry or as general-purpose hardware (for example a single processor, multiple processors capable of parallel processing, a graphics processor, or another processor with computing capability), and provides the functions described below when executing program code or software related to the models and processes of the invention. The image capture device 120 is, for example, a camera that captures an image under test of a scene under test.

The storage unit 130 includes a volatile memory 131 and a non-volatile memory 132. The non-volatile memory 132 stores the databases of the various image sets, the data required by the object recognition process, and program code such as algorithms and/or object recognition models. The non-volatile memory 132 may be, for example, a hard disk drive, a solid-state disk, a flash memory, or a read-only memory, but the invention is not limited thereto. The volatile memory 131 may be a random access memory, for example a static random access memory (SRAM) or a dynamic random access memory (DRAM), but the invention is not limited thereto. The volatile memory 131 may, for example, temporarily hold intermediate data and images produced during the object recognition process.

In an embodiment, the non-volatile memory 132 stores an object recognition program 133, and the computing unit 110 reads the object recognition program 133 from the non-volatile memory 132 into the volatile memory 131 and executes it. The object recognition program 133 includes the program code of an object recognition method.

The display 150 may be a display panel (for example a thin-film liquid crystal display panel, an organic light-emitting diode panel, or another panel capable of display) that shows input characters, numbers, symbols, the track of a dragged mouse cursor, or the user interface provided by an application, for viewing by the user. The object recognition system 100 may further include an input device (not shown), such as a mouse, a stylus, or a keyboard, with which the user performs the corresponding operations, but the invention is not limited thereto.

In an embodiment, the non-volatile memory 132 further includes a first database 135, a second database 136, a third database 137, a fourth database 138, a fifth database 139, a sixth database 140, and a target object recognition model 141. For example, the first database 135 stores a plurality of object scene images. Each object scene image may include one or more types of objects, such as text (for example A to Z, 0 to 9, or other fonts), human bodies, license plates, components, and signs, but the invention is not limited thereto.

The second database 136 stores a plurality of background images, for example as a background image set. The background images may be real background images of any real scene captured under different shooting conditions; they are not limited to background images of the scene under test and need not contain the object under test. In some embodiments, the background images further include virtual background images simulated with computer vision techniques.

The third database 137 stores a plurality of object images, for example as an object image set, where each object image may be extracted from the object scene images in the first database 135. The fourth database 138 stores a plurality of simulated object images, for example as a simulated object image set.

The computing unit 110 generates the simulated object image set in the fourth database 138 from the object image set in the third database 137 and the background image set in the second database 136; the details are described later.

FIGS. 2A-2M are schematic diagrams of the different images used in the object recognition process according to an embodiment of the present invention. Refer to FIG. 1 and FIGS. 2A-2M together. For ease of explanation, the object under test in the embodiments described below is a license plate.

Each object scene image stored in the first database 135 may be, for example, a real license plate image, and the images should, for example, cover all license plate characters (for example A to Z, 0 to 9, or other fonts), as shown in FIG. 2A. The computing unit 110 may, for example, perform image extraction on each object scene image to obtain an image of each character on the license plate (that is, an object image), as shown in FIG. 2B. The computing unit 110 then uses optical character recognition (OCR) or another object recognition technique to obtain all license plate characters, each as a separate object image, as shown in FIG. 2C: 10 object images of digits and 26 object images of letters. The object images of all license plate characters may be stored, for example, in the third database 137.

Next, the computing unit 110 composes one or more training objects from one or more object images according to a predetermined rule. Because this embodiment uses a license plate as the example, the predetermined rule is a license plate specification, covering for example the plate length and width, character spacing, character restrictions, character layout, character color, plate color, and the size and position of the screw holes. FIG. 2D shows the specification for the license plate of a car (private passenger car), but the invention is not limited to car license plates; plates of other vehicle types may also be used, such as large heavy motorcycles, ordinary heavy motorcycles, buses, and large trucks. That is, each vehicle type has its own license plate specification, and the computing unit 110 may, according to the selected specification, use different combinations of the object images of the license plate characters to generate one or more training objects (for example simulated license plate images), as shown in FIG. 2E. Note that the simulated license plate image in FIG. 2E is composed of object images of different license plate characters from the third database 137, and that no noise, blur, shape variation, or other image features of a real scene have yet been added to it.
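The composition step above can be sketched as follows. This is only an illustration under stated assumptions: the rule format `PLATE_RULE`, the tiny stand-in glyph bitmaps, and the function names are hypothetical and are not taken from the patent, which does not specify a data format for the license plate specification.

```python
import random

# Hypothetical rule: three letters, a separator, four digits, fixed spacing.
PLATE_RULE = {"pattern": "LLL-DDDD", "spacing": 1, "background": 255}

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS = "0123456789"


def random_plate_text(rule, rng):
    """Draw a character string that satisfies the rule's pattern."""
    out = []
    for symbol in rule["pattern"]:
        if symbol == "L":
            out.append(rng.choice(LETTERS))
        elif symbol == "D":
            out.append(rng.choice(DIGITS))
        else:
            out.append(symbol)  # literal separator
    return "".join(out)


def compose_plate(text, glyphs, rule):
    """Tile per-character bitmaps side by side, separated by background pixels."""
    height = len(next(iter(glyphs.values())))
    rows = [[] for _ in range(height)]
    for i, ch in enumerate(text):
        for r in range(height):
            if i > 0:
                rows[r].extend([rule["background"]] * rule["spacing"])
            rows[r].extend(glyphs[ch][r])
    return rows


# Stand-in glyphs (3 rows x 2 columns each); real ones would be the
# character object images held in the third database.
glyphs = {ch: [[ord(ch)] * 2 for _ in range(3)] for ch in LETTERS + DIGITS + "-"}

text = random_plate_text(PLATE_RULE, random.Random(0))
plate = compose_plate(text, glyphs, PLATE_RULE)
```

Keeping the rule as data rather than code would let one plate specification per vehicle type be swapped in without changing the composition function.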

The computing unit 110 then performs first image processing to add one or more object image features and one or more background image features to the simulated license plate image (that is, the training object). An object image feature is, for example, a visual difference that the object under test exhibits in a real scene because of its environment. Object image features include, for example, blurriness, scratches or stains, shadow, shading (occlusion), overexposure, distortion, and color aberration, but the invention is not limited thereto. FIG. 2F is a schematic diagram of license plates containing different object image features. Because both the object image features and the background image features comprise many different types, the computing unit 110 may perform the first image processing to add one or more object image features to each training object (for example a simulated license plate image) and thereby generate one or more simulated objects under test (processed simulated license plate images). For example, FIGS. 2H-1 to 2H-6 show the simulated objects under test obtained by adding scratches, color aberration, shadow, blur, noise, and distortion, respectively, to the simulated license plate image of FIG. 2E. Note that the invention is not limited to adding only one object image feature to each training object (for example a simulated license plate image).
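A minimal sketch of adding one or more object image features to a training object is given below. The three feature functions are crude placeholders on 8-bit grayscale pixel grids, and the registry `OBJECT_FEATURES`, the function names, and the `max_features` parameter are assumptions made for illustration only.

```python
import random


def _map(image, fn):
    """Return a new image with fn applied to every pixel."""
    return [[fn(p) for p in row] for row in image]


def shadow(image):
    return _map(image, lambda p: int(p * 0.5))


def overexpose(image):
    return _map(image, lambda p: min(255, int(p * 1.5)))


def color_shift(image):
    return _map(image, lambda p: min(255, p + 20))


# Hypothetical registry of object image features.
OBJECT_FEATURES = {
    "shadow": shadow,
    "overexposure": overexpose,
    "color_shift": color_shift,
}


def add_object_features(image, rng, max_features=2):
    """Apply between one and max_features randomly chosen features in sequence."""
    count = rng.randint(1, max_features)
    names = rng.sample(sorted(OBJECT_FEATURES), count)
    for name in names:
        image = OBJECT_FEATURES[name](image)
    return image, names


plate = [[200, 100], [50, 0]]
simulated, applied = add_object_features(plate, random.Random(1))
```

Returning the list of applied feature names alongside the image makes it possible to record how each simulated object under test was produced.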

A background image feature may be, for example, noise present in images captured in a real scene, and background image features may also be called environmental noise features. Background image features include, for example, blur, scratches or stains, shadow, noise, shading (occlusion), overexposure, distortion, and color aberration, but the invention is not limited thereto. FIG. 2G is a schematic diagram of real scenes containing different background image features. The object image features and background image features are described in detail later.

In some embodiments, the computing unit 110 may perform the first image processing to add one or more object image features and one or more background image features to each training object (for example a simulated license plate image) to generate one or more simulated objects under test. For example, besides the object image features that may appear on the license plate itself, a license plate image in a real scene is also affected by the environmental noise of the background, so the computing unit 110 may add both kinds of features to each training object.

In an embodiment, the background images in the background image set stored in the second database 136 are, for example, as shown in FIG. 2I. Note that the background images in FIG. 2I need not contain a license plate.

Next, the computing unit 110 randomly selects a background image from the background image set stored in the second database 136. The selected background image may be all of one real background image in the set or a part of it (for example a region of interest), as shown in FIGS. 2J-1 and 2J-2, respectively. Taking the region-of-interest background image of FIG. 2J-2 as the basis (the first background image), the computing unit 110 performs second image processing to add one or more background image features to the first background image and generate a simulated background image. For example, the computing unit 110 may add one or more background image features such as blur, scratches or stains, shadow, noise, occlusion, overexposure, and distortion to the first background image, so that the scene of the first background image takes on image features that were not originally captured. A smaller number of background images can therefore reproduce the appearance of the background environment under different shooting conditions.
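The random selection of a whole background or of a region of interest can be sketched as follows. Images are represented as lists of pixel rows, and the function name `pick_background` and the `roi_size` parameter are illustrative assumptions rather than anything specified in the patent.

```python
import random


def pick_background(backgrounds, rng, roi_size=None):
    """Pick one background at random; return all of it, or a random crop
    of roi_size = (rows, columns) when a region of interest is requested."""
    image = rng.choice(backgrounds)
    if roi_size is None:
        return image
    roi_h, roi_w = roi_size
    height, width = len(image), len(image[0])
    if roi_h > height or roi_w > width:
        raise ValueError("ROI larger than background image")
    top = rng.randint(0, height - roi_h)
    left = rng.randint(0, width - roi_w)
    return [row[left:left + roi_w] for row in image[top:top + roi_h]]


# One 4 x 6 stand-in background whose pixel values encode their position.
backgrounds = [[[r * 10 + c for c in range(6)] for r in range(4)]]
first_background = pick_background(backgrounds, random.Random(0), roi_size=(2, 3))
```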

In the foregoing embodiments, the computing unit 110 may perform the first image processing to add one or more object image features and one or more background image features to each training object (for example a simulated license plate image) to generate one or more simulated objects under test, and perform the second image processing to add one or more background image features to the first background image to generate a simulated background image. The simulated object under test is produced by adding features to the license plate alone, and the simulated background image is produced by adding features to the first background image alone, so the two may have no relation to each other. The computing unit 110 therefore performs an image synthesis process that adds the simulated object under test to the simulated background image to generate a simulated composite image, as shown in FIG. 2K.

For example, the image synthesis process may resize the simulated object under test to a suitable image size, paste it at any position in the simulated background image (for example a position within a predetermined range of the simulated background image), and apply edge smoothing to the pasted object to generate a simulated composite image. Note that the simulated object under test pasted into the simulated background image does not itself carry the image features of the simulated scene. The computing unit 110 therefore performs the second image processing again to add one or more background image features to the simulated composite image and generate a simulated object image; this step strengthens the consistency between the simulated object under test and the background before the simulated object image is used for training. FIGS. 2L-1 to 2L-4 show the results of adding blur, interference, salt-and-pepper noise, and Gaussian noise, respectively, to the simulated composite image. The simulated object image shown in FIG. 2M is, for example, the result of combining the different background image features of FIGS. 2L-1 to 2L-4. In this process, overlaying the simulated object under test on arbitrary background images increases the complexity of the license plate's background, which helps the subsequent training of the object recognition model.
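A minimal sketch of the paste, edge smoothing, and noise steps is shown below, again on 8-bit grayscale pixel grids. The 50/50 border blend is only a stand-in for whatever edge smoothing an implementation would actually use, resizing is omitted, and the function names and the `edge_blend` and `amount` parameters are assumptions.

```python
import random


def paste(background, obj, top, left, edge_blend=0.5):
    """Paste obj onto a copy of background; blend the object's border pixels
    with the background beneath them as a crude form of edge smoothing."""
    out = [row[:] for row in background]
    h, w = len(obj), len(obj[0])
    if top < 0 or left < 0 or top + h > len(out) or left + w > len(out[0]):
        raise ValueError("object does not fit at the requested position")
    for r in range(h):
        for c in range(w):
            if r in (0, h - 1) or c in (0, w - 1):
                under = background[top + r][left + c]
                out[top + r][left + c] = round(
                    edge_blend * obj[r][c] + (1 - edge_blend) * under
                )
            else:
                out[top + r][left + c] = obj[r][c]
    return out


def salt_and_pepper(image, rng, amount=0.05):
    """Set each pixel to 0 or 255 with probability `amount`."""
    out = [row[:] for row in image]
    for r in range(len(out)):
        for c in range(len(out[0])):
            if rng.random() < amount:
                out[r][c] = rng.choice((0, 255))
    return out


background = [[100] * 6 for _ in range(5)]
plate = [[200] * 3 for _ in range(3)]
composite = paste(background, plate, top=1, left=2)
simulated_object_image = salt_and_pepper(composite, random.Random(0))
```

Applying the noise after the paste, rather than before it, is what gives the plate and the background the same scene-level degradation.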

The computing unit 110 may select different combinations of object image features and background image features, select different real background images, and repeat the process of the foregoing embodiments to generate different simulated object images. The computing unit 110 thus obtains a plurality of simulated object images that form the simulated object image set, and stores them in the fourth database 138.
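One way to enumerate the combinations described above is sketched here. The feature names, the size limits, and the "recipe" dictionary are assumptions for illustration; the patent does not say whether combinations are enumerated exhaustively or sampled.

```python
from itertools import combinations, product

# Hypothetical feature names.
OBJECT_FEATURES = ["blur", "scratch", "shadow"]
BACKGROUND_FEATURES = ["blur", "noise"]


def nonempty_subsets(items, max_size):
    """Yield every combination of 1..max_size items."""
    for size in range(1, max_size + 1):
        yield from combinations(items, size)


def enumerate_recipes(background_ids, max_object=2, max_background=1):
    """Yield one recipe per (background, object features, background features)."""
    for bg, obj_set, bg_set in product(
        background_ids,
        nonempty_subsets(OBJECT_FEATURES, max_object),
        nonempty_subsets(BACKGROUND_FEATURES, max_background),
    ):
        yield {
            "background": bg,
            "object_features": obj_set,
            "background_features": bg_set,
        }


recipes = list(enumerate_recipes(["bg0", "bg1"]))
```

With two backgrounds, six object feature sets, and two background feature sets, this yields 24 recipes, which illustrates how a small number of source images can expand into a much larger simulated set.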

Next, the computing unit 110 trains the target object recognition model 141 on the simulated object image set in the fourth database 138. For example, the computing unit 110 may use techniques such as a support vector machine (SVM), a convolutional neural network, or a deep neural network to train the target object recognition model 141, but the invention is not limited thereto. Note that during the initial training of the target object recognition model 141, the computing unit 110 uses only the simulated object images in the simulated object image set. Because the simulated object images are obtained by simulating variations of different scenes and different training objects (for example simulated license plate images), they can broadly cover situations that cannot be captured by shooting on site. The computing unit 110 can therefore train the target object recognition model 141 on the simulated object images instead of on images of real scenes.

In an embodiment, once the target object recognition model 141 has been trained, the computing unit 110 can input an image under test of a scene under test (for example a scene containing a vehicle), received from an external host or captured by the image capture device 120, into the target object recognition model 141 to obtain an object recognition result, for example the license plate number in the image under test.

In another embodiment, the fifth database 139 in the non-volatile memory 132 stores a test image set including a plurality of test images; the test image set may also be called an unlabeled test image set. The test images are, for example, images captured in real scenes that include a vehicle and its license plate. The computing unit 110 may, for example, input each test image in the test image set into the target object recognition model 141 to obtain the corresponding object recognition result, and store the result for each test image in the fifth database 139 in the non-volatile memory 132. Optionally, the computing unit 110 may mark each test image with its object recognition result and store the marked test images separately in the sixth database 140 in the non-volatile memory 132.

In an embodiment, because of various environmental changes, the recognition results of the target object recognition model 141 cannot be 100% accurate, so a user may manually inspect whether the object recognition result for each test image in the test image set is correct. If the object recognition result for a particular test image is judged to be incorrect, the computing unit 110 may add that test image to the fourth database 138 and input the correct object recognition result for it into the target object recognition model 141, thereby retraining and updating the model and raising its recognition rate in similar situations. Similarly, if the object recognition result obtained for an image under test captured from the scene under test is incorrect, the computing unit 110 may add that image under test to the fourth database 138 and input the correct object recognition result for that image into the target object recognition model 141, thereby retraining and updating the model.

In another embodiment, the user may store each test image together with its correct object recognition result in the fifth database 139 in advance. After the initial training of the target object recognition model 141, the computing unit 110 inputs each test image in the fifth database 139 into the model to produce an object recognition result and compares that result with the stored correct result. If the two do not match (that is, the object recognition result is a "failure"), the computing unit 110 may add the corresponding test image to the fourth database 138 and input the correct object recognition result into the target object recognition model 141, thereby retraining and updating the model and raising its recognition rate.
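The compare-and-retrain loop described above can be sketched as follows. The lookup-table "model" is a toy stand-in for a real SVM or neural network, strings stand in for images, and all function names are hypothetical.

```python
def train_lookup_model(samples):
    """Toy stand-in for real training: memorize image -> label pairs."""
    table = {image: label for image, label in samples}
    return lambda image: table.get(image, "?")


def collect_failures(model, test_images, ground_truth):
    """Return (image, correct_label) pairs for which the model's output is wrong."""
    failures = []
    for image_id, image in test_images.items():
        expected = ground_truth[image_id]
        if model(image) != expected:
            failures.append((image, expected))
    return failures


def retrain_with_failures(train_fn, training_set, failures):
    """Add the failed cases to the training set and train again."""
    updated = training_set + failures
    return train_fn(updated), updated


training_set = [("sim_plate_1", "ABC-1234")]        # simulated images only
model = train_lookup_model(training_set)
test_images = {"t1": "sim_plate_1", "t2": "real_plate_9"}
ground_truth = {"t1": "ABC-1234", "t2": "XYZ-0001"}

failures = collect_failures(model, test_images, ground_truth)
model, training_set = retrain_with_failures(train_lookup_model, training_set, failures)
```

Only the failed real images are added, so the training set stays dominated by simulated images, in line with the role the description gives to real images as a supplementary correction.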

In detail, the training of the target object recognition model 141 in the present invention relies mainly on simulated object images, and images under test from real scenes or the test images in the fifth database 139 may be used to help correct and update the model.

In an embodiment, any object image captured in a real scene (for example a license plate image) may show visual differences caused by the environment. These are the object image features mentioned above, which may also be called features of the object under test (for example the license plate). Object image features include, for example, blur, scratches or stains, shadow, occlusion, overexposure, distortion, and color aberration. Each object image feature may be represented in a different way.

Take the blur feature as an example. A license plate image may be blurred when the vehicle is moving too fast, when focusing fails, or when the vehicle is too far away. The blur feature may therefore be represented by a blur mask, for example an M*N matrix: the M*N block of pixels around a center pixel is weighted by the matrix to obtain the blurred center pixel. For example, the three rows of license plate image pixels under the blur mask are, from left to right and top to bottom, a1 to a3, b1 to b3, and c1 to c3, where b2 is the center pixel, as shown in FIG. 3A. The blur mask may be, for example, a 3x3 matrix, as shown in FIG. 3B, whose coefficients are all 1. The invention is not limited to this blur mask, and other blur masks known in the art may also be used. After the blur mask is applied, the center pixel b2 is updated to b2 = (a1*1 + a2*1 + a3*1 + b1*1 + b2*1 + b3*1 + c1*1 + c2*1 + c3*1) * (1/9).
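The all-ones blur mask with the 1/9 normalization amounts to a box (mean) filter, which can be sketched as below. Leaving border pixels unchanged is a simplification chosen here, not something the description specifies, and the function name `box_blur` is an assumption.

```python
def box_blur(image, size=3):
    """Replace each interior pixel by the mean of its size x size neighborhood.
    Border pixels are left unchanged (a simplification)."""
    half = size // 2
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(half, height - half):
        for c in range(half, width - half):
            total = sum(
                image[r + dr][c + dc]
                for dr in range(-half, half + 1)
                for dc in range(-half, half + 1)
            )
            out[r][c] = total / (size * size)
    return out


# Rows a1..a3, b1..b3, c1..c3; the center pixel b2 becomes the mean of all nine.
patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
blurred = box_blur(patch)
```

Reading from the original image while writing to a copy keeps already-blurred values from feeding into later pixels.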

Take the scratch or stain feature as an example. The characters on a license plate may be scratched or stained; a scratch appears as a straight line or a curve, and a stain appears as an area. The computing unit 110 may therefore use a line equation or a curve equation to simulate a scratch on the license plate, and a plane (area) equation to simulate a stain on the license plate.
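As a rough sketch of these two representations, a scratch can be drawn along the line y = slope*x + intercept and a stain as a darkened region. The choice of an ellipse for the stain region, the darkening factor, and the function names are assumptions for illustration.

```python
def draw_scratch(image, slope, intercept, value=0, half_width=0.5):
    """Overwrite pixels within half_width (vertically) of y = slope*x + intercept."""
    out = [row[:] for row in image]
    for y, row in enumerate(out):
        for x in range(len(row)):
            if abs(y - (slope * x + intercept)) <= half_width:
                row[x] = value
    return out


def draw_stain(image, cx, cy, rx, ry, factor=0.5):
    """Darken pixels inside the ellipse centered at (cx, cy) with radii rx, ry."""
    out = [row[:] for row in image]
    for y, row in enumerate(out):
        for x in range(len(row)):
            if ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1:
                row[x] = int(row[x] * factor)
    return out


plate = [[200] * 5 for _ in range(5)]
scratched = draw_scratch(plate, slope=1, intercept=0)   # diagonal scratch
stained = draw_stain(plate, cx=2, cy=2, rx=1, ry=1)     # small round stain
```

A curved scratch could be obtained the same way by replacing the line expression with a polynomial in x.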

Taking the shadow feature as an example, the interaction between light sources and the environment produces shadows in specific regions of the license plate image. The computing unit 110 can therefore apply a brightness mask to the license plate image to produce a shadow effect. For example, the three rows of license plate image pixels covered by the brightness mask are, from top to bottom, a1 to a3, b1 to b3, and c1 to c3, where b2 is the center pixel, as shown in FIG. 3A. The brightness mask may be, for example, a 3x3 matrix, as shown in FIG. 3C, whose three rows of coefficients are, from left to right and from top to bottom, h1 to h3, i1 to i3, and j1 to j3. The values of h1 to h3, i1 to i3, and j1 to j3 may be positive numbers greater than 1, or less than or equal to 1, depending on the design requirements of the brightness mask. The computing unit 110 can therefore update the license plate image pixel a1 covered by the brightness mask to a1 = a1*h1, the pixel a2 to a2 = a2*h2, and so on.

Taking the occlusion feature as an example, weather (dust, rain, snow) or other objects (fallen leaves, insects, and so on) covering the license plate produce an occlusion effect. The computing unit 110 can therefore use one or more plane equations as masks to occlude part of the license plate image, with the mask size chosen so that the characters on the license plate are not destroyed.

Taking the overexposure feature as an example, light from the vehicle lamps cannot be suppressed and causes overexposure in the area near the lamps. The computing unit 110 can therefore apply a brightness mask to the license plate image to produce an overexposure effect. For example, the three rows of license plate image pixels covered by the brightness mask are, from top to bottom, a1 to a3, b1 to b3, and c1 to c3, where b2 is the center pixel, as shown in FIG. 3A. The brightness mask may be, for example, a 3x3 matrix, as shown in FIG. 3C, whose three rows of coefficients are, from left to right and from top to bottom, h1 to h3, i1 to i3, and j1 to j3. The values of the parameters h1 to h3, i1 to i3, and j1 to j3 may be positive numbers greater than 1, or less than or equal to 1, depending on the design requirements of the brightness mask, and the parameter values of the brightness mask used for the overexposure feature differ from those of the brightness mask used for the shadow feature. The computing unit 110 can therefore update the license plate image pixel a1 covered by the brightness mask to a1 = a1*h1, the pixel a2 to a2 = a2*h2, and so on.
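
The element-wise update a1 = a1*h1, a2 = a2*h2, and so on can be sketched in Python as follows. The same function serves both the shadow and the overexposure features, with only the mask coefficients differing. The function name, the example coefficient values (0.5 and 1.5), and the clamping of results to a maximum of 255 are assumptions for illustration and are not specified in the text.

```python
def apply_brightness_mask(patch, mask, max_value=255):
    """Multiply each pixel by the mask coefficient at the same position."""
    return [
        # Clamp to max_value so brightened pixels stay in the 8-bit range (assumption).
        [min(max_value, round(pixel * coeff)) for pixel, coeff in zip(pixel_row, mask_row)]
        for pixel_row, mask_row in zip(patch, mask)
    ]


patch = [[100, 100, 100],
         [200, 200, 200],
         [100, 200, 100]]

# Coefficients below 1 darken the patch (shadow); coefficients above 1 brighten it.
shadow_mask = [[0.5] * 3 for _ in range(3)]
overexposure_mask = [[1.5] * 3 for _ in range(3)]

shadowed = apply_brightness_mask(patch, shadow_mask)
overexposed = apply_brightness_mask(patch, overexposure_mask)
```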

Taking the deformation feature as an example, different camera viewing angles cause the captured license plate image to appear rotated about three axes (the X, Y, and Z axes). The computing unit 110 can therefore apply a perspective transformation matrix to the license plate image to produce a deformation effect. For example, the computing unit 110 may compute the perspective transformation matrix according to Equation (1).

The computing unit 110 can set the values of the parameters a₁₁ to a₃₃ in the 3x3 matrix as required, and pass a simulated object (for example, a simulated license plate composed of different characters) through the perspective transformation matrix (for example, by replacing the pixel value at (x, y) with the pixel value at (x'/w', y'/w')) to simulate license plate images seen from different viewing angles.
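
A minimal Python sketch of the coordinate mapping follows. Because Equation (1) is not reproduced in this text, the sketch assumes the common homogeneous form in which the 3x3 matrix multiplies the column vector (x, y, 1) to give (x', y', w'); the function name `perspective_map` and the example matrix values are hypothetical.

```python
def perspective_map(matrix, x, y):
    """Map (x, y) through a 3x3 matrix and return (x'/w', y'/w')."""
    # Assumed convention: (x', y', w')^T = matrix * (x, y, 1)^T.
    xp = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    yp = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    wp = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    if wp == 0:
        raise ValueError("point maps to infinity (w' is zero)")
    return xp / wp, yp / wp


identity = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 1]]

# A non-zero a31 makes w' depend on x, which foreshortens the plate horizontally.
tilted = [[1, 0, 0],
          [0, 1, 0],
          [0.5, 0, 1]]

unchanged = perspective_map(identity, 3, 4)
warped = perspective_map(tilted, 2, 4)
```

Applying the mapping to every pixel position of a simulated plate, with suitable interpolation, would yield the warped plate image.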

Taking the chromatic aberration feature as an example, environmental influences on the camera can cause a color deviation when light is imaged through the lens. The computing unit 110 can therefore perform a color space conversion on the license plate image to achieve the color deviation effect.
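
One possible reading of this step is sketched below in Python: a pixel is converted from RGB to HSV, its hue is shifted, and it is converted back. The text does not state which color spaces or parameters are used, so the choice of HSV, the hue offset, and the function name `shift_hue` are all assumptions.

```python
import colorsys


def shift_hue(pixel, hue_offset):
    """Shift the hue of an 8-bit RGB pixel by hue_offset (a fraction of a full turn)."""
    r, g, b = (channel / 255 for channel in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + hue_offset) % 1.0  # hue wraps around the color wheel
    return tuple(round(channel * 255) for channel in colorsys.hsv_to_rgb(h, s, v))


red_shifted = shift_hue((255, 0, 0), 0.5)       # half a turn away from red
gray_shifted = shift_hue((128, 128, 128), 0.25)  # gray has no hue to shift
```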

In an embodiment, the background image features may be, for example, noise arising in images captured in real scenes, and the background image features may also be referred to as environmental noise features. The background image features include, for example, blur, scratches or stains, shadow, noise, occlusion, overexposure, deformation, and chromatic aberration, but the present invention is not limited thereto. Each background image feature may be represented in a different way. It should be noted that some image features have the same name among the object image features and the background image features, and image features with the same name are processed in a similar manner. However, the object image features are applied to each training object (for example, a simulated license plate image), whereas the background image features are applied to the entire background image (which may not include a license plate) or to the simulated composite image, so the parameters set for the corresponding masks, matrices, and equations differ between the object image features and the background image features.

In an embodiment, compared with the object image features, the background image features further include a noise feature. For example, the computing unit 110 can add different types of noise to the image to be processed (for example, a training object, a background image, or a simulated composite image), such as salt-and-pepper noise, Gaussian noise, speckle noise, or periodic noise. For salt-and-pepper noise, the computing unit 110 can set the amount of salt-and-pepper noise to x% of the area of the image to be processed and add the noise at random positions in the image, where the value of x can be adjusted according to actual conditions. For Gaussian noise, speckle noise, and periodic noise, the computing unit 110 can add the noise to the image to be processed using conventional techniques, so the details are not described further here.
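
The salt-and-pepper step can be sketched in Python as follows, assuming a grayscale image stored as a list of rows. The function name, the `seed` parameter, and the equal chance of a noisy pixel becoming black (0) or white (255) are illustrative assumptions.

```python
import random


def add_salt_and_pepper(image, percent, seed=None, low=0, high=255):
    """Set `percent` percent of the pixels, chosen at random, to `low` or `high`."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    count = round(height * width * percent / 100)  # x% of the image area
    # Sample distinct pixel positions so exactly `count` pixels are affected.
    for index in rng.sample(range(height * width), count):
        y, x = divmod(index, width)
        out[y][x] = rng.choice((low, high))
    return out


gray = [[128] * 10 for _ in range(10)]  # hypothetical 10x10 mid-gray image
noisy = add_salt_and_pepper(gray, percent=10, seed=0)
changed = sum(1 for row in noisy for value in row if value != 128)
```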

FIGs. 4A-4F are schematic diagrams of training objects used in the object recognition flow according to another embodiment of the present invention. In this other embodiment, the training objects generated by the computing unit 110 are not limited to simulated license plate images. For example, the training objects may also include human bodies, license plates, components, and signs. In this embodiment, the object scene images stored in the first database 135 are, for example, human body images including one or more body postures; the computing unit 110 identifies the human body region in each object scene image, extracts it as an object image, and stores the extracted object image in the third database 137.

As shown in FIGs. 4A-4F, the object images in the third database 137 may be, for example, human body images obtained against different backgrounds and at different capture positions. In this embodiment, the predetermined rule is, for example, that the object images in the third database 137 can be used directly as training objects, so the computing unit 110 can directly select one of the object images stored in the third database 137 as a training object. In some embodiments, the predetermined rule may be, for example, arranging one or more different object images in a predetermined manner or at a predetermined spacing to produce a training object, but the present invention is not limited thereto. Similarly, when the object to be recognized is text, a component, a sign, or the like, the present invention can store object scene images of the corresponding type in the first database 135, extract object images from the object scene images, generate simulated object images of the corresponding type using the flow of the foregoing embodiments to form a simulated object image set, and then train the test object recognition model 141 according to the simulated object image set.

FIG. 5 is a flowchart of an object recognition method using simulated object images according to an embodiment of the present invention. Please refer to FIG. 1 and FIG. 5 together.

In step S510, an object image set including a plurality of object images and a background image set including a plurality of background images are acquired. The object image set is stored, for example, in the third database 137. The object images may be, for example, images including one or more types of objects, where the objects may be text, human bodies, license plates, components, signs, and so on, but the present invention is not limited thereto. The background image set is stored, for example, in the second database 136. The background images may be, for example, real background images of any real scene captured under different shooting conditions; they are not limited to background images of the scene under test and need not include the object under test. In some embodiments, the background images further include virtual background images simulated using computer vision techniques.

In step S520, a simulated object image set including a plurality of simulated object images is generated according to the object image set and the background image set. For example, the computing unit 110 composes one or more training objects from the one or more object images according to a predetermined rule, and performs a first image processing to add one or more object image features to each of the one or more training objects to generate one or more simulated objects under test. The computing unit 110 can then generate the simulated object image set according to the one or more simulated objects under test and the background image set. The one or more object image features may be extracted, for example, from the object scene images in the first database 135, or may be simulated for the training objects using equations and matrix operations. The computing unit 110 next obtains a first background image from the background image set in the second database 136, and performs a second image processing to add one or more background image features to the first background image to generate a simulated background image. The computing unit 110 can, for example, generate the simulated object image set according to the one or more simulated objects under test and the simulated background image. The computing unit 110 then performs an image synthesis processing to add the simulated object under test to the simulated background image to generate a simulated composite image, and performs the second image processing to add the one or more background image features to the simulated composite image to generate one of the simulated object images.
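
The sequence in step S520 can be summarized with the following Python sketch. It assumes grayscale images stored as lists of rows and features modeled as functions that take an image and return a new image. All names (`paste`, `make_simulated_image`, `dim`) and the returned bounding box used as an annotation are hypothetical and serve only to illustrate the order of operations.

```python
def paste(background, obj, top, left):
    """Image synthesis: copy `obj` onto `background` with its corner at (top, left)."""
    if top < 0 or left < 0 or top + len(obj) > len(background) \
            or left + len(obj[0]) > len(background[0]):
        raise ValueError("object does not fit inside the background")
    out = [row[:] for row in background]
    for dy, row in enumerate(obj):
        for dx, value in enumerate(row):
            out[top + dy][left + dx] = value
    return out


def make_simulated_image(training_object, background,
                         object_features, background_features, top, left):
    """Return one simulated object image and the object's bounding box."""
    simulated_object = training_object
    for feature in object_features:          # first image processing
        simulated_object = feature(simulated_object)
    simulated_background = background
    for feature in background_features:      # second image processing
        simulated_background = feature(simulated_background)
    composite = paste(simulated_background, simulated_object, top, left)
    for feature in background_features:      # second image processing, applied again
        composite = feature(composite)
    # The paste position is known, so the annotation comes for free (assumption).
    box = (top, left, len(simulated_object), len(simulated_object[0]))
    return composite, box


def dim(image):
    """Hypothetical feature: halve every pixel value."""
    return [[value // 2 for value in row] for row in image]


plate = [[200, 200]]
scene = [[100] * 4 for _ in range(3)]
image, box = make_simulated_image(plate, scene, [dim], [dim], top=1, left=1)
```

Repeating this call over many training objects, backgrounds, and feature settings would build up the simulated object image set.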

In step S530, a test object recognition model is trained according to the simulated object image set. For example, in an embodiment, the computing unit 110 may first train the test object recognition model 141 using only the simulated object image set (that is, without using real images for training). In another embodiment, the computing unit 110 may directly add real object images to the simulated object image set to generate a mixed object image set, and train the test object recognition model according to the mixed object image set.

In step S540, an image under test obtained from a scene under test is input into the test object recognition model to obtain an object recognition result. For example, the user may store test images and their correct object recognition results in the fifth database 139 in advance. After the computing unit 110 has trained the test object recognition model 141 in the initial stage, it can input each test image in the fifth database 139 into the test object recognition model 141 to generate an object recognition result, and compare the generated object recognition result with the pre-stored correct object recognition result. If the generated object recognition result does not match the pre-stored correct object recognition result, the object recognition result is a "failure". In addition, when the object recognition result of the test object recognition model 141 for the image under test is a failure, the computing unit 110 can add the image under test to the simulated object image set to generate a mixed object image set, and retrain the test object recognition model 141 according to the mixed object image set and the correct object recognition result of the image under test.
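
The comparison and retraining-set construction described above can be sketched in Python as follows. The model is represented as any callable that maps an image to a recognition result, and training samples are represented as (image, correct result) pairs. These representations and the names `collect_failures`, `build_mixed_set`, and `stub_model` are assumptions for illustration; the actual retraining call depends on the model and is not shown.

```python
def collect_failures(model, test_images, correct_results):
    """Return (image, correct result) pairs for which the model's output is wrong."""
    failures = []
    for image, expected in zip(test_images, correct_results):
        if model(image) != expected:  # a mismatch counts as a "failure"
            failures.append((image, expected))
    return failures


def build_mixed_set(simulated_set, failures):
    """Add the failed images and their correct results to the simulated set."""
    return list(simulated_set) + list(failures)


def stub_model(image):
    """Hypothetical stand-in for the trained model: always outputs the same plate."""
    return "ABC-1234"


test_images = ["img_a", "img_b"]           # placeholders for real image data
correct_results = ["ABC-1234", "XYZ-9876"]
simulated_set = [("sim_1", "ABC-1234")]

failures = collect_failures(stub_model, test_images, correct_results)
mixed_set = build_mixed_set(simulated_set, failures)
```

The resulting mixed set would then be passed to whatever training routine produced the model in the first place.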

In summary, the present invention provides an object recognition system and method using simulated object images, which can extract object features and environmental features from a small number of data images and use them to generate a large number of annotated simulated object images and simulated background images, increasing the diversity of the training data set (for example, the simulated object image set). Because the simulated data closely resembles real data, the method can rely mainly on simulated data, supplemented by real data, which substantially reduces data preparation time and eases the difficulty encountered when data is hard to obtain.

The method of the present invention, or specific forms or portions thereof, may be embodied as program code on physical media such as floppy disks, optical discs, hard disks, or any other machine-readable (for example, computer-readable) storage medium, wherein, when the program code is loaded into and executed by a machine such as a computer, the machine becomes an apparatus or system for practicing the present invention. The method, system, and apparatus of the present invention may also be transmitted as program code over a transmission medium, such as electrical wire or cable, optical fiber, or any other form of transmission, wherein, when the program code is received, loaded, and executed by a machine such as a computer, the machine becomes an apparatus or system for practicing the present invention. When implemented on a general-purpose processor, the program code combines with the processor to provide a unique apparatus that operates analogously to application-specific logic circuits.

Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the scope of the present invention. Anyone having ordinary skill in the art may make minor changes and refinements without departing from the spirit and scope of the present invention, and the scope of protection of the present invention is therefore defined by the appended claims.

100‧‧‧object recognition system

110‧‧‧computing unit

120‧‧‧image capturing device

130‧‧‧storage unit

131‧‧‧volatile memory

132‧‧‧non-volatile memory

133‧‧‧object recognition program

135‧‧‧first database

136‧‧‧second database

137‧‧‧third database

138‧‧‧fourth database

139‧‧‧fifth database

140‧‧‧sixth database

141‧‧‧test object recognition model

150‧‧‧display

S510-S540‧‧‧steps

FIG. 1 is a block diagram of an object recognition system according to an embodiment of the present invention.

FIGs. 2A-2M are schematic diagrams of different images used in the object recognition flow according to an embodiment of the present invention.

FIG. 3A is a schematic diagram of the pixels of a training object covered by a blur mask according to an embodiment of the present invention.

FIG. 3B is a schematic diagram of the coefficients of a blur mask according to an embodiment of the present invention.

FIG. 3C is a schematic diagram of the coefficients of a brightness mask according to an embodiment of the present invention.

FIGs. 4A-4F are schematic diagrams of object images according to another embodiment of the present invention.

FIG. 5 is a flowchart of an object recognition method using simulated object images according to an embodiment of the present invention.

Claims

An object recognition method using simulated object images, the method comprising: (A) acquiring an object image set including one or more object images and a background image set including one or more background images; (B) generating, according to the object image set and the background image set, a simulated object image set including a plurality of simulated object images; (C) training a test object recognition model according to the simulated object image set; and (D) inputting an image under test, obtained from a scene under test, into the test object recognition model to obtain an object recognition result.

The object recognition method using simulated object images according to claim 1, wherein step (B) comprises: composing one or more training objects from the one or more object images according to a predetermined rule; performing a first image processing to add one or more object image features to each of the one or more training objects to generate one or more simulated objects under test; and generating the simulated object image set according to the one or more simulated objects under test and the background image set.

The object recognition method using simulated object images according to claim 2, wherein the one or more object image features are obtained from the object images.

The object recognition method using simulated object images according to claim 2, wherein step (B) further comprises: obtaining a first background image from the one or more background images; performing a second image processing to add one or more background image features to the first background image to generate a simulated background image; and generating the simulated object image set according to the simulated background image and the one or more simulated objects under test.

The object recognition method using simulated object images according to claim 4, wherein step (B) further comprises: performing an image synthesis processing to add the simulated object under test to the simulated background image to generate a simulated composite image; and performing the second image processing to add the one or more background image features to the simulated composite image to generate one of the simulated object images.

The object recognition method using simulated object images according to claim 1, further comprising: (E) when the object recognition result is a failure, adding the image under test to the simulated object image set to generate a mixed object image set; and (F) retraining the test object recognition model according to the mixed object image set and a correct object recognition result of the image under test.

The object recognition method using simulated object images according to claim 1, wherein step (C) further comprises: adding one or more real object images to the simulated object image set to generate a mixed object image set; and training the test object recognition model according to the mixed object image set.

An object recognition system using simulated object images, comprising: a non-volatile memory for storing an object recognition program; and a computing unit for executing the object recognition program to perform the following steps: (A) acquiring an object image set including one or more object images and a background image set including a plurality of background images; (B) generating, according to the object image set and the background image set, a simulated object image set including a plurality of simulated object images; (C) training a test object recognition model according to the simulated object image set; and (D) inputting an image under test, obtained from a scene under test, into the test object recognition model to obtain an object recognition result.

The object recognition system using simulated object images according to claim 8, wherein in step (B), the computing unit further composes one or more training objects from the one or more object images according to a predetermined rule, and performs a first image processing to add one or more object image features to each of the one or more training objects to generate one or more simulated objects under test, and the computing unit further generates the simulated object image set according to the one or more simulated objects under test and the background image set.

The object recognition system using simulated object images according to claim 9, wherein the one or more object image features are obtained from the object images.

The object recognition system using simulated object images according to claim 9, wherein in step (B), the computing unit further obtains a first background image from the background images and performs a second image processing to add one or more background image features to the first background image to generate a simulated background image, and the computing unit further generates the simulated object image set according to the simulated background image and the one or more simulated objects under test.

The object recognition system using simulated object images according to claim 11, wherein in step (B), the computing unit further performs an image synthesis processing to add the simulated object under test to the simulated background image to generate a simulated composite image, and performs the second image processing to add the one or more background image features to the simulated composite image to generate one of the simulated object images.

The object recognition system using simulated object images according to claim 8, wherein the computing unit further performs the following steps: (E) when the object recognition result is a failure, adding the image under test to the simulated object image set to generate a mixed object image set; and (F) retraining the test object recognition model according to the mixed object image set and a correct object recognition result of the image under test.

The object recognition system using simulated object images according to claim 8, wherein in step (C), the computing unit further adds one or more real object images to the simulated object image set to generate a mixed object image set, and trains the test object recognition model according to the mixed object image set.