TWI723383B - Image recognition device and method

Info

Publication number: TWI723383B
Application number: TW108114041A
Authority: TW
Inventors: 杜承翰
Original assignee: 鴻齡科技股份有限公司
Priority date: 2019-04-22
Filing date: 2019-04-22
Publication date: 2021-04-01
Also published as: TW202040424A

Abstract

An image recognition method includes: obtaining a video stream with a first processing module; preprocessing the video stream with the first processing module to obtain an image queue arranged in frame playback order; storing the image queue in a storage module; and reading an image frame of the image queue from the storage module with a second processing module and recognizing the image frame to detect at least one target in the image frame, wherein the second processing module reads the image frames one by one according to the frame order of the image queue. An image recognition device is also provided.

Description

Image recognition device and method

The present invention relates to the field of image processing technology, and in particular to an image recognition device and method.

With the development of high-definition display technology, 4K video playback devices (pixel resolution 4096*2160) have appeared on the market, and 8K video playback devices (pixel resolution 7680*4320) may be marketed in the future. The frame rate of these video playback devices is generally 30 FPS or higher. When one or more targets must be detected in every frame, the high pixel count and high frame rate of the video make the recognition workload large and require fast recognition. The processor of a conventional image recognition module has limited processing speed and cannot perform real-time target detection on every ultra-high-definition frame that is played.

In view of this, it is necessary to provide an image recognition device and method that can improve image recognition efficiency and perform real-time target detection on every ultra-high-definition frame.

An embodiment of the present invention provides an image recognition method applied to an image recognition device. The image recognition device includes a first processing module, a storage module, and a second processing module. The method includes: using the first processing module to obtain a video stream to be processed, and preprocessing the video stream to obtain an image queue arranged in frame playback order; storing the image queue in the storage module; and using the second processing module to read an image frame to be recognized from the storage module and recognize the image frame to detect at least one target in the image frame; wherein the second processing module reads the image frames sequentially according to their order in the image queue.

An embodiment of the present invention also provides an image recognition device including a first processing module, a storage module, a second processing module, and a communication bus. The first processing module, the storage module, and the second processing module communicate with each other through the communication bus. The first processing module is used to obtain a video stream to be processed and to preprocess the video stream to obtain an image queue arranged in frame playback order. The first processing module is also used to store the image queue in the storage module. The second processing module is used to read an image frame to be recognized from the storage module and to recognize the image frame to detect at least one target in the image frame; wherein the second processing module reads the image frames sequentially according to their order in the image queue.

Compared with the prior art, the above image recognition device and method use the first processing module to decode the video stream into the image frames to be recognized and use the multiple threads of the second processing module to recognize the image frames, thereby improving image recognition efficiency and enabling real-time target detection on every ultra-high-definition frame.

10: First processing module

20: Storage module

30: Second processing module

40: Communication bus

100: Image recognition device

P#1~P#N: Threads

PU#1~PU#N: Processor units

FIG. 1 is a schematic diagram of the architecture of an image recognition device according to an embodiment of the present invention.

FIG. 2 is a functional block diagram of an image recognition device according to an embodiment of the present invention.

FIG. 3 is a flowchart of an image recognition method according to an embodiment of the present invention.

Please refer to FIG. 1, which is a schematic diagram of a preferred embodiment of the image recognition device of the present invention.

The image recognition device 100 includes a first processing module 10, a storage module 20, a second processing module 30, and a communication bus 40. The first processing module 10, the storage module 20, and the second processing module 30 communicate with each other through the communication bus 40.

The storage module 20 may include at least one memory. The memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk drive, a memory, a plug-in hard disk drive, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. The communication bus 40 includes, but is not limited to, a data bus, a power bus, a control bus, a status signal bus, and the like.

The first processing module 10 is used to obtain a video stream to be processed and to preprocess the video stream to obtain an image queue arranged in frame playback order.

In one embodiment, the first processing module 10 can obtain the video stream to be processed from a device with a video recording function, a video playback device, a storage device, and the like. The preprocessing may include decoding the video stream and frame image segmentation, where frame image segmentation divides a video into multiple image frames. Since video playback works by displaying image frames in rapid succession, the first processing module 10 preferably processes the video stream into an image queue arranged in frame playback order. The first image frame of the image queue may be the first frame of the video stream, and the last image frame of the image queue may be the last frame of the video stream.
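The ordering step can be illustrated with a minimal sketch. It assumes a hypothetical decoder that tags each decoded frame with its playback position; `ImageFrame`, `index`, and `build_image_queue` are illustrative names that do not appear in the patent, and real decoding is replaced by placeholder bytes.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ImageFrame:
    index: int      # position in playback order (0 = first frame of the stream)
    pixels: bytes   # placeholder for a decoded image buffer


def build_image_queue(decoded_frames):
    """Return a FIFO queue of frames arranged in frame playback order."""
    ordered = sorted(decoded_frames, key=lambda f: f.index)
    return deque(ordered)


# Hypothetical decoder output that arrives out of playback order.
decoded = [ImageFrame(2, b"c"), ImageFrame(0, b"a"), ImageFrame(1, b"b")]
image_queue = build_image_queue(decoded)
first = image_queue[0]    # first frame of the video stream
last = image_queue[-1]    # last frame of the video stream
```

A `deque` is used here only because it gives cheap removal from the front, which matches reading the earliest frame first; any ordered container would serve.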

To facilitate subsequent recognition of the image frames, the first processing module 10 is also used to store the image queue in the storage module 20.

The video stream to be processed may be a video stream stored in the storage module 20, in which case the first processing module 10 can obtain the video stream to be processed from the storage module 20.

In one embodiment, the first processing module 10 may include one or more processors. Each processor may be a central processing unit (CPU), a microprocessor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. When the first processing module 10 includes multiple processors, it may use multiple threads to preprocess the video stream into the image queue arranged in frame playback order.

The second processing module 30 is used to read an image frame to be recognized from the storage module 20 and to recognize the image frame to detect at least one target in the image frame.

In one embodiment, the image frame to be recognized may be an image frame in the image queue on which image recognition has not yet been performed. The second processing module 30 may include at least one image recognition model for recognizing the image frame. The image recognition model may be a recognition model trained with a machine learning algorithm, such as a deep residual network algorithm or a convolutional neural network algorithm.

The target can be set according to actual usage requirements. For example, the target may be a person, a bag (handbag, backpack), and the like. When the second processing module 30 needs to recognize multiple types of targets, each type of target may correspond to one image recognition model. For example, if the second processing module 30 includes a first image recognition model used to recognize people in an image frame and a second image recognition model used to recognize bags in an image frame, the second processing module 30 can detect both people and bags in the image frame.
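The one-model-per-target-type arrangement might be sketched as a mapping from target type to model. The two "models" below are stubs that simply look up precomputed boxes; they stand in for trained networks, and every name here (`person_model`, `bag_model`, `detect_targets`, the frame dictionary layout) is an assumption made for illustration.

```python
from typing import Callable, Dict, List

# A "model" is any callable mapping a frame to a list of bounding boxes
# (x, y, width, height). Real models would be trained networks.
Box = tuple
Model = Callable[[dict], List[Box]]


def person_model(frame: dict) -> List[Box]:
    # Stub standing in for a trained person detector.
    return frame.get("people", [])


def bag_model(frame: dict) -> List[Box]:
    # Stub standing in for a trained bag detector.
    return frame.get("bags", [])


MODELS: Dict[str, Model] = {"person": person_model, "bag": bag_model}


def detect_targets(frame, models=MODELS):
    """Run every per-type model on one frame and group hits by target type."""
    return {target_type: model(frame) for target_type, model in models.items()}


frame = {"people": [(10, 20, 50, 120)], "bags": []}
detections = detect_targets(frame)
```

Keeping the models in a dictionary means a new target type can be supported by registering one more entry, without touching the detection loop.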

In one embodiment, the second processing module 30 preferably reads and recognizes the image frames sequentially according to their order in the image queue; that is, image frames nearer the front of the image queue are read and recognized by the second processing module 30 first.

In one embodiment, after the second processing module 30 reads an image frame from the storage module 20, it first adjusts the parameters of the image frame and then recognizes the adjusted image frame. The parameter adjustment may include at least pixel adjustment and/or brightness adjustment, which can improve the efficiency and accuracy of image recognition. The pixel adjustment may be, for example, reducing the pixel count of the image frame to reduce the amount of computation.
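As a rough illustration of the two adjustments, the sketch below treats a frame as a list of rows of 8-bit grayscale values. Nearest-neighbour subsampling and an additive brightness offset are assumptions chosen for brevity; the patent does not say which resampling or brightness method is used.

```python
def downscale(image, factor):
    """Keep every `factor`-th pixel in both directions (nearest-neighbour)."""
    return [row[::factor] for row in image[::factor]]


def adjust_brightness(image, offset):
    """Add `offset` to every pixel, clamped to the 8-bit range 0..255."""
    return [[max(0, min(255, p + offset)) for p in row] for row in image]


def adjust_parameters(image, factor=2, offset=0):
    """Reduce the pixel count first, then adjust brightness on the smaller image."""
    return adjust_brightness(downscale(image, factor), offset)


# 4x4 grayscale frame (rows of 8-bit intensities).
frame = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 250],
]
adjusted = adjust_parameters(frame, factor=2, offset=20)
```

Downscaling by a factor of 2 leaves a quarter of the pixels, so any per-pixel work in the recognition step shrinks accordingly.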

In one embodiment, the second processing module 30 is also used to store the object category information of the recognized target in the image frame as tag information of the target.
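One way to picture this is a frame record that carries its own list of tags. The `ImageFrame` class, the `labels` field, and the dictionary layout of each tag are hypothetical; the patent only states that category information is stored in the frame as tag information.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ImageFrame:
    index: int
    # Tag information: one entry per detected target.
    labels: List[dict] = field(default_factory=list)


def store_labels(frame: ImageFrame, detections: List[Tuple[str, tuple]]) -> ImageFrame:
    """Attach the object category of every detected target to the frame itself."""
    for category, box in detections:
        frame.labels.append({"category": category, "box": box})
    return frame


frame = store_labels(
    ImageFrame(index=0),
    [("person", (10, 20, 50, 120)), ("bag", (40, 90, 30, 30))],
)
categories = [label["category"] for label in frame.labels]
```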

Please also refer to FIG. 2. The second processing module 30 may be a processor cluster that supports hardware expansion. The processor cluster includes multiple threads P#1~P#N and multiple processor units PU#1~PU#N, where N is a natural number greater than 1. Each thread corresponds one-to-one to a processor unit. For example, thread P#1 reads one frame to be recognized from the storage module 20 at a time and transmits it to processor unit PU#1, which recognizes the at least one target. Each of the processor units PU#1~PU#N includes at least one image recognition model. Through the cooperation of the threads P#1~P#N and the processor units PU#1~PU#N, the second processing module 30 can greatly increase image recognition speed and perform real-time target detection on ultra-high-definition image frames.
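A back-of-the-envelope calculation suggests why adding processor units helps. Assuming frames are spread evenly over N units, every frame takes about the same time to recognize, and read and transfer overhead is ignored (none of which the patent states), each unit only has to finish one frame in the time N frames arrive.

```python
def per_frame_budget_ms(fps, units):
    """Longest time one unit may spend per frame while `units` units keep pace with the stream."""
    return units * 1000.0 / fps


single = per_frame_budget_ms(fps=30, units=1)   # one unit handling a 30 FPS stream alone
eight = per_frame_budget_ms(fps=30, units=8)    # eight units sharing the same stream
```

Under these assumptions a single unit has roughly 33 ms per frame at 30 FPS, while each of eight units has roughly 267 ms. The trade-off is latency: a frame's result may become available a few frame intervals after the frame was read.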

In one embodiment, because an ultra-high-definition image frame contains a large amount of data, thread P#1 reads only one frame to be recognized from the storage module 20 at a time, so that an overly large read does not slow recognition, and then transmits the frame to processor unit PU#1 for recognition. After processor unit PU#1 finishes recognizing that frame, thread P#1 reads the next frame to be recognized.

It can be understood that when the storage module 20 holds unrecognized image frames, the threads P#1~P#N read those frames in parallel. For example, suppose the processor cluster includes eight threads P#1~P#8 and eight processor units PU#1~PU#8. Thread P#1 reads the first image frame of the image queue and transmits it to processor unit PU#1 for recognition, thread P#2 reads the second image frame and transmits it to processor unit PU#2 for recognition, and, likewise, thread P#8 reads the eighth image frame and transmits it to processor unit PU#8 for recognition. After processor unit PU#1 finishes recognizing the first image frame, thread P#1 reads the ninth image frame of the image queue and transmits it to processor unit PU#1 for recognition.
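The read-one-frame, recognize, read-the-next loop described above can be sketched with standard-library threads. This only models the control flow: the processor units are represented by a caller-supplied `recognize` function, `run_cluster` is a hypothetical name, and Python threads will not give the hardware parallelism of a real processor cluster.

```python
import queue
import threading


def run_cluster(image_queue, recognize, num_units=8):
    """Each thread repeatedly takes ONE frame, hands it to its unit, then reads the next."""
    pending = queue.Queue()
    for index, frame in enumerate(image_queue):
        pending.put((index, frame))          # FIFO: earlier frames are read first

    results = {}
    lock = threading.Lock()

    def worker(unit_id):
        while True:
            try:
                index, frame = pending.get_nowait()   # read a single frame
            except queue.Empty:
                return                                # nothing left to recognize
            outcome = recognize(unit_id, frame)       # the "processor unit" does the work
            with lock:
                results[index] = outcome

    threads = [threading.Thread(target=worker, args=(u,)) for u in range(1, num_units + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [results[i] for i in range(len(results))]  # back in playback order


frames = [f"frame-{n}" for n in range(1, 21)]
recognized = run_cluster(frames, lambda unit, frame: frame.upper())
```

Because a thread only asks for another frame once its unit has finished, a slow unit never accumulates a backlog, and whichever unit frees up first picks up the next frame in the queue. Results are keyed by frame index so they can be returned in playback order even if units finish out of order.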

In one embodiment, each of the processor units PU#1~PU#N may be a central processing unit (CPU), a microprocessor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like.

FIG. 3 is a flowchart of an image recognition method in an embodiment of the present invention. Depending on requirements, the order of the steps in the flowchart can be changed, and some steps can be omitted.

Step S300: use the first processing module 10 to obtain a video stream to be processed, and preprocess the video stream to obtain an image queue arranged in frame playback order.

Step S302: store the image queue in the storage module 20.

Step S304: use the second processing module 30 to read an image frame to be recognized from the storage module 20, and recognize the image frame to detect at least one target in the image frame.

In one embodiment, the second processing module 30 preferably reads the image frames sequentially according to their order in the image queue.
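Steps S300 to S304 can be strung together in a small single-threaded sketch. `StorageModule`, `preprocess`, and `recognize_all` are illustrative names; decoding is replaced by a placeholder that assumes the stream is already an iterable of frames, and the recognizer is passed in by the caller.

```python
class StorageModule:
    """In-memory stand-in for storage module 20."""

    def __init__(self):
        self._frames = []
        self._next = 0            # index of the next frame not yet recognized

    def store_queue(self, frames):          # step S302
        self._frames.extend(frames)

    def read_next(self):                    # sequential read used by step S304
        if self._next >= len(self._frames):
            return None
        frame = self._frames[self._next]
        self._next += 1
        return frame


def preprocess(video_stream):               # step S300
    # Placeholder "decoding": the stream is assumed to already yield frames in order.
    return list(video_stream)


def recognize_all(storage, recognize):      # step S304
    results = []
    while (frame := storage.read_next()) is not None:
        results.append(recognize(frame))
    return results


storage = StorageModule()
storage.store_queue(preprocess(["f1", "f2", "f3"]))
results = recognize_all(storage, lambda frame: (frame, ["person"]))
```

The cursor inside `StorageModule` is what enforces reading in queue order: a frame is handed out at most once, and always after every frame that precedes it.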

The above image recognition device and method use the first processing module to decode the video stream into the image frames to be recognized and use the multiple threads of the second processing module to recognize the image frames, thereby improving image recognition efficiency and enabling real-time target detection on every ultra-high-definition frame.

In summary, the present invention meets the requirements for an invention patent, and a patent application is filed in accordance with the law. However, the above describes only preferred embodiments of the present invention, and the scope of the present invention is not limited to these embodiments. All equivalent modifications or changes made by persons familiar with the art of this case in accordance with the spirit of the present invention shall be covered by the scope of the following claims.

Claims

An image recognition method applied to an image recognition device, the image recognition device including a first processing module, a storage module, and a second processing module, the method including: using the first processing module to obtain a video stream to be processed, and preprocessing the video stream to obtain an image queue arranged in frame playback order; storing the image queue in the storage module; and using the second processing module to read an image frame to be recognized from the storage module, and recognizing the image frame to detect at least one target in the image frame; wherein the second processing module reads the image frames sequentially according to their order in the image queue, the second processing module includes a plurality of threads and a plurality of processor units, and each thread corresponds one-to-one to a processor unit; and wherein the step of using the second processing module to read the image frame to be recognized from the storage module and recognizing the image frame includes: controlling the thread to read one frame of the image to be recognized from the storage module at a time and transmit it to the processor unit; and using the processor unit to recognize the read frame.

The method according to claim 1, wherein the step of using the first processing module to obtain a video stream to be processed includes: using the first processing module to read the video stream to be processed from the storage module.

The method according to claim 1, wherein each of the processor units includes at least one image recognition model, and each of the image recognition models is used to recognize one type of target.

The method according to claim 1, wherein the step of using the second processing module to recognize the image frame to detect at least one target in the image frame includes: using the second processing module to recognize the image frame to obtain object category information of the at least one target; and storing the object category information in the image frame as tag information of the target.

The method according to claim 1, wherein the step of using the second processing module to recognize the image frame includes: using the second processing module to adjust the parameters of the image frame, and then recognizing the image frame after the parameter adjustment; wherein the parameter adjustment includes at least pixel adjustment and/or brightness adjustment.

An image recognition device including a first processing module, a storage module, a second processing module, and a communication bus, wherein the first processing module, the storage module, and the second processing module communicate with each other through the communication bus; the first processing module is used to obtain a video stream to be processed, and to preprocess the video stream to obtain an image queue arranged in frame playback order; the first processing module is also used to store the image queue in the storage module; the second processing module is used to read an image frame to be recognized from the storage module, and to recognize the image frame to detect at least one target in the image frame; wherein the second processing module reads the image frames sequentially according to their order in the image queue, the second processing module includes a plurality of threads and a plurality of processor units, each thread corresponds one-to-one to a processor unit, and each thread is used to read one frame of the image to be recognized from the storage module at a time and transmit it to the corresponding processor unit.

The device according to claim 6, wherein the first processing module is used to read the video stream to be processed from the storage module.

The device according to claim 6, wherein each of the processor units includes an image recognition model to recognize the at least one target.

The device according to claim 6, wherein the second processing module is further configured to store the recognized object type information of the target as the tag information of the target in the image frame.

The device according to claim 6, wherein the second processing module is further configured to adjust the parameters of the image frame to be recognized and to recognize the image frame after the parameter adjustment, wherein the parameter adjustment includes at least pixel adjustment and/or brightness adjustment.