TW201512996A

TW201512996A - Method for recognizing images in real time in mobile terminal and mobile terminal thereof

Info

Publication number: TW201512996A
Application number: TW103128990A
Authority: TW
Inventors: Xiao Liu; Jian Ding; Hai-long Liu; Bo Chen
Original assignee: Tencent Tech Shenzhen Co Ltd
Priority date: 2013-09-18
Filing date: 2014-08-22
Publication date: 2015-04-01
Also published as: JP6026680B1; HK1200623A1; CN104144345B; CN104144345A; TWI522930B; WO2015039575A1; SA114350742B1; JP2016537692A

Abstract

The present invention provides a method for recognizing images in real time in a mobile terminal, and a mobile terminal thereof. The method comprises: using a camera of the mobile terminal to collect data in real time to acquire a video frame; performing motion estimation on the video frame to determine its motion state; determining whether the motion state of the video frame is motion-to-still and, if so, determining that the video frame is a clear frame image and uploading the clear frame image to a cloud server; and receiving a recognition result returned by the cloud server and displaying it. The present invention saves network traffic and returns recognition results effectively.

Description

Method for real-time image recognition in a mobile terminal, and mobile terminal thereof

The present invention relates to image processing and recognition technology, and in particular to a method for real-time image recognition in a mobile terminal and to the mobile terminal itself.

A typical scheme for real-time image recognition in a mobile terminal works as follows: the terminal's camera captures a video frame of the target and sends it to a cloud server; the cloud server recognizes the received video frame, determines the corresponding description information, and returns it to the mobile terminal for display.

For example, data can be collected from objects such as book covers, CD covers, movie posters, barcodes, two-dimensional barcodes, and product logos. After receiving the video frame, the cloud server returns the related description information, which may include purchase information and reviews for the item. With this approach the result is available as soon as the picture is captured, which is very fast.

Conventional mobile terminals collect and send data in one of two ways, each described below.

Method 1: Aim the mobile terminal's camera at the target, take a photo, and send the resulting video frame to the cloud server.

This method has the following drawbacks: the user must aim and then operate the device manually, which is inconvenient. Moreover, if the camera is not aimed properly or shakes, the cloud server cannot recognize the image, and the mobile terminal cannot obtain the description information for the target.

Method 2: No photo is taken. Instead, the whole picture captured by the camera is collected in real time, and the collected image data is sent to the cloud server.

Although this method requires no manual shooting and is easier to operate, it has the following drawbacks: because the collected video frames are sent to the cloud server in real time, the data traffic is large. Moreover, some of the collected video frames are not clear, so the cloud server cannot recognize them and cannot return a valid recognition result.

Thus, conventional methods for real-time image recognition in a mobile terminal suffer from heavy data traffic and cannot reliably return valid recognition results.

The present invention provides a method for real-time image recognition in a mobile terminal, and a mobile terminal thereof, so as to save network traffic and return recognition results effectively.

The present invention provides a method for real-time image recognition in a mobile terminal, comprising the steps of: collecting data in real time with the camera of the mobile terminal to acquire a video frame; performing motion estimation on the video frame to determine the video frame motion state; determining whether the video frame motion state is motion-to-still and, if so, determining that the frame is a clear frame image and uploading the clear frame image to a cloud server; and receiving the recognition result returned by the cloud server and displaying the recognition result.

In another aspect, the present invention provides a mobile terminal for real-time image recognition, comprising a data collection unit, a motion estimation unit, a clear frame determination unit, and a recognition result display unit, wherein: the data collection unit collects data in real time with the camera of the mobile terminal to acquire a video frame and sends it to the motion estimation unit; the motion estimation unit performs motion estimation on the video frame to determine the video frame motion state and sends it to the clear frame determination unit; the clear frame determination unit determines whether the video frame motion state is motion-to-still and, if so, determines that the frame is a clear frame image and uploads the clear frame image to a cloud server; and the recognition result display unit receives the recognition result returned by the cloud server and displays the recognition result.

As the above technical solution shows, the present invention performs motion estimation on each collected video frame to determine its motion state and, when the motion state is judged to be motion-to-still, determines that the frame is a clear frame image and uploads it to the cloud server. Because the camera collects data actively, the user does not need to take a photo manually, which simplifies operation. Because only clear frame images are sent to the cloud server, rather than every collected video frame in real time, traffic is saved. And because the cloud server returns its recognition result based on a clear frame image, the recognition result is more effective.

101‧‧‧Collect data in real time with the camera of the mobile terminal to acquire a video frame

102‧‧‧Perform motion estimation on the video frame to determine the video frame motion state

103‧‧‧Determine whether the video frame motion state is motion-to-still; if so, determine that the frame is a clear frame image and upload the clear frame image to the cloud server

104‧‧‧Receive the recognition result returned by the cloud server and display the recognition result

201‧‧‧Collect data in real time with the camera of the mobile terminal to acquire a video frame

202‧‧‧Perform motion estimation on the video frame to determine the video frame motion state

203‧‧‧Determine whether the video frame motion state is motion-to-still; if so, go to step 204; otherwise, end the flow

204‧‧‧Calculate the number of corner features in the video frame to be examined

205‧‧‧Determine whether the number of corner features is greater than a corner-count threshold; if so, determine that the frame is a clear frame image and upload the clear frame image to the cloud server; otherwise, determine that it is a blurred frame image

206‧‧‧Receive the recognition result returned by the cloud server and display the recognition result

301‧‧‧Acquire the central-region pixels of the video frame to be processed and store them

302‧‧‧Acquire the central-region pixels of the video frame preceding the video frame to be processed

303‧‧‧Starting from the central region of the video frame to be processed, search its surroundings for a region similar to the central-region pixels of the previous video frame, to determine a matching block

304‧‧‧Calculate the position vector between the central region of the video frame to be processed and the matching block, as the motion vector

305‧‧‧Determine the video frame motion state from the motion vector

306‧‧‧Continue motion estimation?

501‧‧‧Data collection unit

502‧‧‧Motion estimation unit

503‧‧‧Clear frame determination unit

504‧‧‧Recognition result display unit

505‧‧‧Motion vector calculation subunit

506‧‧‧State determination subunit

507‧‧‧State determination module

508‧‧‧Motion vector determination module

509‧‧‧Motion-to-still determination module

510‧‧‧Corner detection module

FIG. 1 is a schematic flow chart of the method of the present invention for real-time image recognition in a mobile terminal.

FIG. 2 is an example flow chart of the method of the present invention for real-time image recognition in a mobile terminal.

FIG. 3 is an example flow chart of the motion estimation method of the present invention.

FIG. 4 is an example schematic diagram of block matching according to the present invention.

FIG. 5 is a schematic structural diagram of the mobile terminal of the present invention for real-time image recognition.

The following description of the embodiments refers to the accompanying drawings and illustrates specific embodiments in which the present invention may be practiced.

While developing the present invention, the inventors found that in practice a user first opens the camera, then moves it to aim at the target, and the camera then collects data; this is a motion-to-still process. Accordingly, the present invention judges the motion state of each collected video frame and, when the video frame motion state is found to be motion-to-still, determines that the frame is a clear frame image and uploads it to the cloud server. In this way only clear frame images are sent to the cloud server, which saves traffic. And because the cloud server returns its recognition result based on a clear frame image, the recognition result is more effective.

FIG. 1 is a schematic flow chart of the method of the present invention for real-time image recognition in a mobile terminal. The method includes the following steps.

Step 101: Collect data in real time with the camera of the mobile terminal to acquire a video frame.

Step 102: Perform motion estimation on the video frame to determine the video frame motion state.

The mobile terminal's camera captures the picture frame by frame, and motion estimation is performed on a video frame acquired in real time to determine that frame's motion state.

Motion estimation is used mostly in video coding. The present invention applies it to the video frames collected by the camera of the mobile terminal in order to determine the motion state of each video frame. Specifically, the video frame motion state can be determined from a motion vector, as follows: calculate the motion vector between the video frame and the previous video frame, the motion vector comprising a motion magnitude and a motion direction; then determine the video frame motion state from the motion vector.

The motion vector between the video frame and the previous video frame can be calculated by motion estimation as follows: acquire the central-region pixels of the previous video frame; starting from the central region of the current video frame, search its surroundings for a region similar to the central-region pixels of the previous video frame, to determine a matching block; and take the position vector between the central region of the current video frame and the matching block as the motion vector.

The motion states include moving, still, motion-to-still, and still-to-motion. There are many ways to determine the video frame motion state from the motion vector, and the choice can be made according to actual needs; an example follows. Determining the video frame motion state from the motion vector includes: reading the stored background motion state; if the background motion state is still and the motion magnitudes of N consecutive frames starting from the current frame are all greater than a first motion threshold, where N is a natural number and the current frame is frame 1, then the motion states of frames 1 through N+1 are still and the background motion state remains still, the motion state of frame N+1 is determined to be still-to-motion, and the background motion state is changed to moving; if the background motion state is still and the motion magnitude of the current frame is less than the first motion threshold, the motion state of the current frame remains still and the background motion state remains still; if the background motion state is moving and the motion magnitudes of N consecutive frames starting from the current frame are all less than a second motion threshold, where N is a natural number and the current frame is frame 1, then the motion states of frames 1 through N+1 are moving and the background motion state remains moving, the motion state of frame N+1 is determined to be motion-to-still, and the background motion state is changed to still; and if the background motion state is moving and the motion magnitude of the current frame is greater than the second motion threshold, the motion state of the current frame remains moving and the background motion state remains moving.

Further, after it is determined that the background motion state is still and the motion magnitude of the current frame is less than the first motion threshold, the method also includes: judging whether the motion magnitude is greater than a third motion threshold; if so, the motion of the current frame is a micro-motion and the background motion state remains still. If the motions of M consecutive frames starting from the current frame (taken as frame 1) are all micro-motions in the same direction, the motion state of frame M is determined to be still-to-motion and the background motion state is changed to moving, where M is a natural number.

When the background motion state is still, if the motion magnitudes show that the two consecutive frames following the previous video frame both exceed the first motion threshold (S1), and the motion directions show that these two frames move in opposite directions, the situation is judged to be jitter, and the motion states of both frames are still determined to be still.

If the motion magnitudes show that the two consecutive frames following the previous video frame both exceed S1, and the motion directions show that these two frames move in the same direction, the more recent of the two frames is determined to be in the still-to-motion state.
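The background-state rules above can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `MotionStateTracker`, the state strings, and the `hold` parameter (how many consecutive threshold crossings trigger a switch) are hypothetical, and the micro-motion and jitter rules are left out for brevity.

```python
from dataclasses import dataclass

STILL = "still"
MOVING = "moving"
STILL_TO_MOTION = "still_to_motion"
MOTION_TO_STILL = "motion_to_still"


@dataclass
class MotionStateTracker:
    s1: float                 # first motion threshold (leaving "still")
    s2: float                 # second motion threshold (leaving "moving")
    hold: int = 2             # assumed: consecutive crossings needed to switch
    background: str = STILL   # stored background motion state
    streak: int = 0           # consecutive crossings seen so far

    def update(self, magnitude: float) -> str:
        """Return the state of the current frame and update the background."""
        if self.background == STILL:
            crossing = magnitude > self.s1
        else:
            crossing = magnitude < self.s2

        if not crossing:
            # No evidence of a change: keep the background state.
            self.streak = 0
            return self.background

        self.streak += 1
        if self.streak < self.hold:
            # A single crossing is not enough; the state is held.
            return self.background

        # Enough consecutive crossings: report the transition and switch.
        self.streak = 0
        if self.background == STILL:
            self.background = MOVING
            return STILL_TO_MOTION
        self.background = STILL
        return MOTION_TO_STILL
```

With `hold=2`, one frame that crosses a threshold leaves the state unchanged, and the second consecutive crossing reports the transition. A frame reported as `motion_to_still` is the candidate for upload.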

Step 103: Determine whether the video frame motion state is motion-to-still; if so, determine that the frame is a clear frame image and upload the clear frame image to the cloud server.

If the video frame motion state is judged not to be motion-to-still, no video frame is uploaded to the cloud server.

Further, to make the clear-frame judgment more accurate, corner detection can be performed after the video frame motion state is determined to be motion-to-still, as follows: calculate the number of corner features in the video frame; and determine whether the number of corner features is greater than a corner-count threshold; if so, the frame is determined to be a clear frame image; otherwise, it is determined to be a blurred frame image.
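The text above does not say which corner detector is used. As one possible illustration, the sketch below counts pixels with a high Moravec-style corner score (without non-maximum suppression) and compares the count with a threshold. The function names, the 3x3 window, and both threshold values are assumptions chosen for the example.

```python
def count_corners(gray, window=1, threshold=500):
    """Count pixels whose Moravec-style corner score exceeds `threshold`.

    `gray` is a list of rows of gray values. The score of a pixel is the
    minimum, over four shift directions, of the sum of squared differences
    between its window and the shifted window. Flat areas and straight
    edges score near zero; corners score high in every direction.
    """
    height, width = len(gray), len(gray[0])
    shifts = [(0, 1), (1, 0), (1, 1), (1, -1)]   # (dy, dx)
    margin = window + 1                          # keep shifted windows in bounds
    count = 0
    for y in range(margin, height - margin):
        for x in range(margin, width - margin):
            score = min(
                sum(
                    (gray[y + j + dy][x + i + dx] - gray[y + j][x + i]) ** 2
                    for j in range(-window, window + 1)
                    for i in range(-window, window + 1)
                )
                for dy, dx in shifts
            )
            if score > threshold:
                count += 1
    return count


def is_clear_frame(gray, min_corners=4):
    """Assumed rule: a frame is clear if it has more than `min_corners` corners."""
    return count_corners(gray) > min_corners
```

A sharp frame with many high-contrast junctions yields a large count, while a defocused or motion-blurred frame yields few or none, which is why the count can serve as a clarity check.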

Usually, the clear frame image is uploaded to the cloud server once the video frame motion state is judged to be motion-to-still and the frame is determined to be a clear frame image. In some application environments, the upload timing can instead be determined from several consecutive video frames all being still. Specifically, with the current frame taken as frame 1, if frames 1 through N+1 are all judged to be still, frame N+1 is determined to be a clear frame and its image is uploaded to the cloud server, where N is a natural number.

Step 104: Receive the recognition result returned by the cloud server and display the recognition result.

After receiving the video frame, the cloud server returns the related description information, which may include purchase information and reviews for the item.

In the present invention, motion estimation is performed on each collected video frame to determine its motion state and, when the motion state is judged to be motion-to-still, the frame is determined to be a clear frame image and uploaded to the cloud server. Because the camera collects data actively, the user does not need to take a photo manually, which simplifies operation. Because only clear frame images are sent to the cloud server, rather than every collected video frame in real time, traffic is saved. And because the cloud server returns its recognition result based on a clear frame image, the recognition result is more effective.

The method of the present invention for real-time image recognition in a mobile terminal is illustrated below by the example of FIG. 2, which includes the following steps.

Step 201: Collect data in real time with the camera of the mobile terminal to acquire a video frame.

Step 202: Perform motion estimation on the video frame to determine the video frame motion state.

For convenience, the video frame on which motion estimation is performed is referred to below as the video frame to be processed.

In the present invention, the motion estimation concept conventionally used in video coding is carried over to the processing of images from a mobile terminal's camera. Video and the image sequences captured by a mobile terminal's camera share the same correlation between consecutive images, so motion estimation algorithms apply to both. There are differences, however. For example, images captured by a mobile terminal's camera tend to have lower resolution, and the terminal does not move very much in actual use. More importantly, video coding uses motion estimation over the whole frame, which is very slow and often cannot run in real time even on a PC. In view of these differences, the present invention modifies the motion estimation algorithm used in video coding so that it runs very efficiently on a wide range of mobile terminals while consuming little CPU, to the point that the CPU cost is essentially negligible.

FIG. 3 is an example flow chart of the motion estimation method of the present invention, which includes the following steps.

Step 301: Acquire the central-region pixels of the video frame to be processed and store them.

Step 302: Acquire the central-region pixels of the video frame preceding the video frame to be processed.

Each time the mobile terminal collects a video frame, it stores the central-region pixels of that frame; specifically, it stores the pixel gray values of the central region. In this step, the stored central-region pixel gray values of the frame immediately preceding the video frame to be processed are retrieved.

Step 303: Starting from the central region of the video frame to be processed, search its surroundings for a region similar to the central-region pixels of the previous video frame, to determine a matching block.

The method of determining the matching block is described in detail below with reference to FIG. 4. In FIG. 4, the gridded square region in the previous video frame is that frame's central region, and the dashed region in the video frame to be processed is its central region. A limited neighborhood around the dashed box is searched from the inside outward to find a region whose pixel gray values are similar to those of the previous frame's central region. This region is called the matching block; the gridded square region in the video frame to be processed is the matching block found by the search.

In this example, the pixel gray level at (x, y) in the central region of the previous video frame is denoted I(x, y), and that of the search block in the video frame to be processed, which is matched against the previous frame's central region, is denoted I'(x, y). The sum of squared differences between the two is used as the measure of block similarity. Assuming a block size of N by N pixels, the sum of squared errors S is:

$$S = \sum_{x=1}^{N} \sum_{y=1}^{N} \left[ I'(x, y) - I(x, y) \right]^2$$

The block with the smallest S according to this formula is taken as the matching block. The motion vector between the two frames is determined from the position of the matching block relative to the central region of the previous video frame; the arrow in FIG. 4 marks the motion direction. The search uses a successive approximation algorithm: it first moves in large steps to find a region where the similarity difference is relatively small, then reduces the step size within that region and approaches the final search result step by step. To keep the algorithm fast, a video frame whose resolution exceeds a certain threshold can first be downsampled, for example from 2000 by 2000 to 400 by 400. FIG. 4 shows the matching block as a rectangular region, but in practice blocks of other shapes, such as diamond or circular blocks, can also be used for matching.
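The block search described above can be sketched as follows. This is a hedged illustration, not the patented implementation: the function names `ssd` and `find_motion_vector`, the block size, the search radius, and the initial step are all assumptions, and frames are plain lists of rows of gray values. The search first scans a coarse grid of offsets, then halves the step and rescans around the best offset found so far.

```python
def ssd(prev_block, frame, top, left):
    """Sum of squared differences between prev_block and the same-sized
    block of `frame` whose top-left corner is at (top, left)."""
    total = 0
    for r, row in enumerate(prev_block):
        for c, value in enumerate(row):
            diff = frame[top + r][left + c] - value
            total += diff * diff
    return total


def find_motion_vector(prev_frame, frame, block=4, radius=4, step=2):
    """Locate the central block of prev_frame inside frame.

    Returns (dx, dy): the offset of the best-matching block from the
    center of `frame`, found by a coarse-to-fine search.
    """
    height, width = len(frame), len(frame[0])
    top0, left0 = (height - block) // 2, (width - block) // 2
    prev_block = [row[left0:left0 + block]
                  for row in prev_frame[top0:top0 + block]]

    best = (ssd(prev_block, frame, top0, left0), 0, 0)   # (score, dy, dx)
    span = radius                  # how far to look around the current best
    while step >= 1:
        _, cy, cx = best
        for dy in range(cy - span, cy + span + 1, step):
            for dx in range(cx - span, cx + span + 1, step):
                if abs(dy) > radius or abs(dx) > radius:
                    continue
                top, left = top0 + dy, left0 + dx
                if (top < 0 or left < 0
                        or top + block > height or left + block > width):
                    continue
                score = ssd(prev_block, frame, top, left)
                if score < best[0]:
                    best = (score, dy, dx)
        span = step                # next pass: refine within one coarse step
        step //= 2
    _, dy, dx = best
    return dx, dy
```

The motion magnitude can then be taken as the length of `(dx, dy)` and the motion direction as its angle. Like any coarse-to-fine search, this one can settle on a local minimum when the image content is periodic or noisy.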

In motion estimation, besides the sum-of-squared-errors similarity measure described above, the mean squared error, the sum of absolute errors, the mean error sum, and other similarity measures can also be used. Likewise, besides the successive approximation search, other search algorithms such as the three-step search and the diamond search can be used in practice.

Step 304: Calculate the position vector between the central region of the video frame to be processed and the matching block, as the motion vector.

The calculated motion vector includes a motion direction and a motion magnitude.

Step 305: Determine the video frame motion state from the motion vector.

In the present invention, the video frame motion states include but are not limited to the following four: moving, still, motion-to-still, and still-to-motion. The motion-to-still state is taken as the moment to upload an image.

In practice, different magnitude thresholds are needed for the motion-to-still and still-to-motion transitions. In image recognition applications, the magnitude threshold for motion-to-still is usually higher and is denoted the second motion threshold, while the magnitude threshold for still-to-motion is lower and is denoted the first motion threshold; the first motion threshold is smaller than the second motion threshold.

The mobile terminal stores a background motion state, which can be read back from storage. The motion state of the video frame to be processed is then determined from the background motion state, the first motion threshold, and the second motion threshold, as follows: read the stored background motion state; if the background motion state is still and the motion magnitudes of N consecutive frames starting from the current frame are all greater than the first motion threshold, where N is a natural number and the current frame is frame 1, then the motion states of frames 1 through N+1 are still and the background motion state remains still, the motion state of frame N+1 is determined to be still-to-motion, and the background motion state is changed to moving; if the background motion state is still and the motion magnitude of the current frame is less than the first motion threshold, the motion state of the current frame remains still and the background motion state remains still; if the background motion state is moving and the motion magnitudes of N consecutive frames starting from the current frame are all less than the second motion threshold, where N is a natural number and the current frame is frame 1, then the motion states of frames 1 through N+1 are moving and the background motion state remains moving, the motion state of frame N+1 is determined to be motion-to-still, and the background motion state is changed to still; and if the background motion state is moving and the motion magnitude of the current frame is greater than the second motion threshold, the motion state of the current frame remains moving and the background motion state remains moving.

After it is determined as above that the background motion state is still and the motion magnitude of the current frame is less than the first motion threshold, the method also includes: judging whether the motion magnitude is greater than a third motion threshold; if so, the motion of the current frame is a micro-motion and the background motion state remains still. If the motions of M consecutive frames starting from the current frame (taken as frame 1) are all micro-motions in the same direction, the motion state of frame M is determined to be still-to-motion and the background motion state is changed to moving, where M is a natural number.
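The micro-motion rule can be sketched as a scan over per-frame motion vectors. This is an illustrative sketch only: the function names are hypothetical, and treating two vectors as being "in the same direction" when their dot product is positive is an assumption, since the text does not define the comparison.

```python
import math


def same_direction(v1, v2):
    """Assumed test: two vectors point the same way if their dot product > 0."""
    return v1[0] * v2[0] + v1[1] * v2[1] > 0


def find_micro_motion_switch(vectors, s1, s3, m=5):
    """Return the index of the frame that completes M consecutive
    same-direction micro-motions (s3 < magnitude < s1), or None.

    `vectors` is a sequence of (dx, dy) motion vectors observed while the
    background motion state is still.
    """
    streak = 0
    previous = None
    for index, (dx, dy) in enumerate(vectors):
        magnitude = math.hypot(dx, dy)
        if s3 < magnitude < s1:
            if previous is not None and same_direction(previous, (dx, dy)):
                streak += 1
            else:
                streak = 1          # first micro-motion, or direction changed
            previous = (dx, dy)
        else:
            streak, previous = 0, None
        if streak >= m:
            return index            # this frame would be still-to-motion
    return None
```

A slow, steady drift of the camera therefore triggers a switch after `m` frames, while small back-and-forth tremors keep resetting the streak and leave the background state still.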

In this example, a "state hold" strategy is adopted: an occasional single stationary or moving frame does not trigger a state switch, and the state is switched only after two or more state changes have accumulated. This strategy keeps the state stable. Let S1 denote the first motion threshold, S2 the second motion threshold, S3 the third motion threshold, and S the motion amplitude of the to-be-processed video frame. Assume that two accumulated state changes are normally required before the state is switched, while five accumulated state changes are required for micro-motion. The corresponding "state hold" strategy is then as follows.

(I) When the background motion state is stationary:

(1) When S > S1, the to-be-processed video frame (denoted frame Y) is determined to be stationary and the background motion state remains stationary. It is then determined whether the motion amplitude of frame Y+1 is still greater than S1; if so, frame Y+1 is determined to be stationary-to-motion and the background motion state is changed to motion.

(2) When S < S1, the to-be-processed video frame is determined to be stationary and the background motion state remains stationary.

(3) When S3 < S < S1, the to-be-processed video frame (denoted frame Z) is determined to be a micro-motion. If frames Z through Z+3 are all judged to be micro-motions in the same direction, frames Z through Z+3 are nevertheless still determined to be stationary; if frame Z+4 is also a micro-motion in the same direction, frame Z+4 is determined to be stationary-to-motion and the background motion state is changed to motion. The number of accumulated frames can be set as needed.

(II) When the background motion state is motion:

(1) When S < S2, the to-be-processed video frame (denoted frame Y) is determined to be in motion and the background motion state remains motion. It is then determined whether the motion amplitude of frame Y+1 is still less than S2; if so, frame Y+1 is determined to be motion-to-stationary and the background motion state is changed to stationary.

(2) When S > S2, the to-be-processed video frame is determined to be in motion and the background motion state remains motion.

Furthermore, hand shake can also be detected. If the motion swings "now left, now right", that is, the motion direction reverses, the situation is judged to be hand shake. In this case, if the background state is stationary, the motion state is left unchanged until consecutive motions in the same direction occur.
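The "state hold" strategy and the hand-shake rule described above can be sketched as a small state machine. This is a minimal illustration rather than the patented implementation: the class name `MotionStateTracker`, the default threshold values, and the use of a dot product to decide whether two motion vectors point in the "same direction" are all assumptions made for the example.

```python
import math

STATIONARY = "stationary"
MOTION = "motion"
STATIONARY_TO_MOTION = "stationary_to_motion"
MOTION_TO_STATIONARY = "motion_to_stationary"


class MotionStateTracker:
    """Hypothetical tracker for the background motion state (illustrative only)."""

    def __init__(self, s1=8.0, s2=2.0, s3=1.0, switch_count=2, micro_count=5):
        # s1, s2, s3 play the roles of the first, second and third motion
        # thresholds; the default numbers are assumptions, not patent values.
        self.s1, self.s2, self.s3 = s1, s2, s3
        self.switch_count = switch_count  # accumulations needed normally
        self.micro_count = micro_count    # accumulations needed for micro-motion
        self.background = STATIONARY      # the camera starts out stationary
        self._reset()

    def _reset(self):
        self._count = 0
        self._kind = None      # "large" or "micro" streak while stationary
        self._last_vec = None  # previous motion vector of the current streak

    def _same_direction(self, vec):
        # Assumption: a positive dot product means "same direction".
        last = self._last_vec
        return last is not None and vec[0] * last[0] + vec[1] * last[1] > 0

    def update(self, dx, dy):
        """Feed one motion vector and return the motion state of that frame."""
        s = math.hypot(dx, dy)  # motion amplitude S
        if self.background == STATIONARY:
            if s > self.s1:
                kind, needed = "large", self.switch_count
            elif s > self.s3:
                kind, needed = "micro", self.micro_count
            else:
                self._reset()
                return STATIONARY
            if kind == self._kind and self._same_direction((dx, dy)):
                self._count += 1
            else:
                # New streak: the kind changed or the direction reversed
                # (hand shake), so accumulation starts over.
                self._count = 1
            self._kind, self._last_vec = kind, (dx, dy)
            if self._count >= needed:
                self._reset()
                self.background = MOTION
                return STATIONARY_TO_MOTION
            return STATIONARY
        # Background motion state is motion.
        if s < self.s2:
            self._count += 1
            if self._count >= self.switch_count:
                self._reset()
                self.background = STATIONARY
                return MOTION_TO_STATIONARY
        else:
            self._count = 0
        return MOTION
```

With the default settings, two consecutive large motions in the same direction switch the background to motion, while a left-right-left sequence keeps restarting the count, so the frame stays stationary until the direction stops reversing.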

Step 306: determine whether to continue motion estimation. If so, return to step 301 and continue; otherwise, end the process.

If video frames continue to be acquired in step 201, proceed to step 301 and continue motion estimation on the acquired video frames.

Step 203: determine whether the motion state of the video frame is motion-to-stationary. If so, perform step 204; otherwise, end the process.

When the camera has just been opened, the state may default to stationary. The user then moves the camera toward the target, a process that passes through stationary-to-motion, motion, and motion-to-stationary.

When the motion state of a video frame is determined to be motion-to-stationary, that video frame is taken as the video frame to be detected.

Step 204: calculate the number of corner features in the video frame to be detected.

There are many corner detection algorithms, such as the FAST corner detection algorithm, the Harris corner detection algorithm, the CHOG corner detection algorithm, and the FREAK corner detection algorithm. Any one of them may be chosen, as all have good corner detection capability. By definition, a valid picture must first be sharp and second have reasonably rich texture. On these two grounds, the FAST corner detection algorithm can be adopted. When a picture is not sharp, it tends to have few FAST corners; for example, a picture that is largely blank or of a single color has very few FAST corners. Checking the number of FAST corners in the picture is therefore enough to determine whether it is a valid picture.

In addition to corner detection, practical applications may also judge picture validity with algorithms based on gradient features, edge features, and the like.

Step 205: determine whether the number of corner features is greater than a corner number threshold. If so, the frame is determined to be a clear frame image and is uploaded to the cloud server; otherwise, it is determined to be a blurred frame image.
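Steps 204 and 205 can be illustrated with a simplified FAST-style segment test in pure Python. This is only a sketch under stated assumptions: it works on a grayscale image given as a list of rows, omits the non-maximum suppression and speed optimizations of production FAST detectors, and the function names, intensity threshold `t`, arc length `n`, and `corner_threshold` default are hypothetical values chosen for the example.

```python
# Offsets (dx, dy) of the 16-pixel circle of radius 3 used by the FAST test.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def _has_run(flags, n):
    """True if at least n consecutive flags are set, allowing wrap-around."""
    run = 0
    for flag in flags + flags:  # doubling the list handles the wrap-around
        run = run + 1 if flag else 0
        if run >= n:
            return True
    return False


def count_fast_corners(image, t=20, n=9):
    """Count pixels that pass a FAST-style segment test (no suppression)."""
    height, width = len(image), len(image[0])
    count = 0
    for y in range(3, height - 3):
        for x in range(3, width - 3):
            p = image[y][x]
            ring = [image[y + dy][x + dx] for dx, dy in CIRCLE]
            brighter = [v > p + t for v in ring]
            darker = [v < p - t for v in ring]
            if _has_run(brighter, n) or _has_run(darker, n):
                count += 1
    return count


def is_clear_frame(image, corner_threshold=50):
    """Step 205: clear if the corner count exceeds the threshold (assumed value)."""
    return count_fast_corners(image, t=20, n=9) > corner_threshold
```

A uniformly gray image yields no corners under this test, which matches the observation that blank or single-color pictures have very few FAST corners; a textured image yields many more, and the threshold would need tuning for a real camera resolution.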

Step 206: receive the recognition result fed back by the cloud server and display the recognition result.

Refer to FIG. 5, which is a schematic structural diagram of the mobile terminal for real-time image recognition according to the present invention. The mobile terminal includes a data collection unit 501, a motion estimation unit 502, a clear frame determination unit 503, and a recognition result display unit 504.

The data collection unit 501 uses the camera of the mobile terminal to collect data in real time so as to acquire a video frame, and sends the video frame to the motion estimation unit 502.

The motion estimation unit 502 performs motion estimation on the video frame to determine the motion state of the video frame, and sends the motion state to the clear frame determination unit 503.

The clear frame determination unit 503 determines whether the motion state of the video frame is motion-to-stationary; if so, it determines the video frame to be a clear frame image and uploads the clear frame image to the cloud server.

The recognition result display unit 504 receives the recognition result fed back by the cloud server and displays the recognition result.
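The flow through these units can be sketched as a single gating function. This is a hedged illustration only: `handle_frame`, its parameters, the state label string, and the default `corner_threshold` are hypothetical, and the corner counter and uploader are passed in as callables so that no particular detector or network API is implied.

```python
def handle_frame(frame, motion_state, count_corners, upload, corner_threshold=50):
    """Upload a frame only if it is motion-to-stationary and rich in corners.

    Returns the recognition result from `upload`, or None if nothing was sent.
    All names and the threshold default are assumptions for illustration.
    """
    if motion_state != "motion_to_stationary":
        return None  # frame is not at the motion-to-stationary transition
    if count_corners(frame) <= corner_threshold:
        return None  # blurred frame image: too few corner features
    return upload(frame)  # clear frame image: send to the cloud server
```

Because every frame that fails either check returns before `upload` is called, only frames judged clear consume network traffic, which is the saving the method aims for.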

Preferably, the motion estimation unit 502 includes a motion vector calculation subunit 505 and a state determination subunit 506.

The motion vector calculation subunit 505 calculates the motion vector between the video frame and its previous video frame and sends it to the state determination subunit 506. The motion vector includes a motion amplitude and a motion direction.

The state determination subunit 506 determines the motion state of the video frame from the motion vector.

Preferably, the state determination subunit 506 includes a state determination module 507, which reads the stored background motion state. If the background motion state is stationary, and the motion amplitudes of N consecutive frames starting from the current frame are all greater than a first motion threshold, where N is a natural number and the current frame is frame 1, then the motion states of frames 1 through N+1 are stationary and the background motion state remains stationary, the motion state of frame N+1 is determined to be stationary-to-motion, and the background motion state is changed to motion. If the background motion state is stationary and the motion amplitude of the current frame is less than the first motion threshold, the motion state of the current frame remains stationary and the background motion state remains stationary.

If the background motion state is motion, and the motion amplitudes of N consecutive frames starting from the current frame are all less than a second motion threshold, where N is a natural number and the current frame is frame 1, then the motion states of frames 1 through N+1 are motion and the background motion state remains motion, the motion state of frame N+1 is determined to be motion-to-stationary, and the background motion state is changed to stationary. If the background motion state is motion and the motion amplitude of the current frame is greater than the second motion threshold, the motion state of the current frame remains motion and the background motion state remains motion.

Preferably, after the state determination module 507 determines that the background motion state is stationary and that the motion amplitude of the current frame is less than the first motion threshold, it further determines whether the motion amplitude is greater than a third motion threshold. If so, the motion of the current frame is a micro-motion and the background motion state remains stationary. If the motions of M consecutive frames starting from the current frame (the current frame being frame 1) are all micro-motions in the same direction, the motion state of frame M is determined to be stationary-to-motion and the background motion state is changed to motion, where M is a natural number.

Preferably, the motion vector calculation subunit 505 includes a motion vector determination module 508, which acquires the pixels of the central region of the previous video frame; searches, starting from the central region of the video frame and moving outward around it, for a region similar to the central-region pixels of the previous video frame so as to determine a matching block; and takes the position vector between the central region of the video frame and the matching block as the motion vector.
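The central-region search just described can be sketched as an exhaustive block-matching search. The sketch below is an illustration under assumptions: frames are grayscale lists of rows, similarity is measured by the sum of absolute differences (SAD), and the function name `estimate_motion`, the block size, and the search radius are hypothetical choices that the text does not specify.

```python
import math


def estimate_motion(prev, cur, block=8, radius=4):
    """Estimate the motion vector of the central block between two frames.

    Returns (dx, dy, amplitude, direction), with direction in radians.
    SAD matching, block size and search radius are assumptions.
    """
    height, width = len(prev), len(prev[0])
    top, left = (height - block) // 2, (width - block) // 2  # central region
    best, best_sad = (0, 0), None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y0, x0 = top + dy, left + dx
            if y0 < 0 or x0 < 0 or y0 + block > height or x0 + block > width:
                continue  # candidate block would fall outside the frame
            # Compare the previous frame's central block with this candidate.
            sad = sum(abs(prev[top + r][left + c] - cur[y0 + r][x0 + c])
                      for r in range(block) for c in range(block))
            if best_sad is None or sad < best_sad:
                best, best_sad = (dx, dy), sad
    dx, dy = best
    return dx, dy, math.hypot(dx, dy), math.atan2(dy, dx)
```

The returned amplitude and direction correspond to the two components of the motion vector that the state determination subunit consumes; restricting the search to the central block keeps the cost low compared with matching the whole frame.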

Preferably, the clear frame determination unit 503 includes a motion-to-stationary determination module 509 and a corner detection module 510.

The motion-to-stationary determination module 509 determines whether the motion state of the video frame is motion-to-stationary; if so, it sends a start command to the corner detection module 510.

The corner detection module 510 receives the start command from the motion-to-stationary determination module 509 and calculates the number of corner features in the video frame. It then determines whether the number of corner features is greater than a corner number threshold; if so, the video frame is determined to be a clear frame image and is uploaded to the cloud server; otherwise, it is determined to be a blurred frame image.

The mobile terminal described in the embodiments of the present invention may be implemented in hardware, software, firmware, or any combination of the three. If implemented in software, the functions may be stored on a computer-readable medium, or transmitted over a computer-readable medium as one or more instructions or code. Computer-readable media include computer storage media and communication media that facilitate the transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer. For example, such computer-readable media may include, but are not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store program code in the form of instructions or data structures, or in other forms readable by a general-purpose or special-purpose computer or a general-purpose or special-purpose processor. In addition, any connection may properly be termed a computer-readable medium. Disk and disc, as used here, include compact disc, laser disc, optical disc, DVD, floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs usually reproduce data optically with lasers. Combinations of the above may also be included within computer-readable media.

Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the invention. A person having ordinary skill in the art to which the invention pertains may make various changes and refinements without departing from the spirit and scope of the invention. The scope of protection of the invention is therefore defined by the appended claims.

Claims

A method for real-time image recognition in a mobile terminal, comprising the steps of: performing data collection in real time by using a camera of the mobile terminal to obtain a video frame; performing motion estimation on the video frame to determine a motion state of the video frame; determining whether the motion state of the video frame is motion-to-stationary and, if so, determining the video frame to be a clear frame image and uploading the clear frame image to a cloud server; and receiving a recognition result fed back by the cloud server and displaying the recognition result.

The method for real-time image recognition in a mobile terminal according to claim 1, wherein the step of performing motion estimation on the video frame to determine the motion state of the video frame comprises: calculating a motion vector between the video frame and its previous video frame, the motion vector including a motion amplitude and a motion direction; and determining the motion state of the video frame from the motion vector.

The method for real-time image recognition in a mobile terminal according to claim 2, wherein the step of determining the motion state of the video frame from the motion vector comprises: reading a stored background motion state; if the background motion state is stationary, and the motion amplitudes of N consecutive frames starting from the current frame are all greater than a first motion threshold, where N is a natural number and the current frame is frame 1, then the motion states of frames 1 through N+1 are stationary and the background motion state remains stationary, the motion state of frame N+1 is determined to be stationary-to-motion, and the background motion state is changed to motion; and if the background motion state is motion, and the motion amplitudes of N consecutive frames starting from the current frame are all less than a second motion threshold, where N is a natural number and the current frame is frame 1, then the motion states of frames 1 through N+1 are motion and the background motion state remains motion, the motion state of frame N+1 is determined to be motion-to-stationary, and the background motion state is changed to stationary.

The method for real-time image recognition in a mobile terminal according to claim 3, wherein after it is determined that the background motion state is stationary and that the motion amplitude of the current frame is less than the first motion threshold, the method further comprises: determining whether the motion amplitude is greater than a third motion threshold; if so, the motion of the current frame is a micro-motion and the background motion state remains stationary; and if the motions of M consecutive frames starting from the current frame, the current frame being frame 1, are all micro-motions in the same direction, determining the motion state of frame M to be stationary-to-motion and changing the background motion state to motion, where M is a natural number.

The method for real-time image recognition in a mobile terminal according to claim 3, wherein after it is determined that the background motion state is stationary, the method comprises: if, after the previous video frame, the motion amplitudes of two consecutive frames are both greater than the first motion threshold, and the motion directions of the two consecutive frames are determined to be opposite, determining that a jitter condition exists, in which case the motion states of the two consecutive frames are still determined to be stationary and the background motion state remains stationary.

The method for real-time image recognition in a mobile terminal according to claim 2, wherein the step of calculating the motion vector between the video frame and its previous video frame comprises: acquiring the pixels of a central region of the previous video frame; searching, starting from the central region of the video frame and around it, for a region similar to the central-region pixels of the previous video frame so as to determine a matching block; and taking the position vector between the central region of the video frame and the matching block as the motion vector.

The method for real-time image recognition in a mobile terminal according to any one of claims 1 to 6, wherein after it is determined that the motion state of the video frame is motion-to-stationary, the method further comprises: calculating the number of corner features of the video frame; and determining whether the number of corner features is greater than a corner number threshold and, if so, determining the video frame to be a clear frame image; otherwise, determining it to be a blurred frame image.

A mobile terminal for real-time image recognition, comprising a data collection unit, a motion estimation unit, a clear frame determination unit, and a recognition result display unit, wherein: the data collection unit uses a camera of the mobile terminal to collect data in real time so as to obtain a video frame, and sends the video frame to the motion estimation unit; the motion estimation unit performs motion estimation on the video frame to determine a motion state of the video frame, and sends the motion state to the clear frame determination unit; the clear frame determination unit determines whether the motion state of the video frame is motion-to-stationary and, if so, determines the video frame to be a clear frame image and uploads the clear frame image to a cloud server; and the recognition result display unit receives a recognition result fed back by the cloud server and displays the recognition result.

The mobile terminal according to claim 8, wherein the motion estimation unit comprises a motion vector calculation subunit and a state determination subunit, wherein: the motion vector calculation subunit calculates a motion vector between the video frame and its previous video frame and sends it to the state determination subunit, the motion vector including a motion amplitude and a motion direction; and the state determination subunit determines the motion state of the video frame from the motion vector.

The mobile terminal according to claim 9, wherein the state determination subunit comprises a state determination module that reads a stored background motion state; if the background motion state is stationary, and the motion amplitudes of N consecutive frames starting from the current frame are all greater than a first motion threshold, where N is a natural number and the current frame is frame 1, then the motion states of frames 1 through N+1 are stationary and the background motion state remains stationary, the motion state of frame N+1 is determined to be stationary-to-motion, and the background motion state is changed to motion; and if the background motion state is motion, and the motion amplitudes of N consecutive frames starting from the current frame are all less than a second motion threshold, where N is a natural number and the current frame is frame 1, then the motion states of frames 1 through N+1 are motion and the background motion state remains motion, the motion state of frame N+1 is determined to be motion-to-stationary, and the background motion state is changed to stationary.

The mobile terminal according to claim 10, wherein after the state determination module determines that the background motion state is stationary and that the motion amplitude of the current frame is less than the first motion threshold, it further determines whether the motion amplitude is greater than a third motion threshold; if so, the motion of the current frame is a micro-motion and the background motion state remains stationary; and if the motions of M consecutive frames starting from the current frame, the current frame being frame 1, are all micro-motions in the same direction, the motion state of frame M is determined to be stationary-to-motion and the background motion state is changed to motion, where M is a natural number.

The mobile terminal according to claim 9, wherein the motion vector calculation subunit comprises a motion vector determination module that acquires the pixels of a central region of the previous video frame; searches, starting from the central region of the video frame and around it, for a region similar to the central-region pixels of the previous video frame so as to determine a matching block; and takes the position vector between the central region of the video frame and the matching block as the motion vector.

The mobile terminal according to any one of claims 8 to 12, wherein the clear frame determination unit comprises a motion-to-stationary determination module and a corner detection module, wherein: the motion-to-stationary determination module determines whether the motion state of the video frame is motion-to-stationary and, if so, sends a start command to the corner detection module; and the corner detection module receives the start command from the motion-to-stationary determination module, calculates the number of corner features of the video frame, and determines whether the number of corner features is greater than a corner number threshold; if so, the video frame is determined to be a clear frame image and is uploaded to the cloud server; otherwise, it is determined to be a blurred frame image.