CN103530655A - Station logo identification method and system - Google Patents
- Publication number
- CN103530655A CN103530655A CN201310200339.1A CN201310200339A CN103530655A CN 103530655 A CN103530655 A CN 103530655A CN 201310200339 A CN201310200339 A CN 201310200339A CN 103530655 A CN103530655 A CN 103530655A
- Authority
- CN
- China
- Prior art keywords
- station caption
- image
- station
- detected
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses a station logo identification method and system. The method comprises: A. extracting an image of a region to be detected, obtaining the feature vector of the image to be detected within that region, calculating the matching degree between this feature vector and each pre-stored station logo feature vector, selecting the matching degrees greater than or equal to a preset value, and obtaining the station logo information corresponding to the selected matching degrees; B. looking up the corresponding relation filter function according to the obtained station logo information, and calculating the similarity between the image to be detected and each obtained station logo according to that filter function; C. obtaining the maximum similarity, looking up the station logo information corresponding to it, and outputting the identification result. The candidate station logo information is obtained by calculating the matching degree between the feature vector of the image to be detected and the feature vector of each station logo; the similarity between the image to be detected and each corresponding station logo is then calculated, and the station logo information with the maximum similarity is selected, so that station logos are identified effectively while both efficiency and identification speed are improved.
Description
Technical Field
The invention relates to the technical field of image recognition, in particular to a station caption recognition method and system.
Background
The station caption of a television station is an important mark that distinguishes the station, carries important semantic information such as the station name and its programs, and is one of the important semantic sources for video analysis, understanding and retrieval. Effective station caption identification can provide support for video-viewing behavior statistics, program previewing and the like, and therefore has very important research and application value.
During television playback, the pixel values of a station caption change with the complex background behind it, a phenomenon that is especially pronounced for semi-transparent captions; moreover, some caption edges vary smoothly and gradually, so their edge features are weak. This makes correct segmentation of the station caption, effective description of its features and effective identification of the caption difficult, and station caption recognition accuracy is consequently low.
Thus, the prior art has yet to be improved and enhanced.
Disclosure of Invention
In view of the technical problem of low recognition accuracy of station captions in the prior art, the present invention aims to provide a station caption recognition method and system.
In order to achieve the purpose, the invention adopts the following technical scheme:
a station caption identification method comprises the following steps:
A. extracting an image of a region to be detected, acquiring a feature vector of the image to be detected in the image of the region to be detected, calculating the matching degree of the feature vector with each pre-stored station caption feature vector, selecting the matching degrees greater than or equal to a preset value, and acquiring the station caption information corresponding to the selected matching degrees;
B. searching the corresponding relation filter function according to the acquired station caption information, and calculating the similarity between the image to be detected and the acquired station caption according to the relation filter function;
C. and acquiring the maximum similarity, searching station caption information corresponding to the maximum similarity, and outputting an identification result.
In the station caption identifying method, before step a, the station caption identifying method further includes:
and A0, collecting a positive sample of each station caption, manufacturing a standard template of each station caption, and obtaining a feature vector and a relation filter function of each station caption.
In the station caption identifying method, the step a specifically includes:
a1, extracting an image of a region to be detected containing a station caption;
a2, selecting a standard template of a station caption from a pre-stored station caption library to traverse an area of the image of the area to be detected, and calculating a feature vector of the current area;
a3, comparing the calculated characteristic vector with the characteristic vector of the selected station caption, and calculating the matching degree; when the matching degree is greater than or equal to a preset threshold value, executing the step A4, otherwise, traversing to the next region, and continuously calculating the feature vector of the current region until all regions of the image of the region to be detected are traversed;
and A4, traversing an area of the image of the area to be detected by the standard template of the next station caption, calculating a feature vector of the currently obtained area, comparing the calculated feature vector with the feature vector of the next station caption, and calculating the matching degree until all pre-stored standard templates of the station caption traverse the image of the area to be detected, so as to obtain the station caption information corresponding to the matching degree which is greater than or equal to the preset value.
In the station caption identifying method, the step B specifically includes:
b1, inquiring a station caption relation filter function corresponding to the acquired station caption information, and calculating and outputting a two-dimensional correlation plane of the image to be detected under the filter function corresponding to the inquired station caption;
b2, calculating the similarity between the image to be detected and the corresponding station caption based on the two-dimensional correlation plane, and judging whether the similarity between the image to be detected and all the station captions corresponding to the acquired station caption information has been calculated; if so, step C is performed, otherwise return to step B1.
In the station caption identification method, before the characteristic vector of each station caption is obtained, the characteristic vectors of all positive samples of each station caption are obtained; when obtaining the feature vectors of all positive samples of the current station caption, the method specifically includes:
a01, if the pixel value x (i, j) of the current pixel point (i, j) in a positive sample x of the current station caption is larger than or equal to half of the sum of the average value of the pixel values of all the pixel points on the abscissa of the pixel point and the average value of the pixel values of all the pixel points on the ordinate of the pixel point, setting the current pixel value x (i, j) as 1, otherwise, setting the current pixel value x (i, j) as 0;
a02, converting the two-dimensional image of size w × q, obtained by normalizing the positive sample of the station caption and the standard template, into a one-dimensional array to obtain the feature vector $p_k$ of the current positive sample, the feature vector $p_k$ having m rows and one column; wherein w is the length of the two-dimensional image, q is the width of the two-dimensional image, and m = w × q;
a03, obtaining the next positive sample of the current station caption, repeating the steps A01 and A02 until the feature vectors of all the positive samples of the current station caption are completed.
In the station caption identification method, in step a03, the feature vector of the current station caption is obtained by calculating according to the following formula:
when in use <math>
<mrow>
<munderover>
<mi>Σ</mi>
<mrow>
<mi>j</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>n</mi>
</munderover>
<mi>P</mi>
<mrow>
<mo>(</mo>
<mi>i</mi>
<mo>,</mo>
<mi>j</mi>
<mo>)</mo>
</mrow>
<mo><</mo>
<munderover>
<mi>Σ</mi>
<mrow>
<mi>j</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>n</mi>
</munderover>
<mrow>
<mo>(</mo>
<mi>P</mi>
<mrow>
<mo>(</mo>
<mi>i</mi>
<mo>,</mo>
<mi>j</mi>
<mo>)</mo>
</mrow>
<mo>-</mo>
<mn>1</mn>
<mo>)</mo>
</mrow>
</mrow>
</math> When is, pf(i)=0
When in use <math>
<mrow>
<munderover>
<mi>Σ</mi>
<mrow>
<mi>j</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>n</mi>
</munderover>
<mi>P</mi>
<mrow>
<mo>(</mo>
<mi>i</mi>
<mo>,</mo>
<mi>j</mi>
<mo>)</mo>
</mrow>
<mo>≥</mo>
<munderover>
<mi>Σ</mi>
<mrow>
<mi>j</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>n</mi>
</munderover>
<mrow>
<mo>(</mo>
<mi>P</mi>
<mrow>
<mo>(</mo>
<mi>i</mi>
<mo>,</mo>
<mi>j</mi>
<mo>)</mo>
</mrow>
<mo>-</mo>
<mn>1</mn>
<mo>)</mo>
</mrow>
</mrow>
</math> When is, pf(i)=1
where j denotes the jth sample, i the ith feature value, and P(i, j) the ith feature value of the jth positive sample; $p_f$ is the feature vector of the current station caption, an m × 1 column vector, and $p_f(i)$ is the ith feature value of $p_f$.
In the station caption identification method, in step A0, the relation filter function of the current station caption is:
$h = D^{-1}X\left(X^{+}D^{-1}X\right)^{-1}u$
where D is the diagonal matrix of the average power spectrum of the positive samples (the standard MACE definition), $^{+}$ denotes the conjugate transpose of a complex vector, $X_k$ is the Fourier transform of $x_k$, $x_k$ denotes the kth positive sample, and $X_k^{*}$ is the conjugate transpose of $X_k$;
furthermore, the relation filter function must also satisfy the following constraint at the origin, namely:
$X^{+}h = du$
where X is a matrix of dimension d × N, u is the selected specific output limit, and N is the number of relation filter functions.
In the station caption identification method, the calculation formula of the matching degree between the feature vector of the image of the area to be detected and the feature vector of each station caption is as follows:
wherein p(i) is the ith feature value of the image of the region to be detected, $p_f(i)$ is the ith feature value of the current station caption, $d = d_1 \times d_2$ is the size of a positive sample, $d_1$ denotes the length of the positive sample, and $d_2$ its width.
In the station caption identifying method, in the step B, the calculation formula of the similarity is as follows:
wherein peak is the peak value in the two-dimensional correlation plane, mean and sigma are the average value and standard deviation in the effective area around the peak value respectively;
the two-dimensional correlation plane is obtained by the following formula:
wherein h (x, y) is a two-dimensional filter, and l (x, y) is the current traversal image.
A station logo identification system, comprising:
the image extraction module is used for extracting an image of the area to be detected;
the storage module is used for storing the characteristic vectors and the relational filter functions of the station captions;
the processing module is used for acquiring the feature vector of the image to be detected in the image of the region to be detected, calculating the matching degree between the feature vector of the image to be detected and each pre-stored station caption feature vector, selecting the matching degrees greater than or equal to a preset value, and acquiring the station caption information corresponding to the selected matching degrees; and for searching the corresponding relation filter function according to the obtained station caption information and calculating the similarity between the image to be detected and the obtained station caption according to the relation filter function;
and the identification module is used for acquiring the maximum similarity, searching the station caption information corresponding to the maximum similarity and outputting an identification result.
Compared with the prior art, the station caption identification method and system provided by the invention first obtain the candidate station caption information by calculating the matching degree between the feature vector of the image to be detected and the feature vector of each station caption as an initial screening; the similarity between the image to be detected and each corresponding station caption is then calculated, and the station caption information with the maximum similarity is extracted. The station caption is thus identified effectively even when color and background information are disregarded, and both efficiency and identification speed are improved.
Drawings
Fig. 1 is a flowchart of a method of identifying station captions according to the present invention.
Fig. 2 is a schematic diagram of traversing an image to be detected in the station caption identifying method provided by the invention.
Fig. 3 is a schematic diagram of calculating similarity by using a two-dimensional correlation output plane in the station caption identifying method provided by the present invention.
Fig. 4 is a block diagram of a station caption detection system provided by the present invention.
Detailed Description
The invention provides a station caption identification method and a station caption identification system, which adopt a coarse-to-fine identification method, firstly, station caption information contained in an image to be detected is obtained through the characteristic vector value of each station caption, and the detection range is narrowed; and then, accurately identifying by using a related filtering function, searching station caption information with the maximum similarity, and obtaining an identification result.
In order to make the objects, technical solutions and effects of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
Please refer to fig. 1, which is a flowchart illustrating a method for station caption identification according to the present invention. The station caption identification method provided by the invention comprises the following steps:
s100, extracting an image of a region to be detected, acquiring a feature vector of the image to be detected in the image of the region to be detected, calculating the matching degree of the feature vector with each pre-stored station caption feature vector, selecting the matching degrees greater than or equal to a preset value, and acquiring the station caption information corresponding to the selected matching degrees;
s200, finding the corresponding relation filter function according to the acquired station caption information, and calculating the similarity between the image to be detected and the acquired station caption according to the relation filter function;
s300, obtaining the maximum similarity, searching station caption information corresponding to the maximum similarity, and outputting an identification result.
Before station caption identification is performed on the image to be detected, the method needs to be trained on a large number of samples to obtain and store the relevant data of each station caption (such as its feature vector and relation filter function). Therefore, before step S100, the station caption identification method further includes: S10, collecting the positive samples of each station caption, making the standard template of each station caption, and obtaining the feature vector and relation filter function of each station caption.
In the embodiment of the invention, when the feature vector of each station caption is obtained, the feature vectors of all positive samples of that station caption are obtained first; therefore, before obtaining the feature vectors of all positive samples of the current station caption, the positive samples of each station caption need to be collected. The positive samples of the same station caption all have the same size, $d_1 \times d_2$ ($d_1$ denotes the length of the positive sample, $d_2$ its width), and preferably completely contain the station caption region. Different positive samples of each station caption need to be acquired from different video content, so that, apart from the station caption itself, the samples differ in other respects (such as background pattern and color).
And then, making a standard template I of each station caption, wherein the standard template I of each station caption is a binary image without any background, the size of the binary image is consistent with that of the corresponding positive sample, and only one standard template I of each station caption is needed.
Then, normalization operation is performed on all the positive samples and the standard template I, the image size is uniformly normalized to w × q size (for example, 64 × 64), and the sizes of the positive samples and the standard template I are reduced, thereby increasing the calculation speed. Where w is the length of the two-dimensional image and q is the width of the two-dimensional image.
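As a minimal sketch of this normalization step (the use of OpenCV and the 64 × 64 default are choices of this illustration, not mandated by the text):

```python
import cv2

def normalize_to_uniform_size(img, w=64, q=64):
    # Scale a positive sample or the standard template I to the uniform
    # w x q size, so that all feature vectors have the same length m = w*q.
    return cv2.resize(img, (w, q), interpolation=cv2.INTER_AREA)
```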
After the normalization operation is completed, calculating the feature vectors of all positive samples of each station caption, which specifically includes:
s11, if the pixel value x (i, j) of the current pixel point (i, j) in a positive sample x of the current station caption is larger than or equal to half of the sum of the average value of the pixel values of all the pixel points on the abscissa of the pixel point and the average value of the pixel values of all the pixel points on the ordinate of the pixel point, setting the current pixel value x (i, j) as 1, otherwise, setting the current pixel value x (i, j) as 0;
s12, converting the two-dimensional image of size w × q, obtained by normalizing the positive sample of the station caption and the standard template, into a one-dimensional array to obtain the feature vector $p_k$ of the current positive sample (i.e. the kth sample), the feature vector $p_k$ having m rows and one column; wherein w is the length of the two-dimensional image, q is the width of the two-dimensional image, and m = w × q;
and S13, acquiring the next positive sample of the current station caption, and repeating the steps S11 and S12 until the feature vectors of all the positive samples of the current station caption are completed.
Taking the calculation of the image feature vectors of all positive samples of the Hunan satellite television as an example, when a certain positive sample x is calculated, the feature vector calculation method of the positive sample is as follows:
a) If the value x(i, j) of the current pixel point (i, j) in the positive sample x is greater than or equal to half of the sum of the average pixel value of its row (i.e. along the abscissa of the pixel) and the average pixel value of its column (i.e. along the ordinate of the pixel), the current pixel value x(i, j) is set to 1, otherwise it is set to 0; that is, x(i, j) is set to 1 when $x(i,j) \ge \tfrac{1}{2}\left(H_j + V_i\right)$.
$H_j$ and $V_i$ denote, respectively, the average pixel value of all pixels in the jth row and in the ith column under the mask of the standard template I. Since I is binary, they can be written as
$H_j = \frac{1}{a_j}\sum_{i} x(i,j)\,I(i,j), \qquad V_i = \frac{1}{b_i}\sum_{j} x(i,j)\,I(i,j)$
where $a_j$ is the number of non-zero pixels in the jth row of the standard template I, $b_i$ is the number of non-zero pixels in the ith column of I, and I(i, j) is the pixel value of the standard template I at pixel (i, j).
b) Converting the obtained two-dimensional image of size w × q into a one-dimensional array gives the feature vector $p_k$ of the current positive sample (i.e. the kth sample); the feature vector $p_k$ has m (m = w × q) rows and one column.
The next sample is then obtained and processes a) and b) repeated, giving the feature vector set $P = [p_1, p_2, p_3, \ldots, p_k, \ldots, p_n]$ of all positive samples of Hunan Satellite TV, where n is the number of positive samples.
Specifically, the feature vector set P of all samples is the m × n matrix with entries P(l, k):
where $p_k$ is the feature vector of the kth positive sample (i.e. the feature vector of the current positive sample); P(l, k) is the lth feature value of the kth positive sample; $1 \le l \le m$, with m = w × q; k is the index of the sample, i.e. the kth sample, $1 \le k \le n$; and n is the number of positive samples.
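The per-sample feature extraction of steps a) and b) can be sketched as follows, assuming grayscale arrays; the function name and the exact masked-mean convention are assumptions of this illustration:

```python
import numpy as np

def sample_feature_vector(x, template):
    """Step a): binarize positive sample x using the row/column mean pixel
    values taken under the mask of the binary standard template I;
    step b): flatten the result to an m = w*q element vector p_k."""
    mask = (template > 0).astype(np.float64)
    a = np.maximum(mask.sum(axis=1), 1)          # non-zero pixels per row
    b = np.maximum(mask.sum(axis=0), 1)          # non-zero pixels per column
    H = (x * mask).sum(axis=1) / a               # masked row means H_j
    V = (x * mask).sum(axis=0) / b               # masked column means V_i
    binary = (x >= 0.5 * (H[:, None] + V[None, :])).astype(np.uint8)
    return binary.reshape(-1)                    # p_k: m values, one per pixel

# Feature matrix P (m x n), one column per positive sample:
# P = np.stack([sample_feature_vector(x, I) for x in samples], axis=1)
```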
In the embodiment of the invention, after the feature vectors of all positive samples of a station caption are obtained, the feature vector $p_f$ of the current station caption is obtained; the calculation formula is as follows:
when in use <math>
<mrow>
<munderover>
<mi>Σ</mi>
<mrow>
<mi>j</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>n</mi>
</munderover>
<mi>P</mi>
<mrow>
<mo>(</mo>
<mi>i</mi>
<mo>,</mo>
<mi>j</mi>
<mo>)</mo>
</mrow>
<mo><</mo>
<munderover>
<mi>Σ</mi>
<mrow>
<mi>j</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>n</mi>
</munderover>
<mrow>
<mo>(</mo>
<mi>P</mi>
<mrow>
<mo>(</mo>
<mi>i</mi>
<mo>,</mo>
<mi>j</mi>
<mo>)</mo>
</mrow>
<mo>-</mo>
<mn>1</mn>
<mo>)</mo>
</mrow>
</mrow>
</math> When is, pf(i)=0
When in use <math>
<mrow>
<munderover>
<mi>Σ</mi>
<mrow>
<mi>j</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>n</mi>
</munderover>
<mi>P</mi>
<mrow>
<mo>(</mo>
<mi>i</mi>
<mo>,</mo>
<mi>j</mi>
<mo>)</mo>
</mrow>
<mo>≥</mo>
<munderover>
<mi>Σ</mi>
<mrow>
<mi>j</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>n</mi>
</munderover>
<mrow>
<mo>(</mo>
<mi>P</mi>
<mrow>
<mo>(</mo>
<mi>i</mi>
<mo>,</mo>
<mi>j</mi>
<mo>)</mo>
</mrow>
<mo>-</mo>
<mn>1</mn>
<mo>)</mo>
</mrow>
</mrow>
</math> When is, pf(i)=1
where j denotes the jth sample, i the ith feature value, and P(i, j) the ith feature value of the jth positive sample; $p_f$ is the feature vector of the current station caption, an m × 1 column vector, and $p_f(i)$ is the ith feature value of $p_f$ (that is, each feature value of $p_f$ is the value taken by the majority of the positive samples).
Similarly, taking the feature vector of Hunan Satellite TV as an example: the feature vector set P of all samples obtained in the above steps is an m × n matrix, where m = w × q, and the feature vector $p_f$ of the current station caption (here, Hunan Satellite TV) is an m × 1 column vector; the values of $p_f$ are obtained by the above calculation formula for the feature vector of the current station caption.
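Since $p_f(i)$ is 1 exactly when at least half of the n binary feature values at position i are 1, the station caption feature vector reduces to a majority vote over the columns of P; a minimal sketch:

```python
import numpy as np

def station_feature_vector(P):
    """Majority vote over the m x n binary feature matrix P:
    p_f(i) = 1 when sum_j P(i,j) >= sum_j (1 - P(i,j)), i.e. when at
    least half of the n positive samples have value 1 at position i."""
    n = P.shape[1]
    return (P.sum(axis=1) >= n / 2).astype(np.uint8)
```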
And then, calculating the feature vectors of all station captions (such as CCTV1, Shanxi satellite television, Jiangxi satellite television, Tianjin satellite television, … …) according to the feature vector calculation method of Hunan satellite television.
Before station caption identification, the invention also needs to obtain the relation filter function of all station captions on the basis of obtaining the positive samples of all station captions, and the calculation mode is as follows:
$h = D^{-1}X\left(X^{+}D^{-1}X\right)^{-1}u$
the above-mentioned relational filter function is obtained by using lagrange multiplier method, and the relational filter is a basic tool for pattern recognition in frequency domain, and its most common algorithm is the minimum average correlation energy filter (MACE). As mentioned above, there are n pieces of d (taking Hunan Wei as an example)1×d2Sample x ofkLet d = d1×d2Then, the output image g of the positive sample after passing through the relation filter function h is inputk(n) is:
wherein, suppose xk(x, y) is a two-dimensional input signal, i.e. the k sample xkH (x, y) is a two-dimensional filter, gk(x, y) is the two-dimensional correlation plane for the kth sample, and l (x, y) is the current traversal image.
According to the Parseval theorem (Parseval theorem), the optimization goal of the minimum average correlation energy filter is to minimize the average energy (ACE) of an image, namely:
wherein,and + represents the conjugate transpose of the complex vector, XkIs xkFourier transform of (2), Xk *Is XkThe conjugate transpose matrix of (2).
Furthermore, the relevant relational filter function also has to satisfy the following constraints and requirements at the origin, namely:
$X^{+}h = du$
where X is a matrix of dimension d × N whose column vectors are the Fourier transforms of the training images, u is the selected specific output limit, and N is the number of possible correlation filter functions.
In the specific implementation process, the relationship filter functions h under each station caption are different, but the relationship filter functions h of the MACE filters under all station captions (such as CCTV1, shanxi satellite television, jiangxi satellite television, tianjin satellite television, … … and the like) can be obtained by repeating the calculation process.
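A frequency-domain sketch of the MACE computation $h = D^{-1}X(X^{+}D^{-1}X)^{-1}u$; the definition of D as the average power spectrum follows the standard MACE formulation and is an assumption of this illustration:

```python
import numpy as np

def mace_filter(samples, u=None):
    """samples: list of equally sized 2-D positive samples of one caption.
    Returns the frequency-domain MACE filter, reshaped to the sample size."""
    X = np.stack([np.fft.fft2(s).ravel() for s in samples], axis=1)  # d x N
    d, N = X.shape
    if u is None:
        u = np.ones(N, dtype=complex)       # unit correlation peak per sample
    D = (np.abs(X) ** 2).mean(axis=1)       # diagonal of D: average power spectrum
    DiX = X / D[:, None]                    # D^{-1} X
    A = X.conj().T @ DiX                    # X^+ D^{-1} X  (N x N)
    h = DiX @ np.linalg.solve(A, u)         # D^{-1} X (X^+ D^{-1} X)^{-1} u
    return h.reshape(samples[0].shape)
```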
The invention can carry out rapid and accurate identification on the station caption after the characteristic vector and the relation filter function of the station caption are determined. In a specific implementation process, the step S100 specifically includes:
firstly, extracting an image of a region to be detected containing a station caption. As shown in fig. 2, when the television is played, the station caption is generally located in the upper left corner area or the upper right corner area of the display screen, so that when the station caption is identified in this embodiment, images of the upper left corner area and the upper right corner area of the video need to be extracted.
And secondly, selecting a standard template of a station caption from the pre-stored station caption library to traverse one region of the image of the region to be detected, and calculating the feature vector of the current region. The standard template is smaller than the image area. Referring to fig. 2, when the matching degree between a station caption and the image of the region to be detected is calculated, the standard template of size $d_1 \times d_2$ from the station caption template library is used to traverse a region of the image of the region to be detected, and a traversal image l is extracted, such as the region indicated by the thick solid rectangular box in fig. 2.
Thirdly, comparing the calculated characteristic vector with the characteristic vector of the selected station caption, and calculating the matching degree; and executing the fourth step when the matching degree is greater than or equal to a preset threshold, otherwise executing the fifth step. When the matching degree is greater than or equal to a preset threshold value, indicating that the station caption information is contained in the image to be detected; if the matching degree is smaller than the preset threshold, the next region of the image of the region to be detected, such as the dashed box region shown in fig. 2, needs to be traversed, and then judgment is performed.
And fourthly, traversing one region of the image of the region to be detected by using the standard template of the next station caption, calculating the feature vector of the currently obtained region, comparing the calculated feature vector with the feature vector of the next station caption, and calculating the matching degree until all pre-stored standard templates of the station caption traverse the image of the region to be detected, so as to obtain the station caption information corresponding to the matching degree which is greater than or equal to the preset value for subsequent accurate identification.
And fifthly, traversing to the next region and continuing to calculate the feature vector of the current region until all regions of the image of the region to be detected have been traversed. If, after all regions have been traversed, the matching degree of the standard template of the current station caption is still smaller than the preset threshold, the image of the region to be detected does not contain the current station caption, and the standard template of the next station caption then traverses the image. If, after the traversal, the matching degrees of some station caption templates are greater than or equal to the preset threshold, the image of the region to be detected contains the corresponding station caption information.
In the embodiment of the invention, when the traversal reaches a new position, the feature vector p of the current region image is obtained in the manner described above and compared with the feature vector $p_f$ of the current station caption, and the matching degree $M_a$ between the feature vector of the traversal image l (i.e. the image to be detected) and the feature vector of the current station caption is calculated as follows:
where p(i) is the ith feature value of the image of the region to be detected, $p_f(i)$ is the ith feature value of the current station caption, and $d = d_1 \times d_2$ is the size of positive sample $x_k$, with $x_k$ denoting the kth positive sample.
After all station captions have traversed the image of the region to be detected, the station captions whose matching degree is greater than or equal to $T_m$ ($T_m$ being a preset threshold) are determined to be the station caption information possibly contained in the image to be detected. After the matching degree is calculated, only the station caption information with a high matching degree needs to be considered, which greatly narrows the station caption identification range. Step S100 compares the matching degree of the current traversal region with the threshold; once it is greater than or equal to the threshold, the remaining regions need not be traversed. For example, a region to be detected may theoretically require 100 traversal steps, but if the matching degree reaches the threshold at the 30th step, the remaining 70 steps are skipped, which speeds up station caption identification.
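The coarse screening loop might look as follows. This is a sketch under stated assumptions: the patent's exact matching-degree formula is not reproduced here, so the fraction of agreeing binary feature values stands in for $M_a$, and `extract` is the feature extractor of steps A01-A02, with each $p_f$ computed at its template's size:

```python
import numpy as np

def find_candidates(region_img, station_db, extract, t_m=0.9):
    """station_db maps caption name -> (standard template, feature p_f).
    Returns captions whose matching degree reaches T_m somewhere in the
    region image, stopping the traversal early once the threshold is met."""
    candidates = []
    rows, cols = region_img.shape
    for name, (template, p_f) in station_db.items():
        d1, d2 = template.shape
        found = False
        for y in range(rows - d1 + 1):
            for x in range(cols - d2 + 1):
                p = extract(region_img[y:y + d1, x:x + d2], template)
                if (p == p_f).mean() >= t_m:   # stand-in for M_a >= T_m
                    found = True
                    break
            if found:
                break
        if found:
            candidates.append(name)
    return candidates
```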
Therefore, after the matching degree is calculated, similarity judgment is carried out on the station caption information. The step S200 specifically includes:
firstly, inquiring a station caption relation filter function corresponding to the acquired station caption information, and calculating and outputting a two-dimensional correlation plane of the image to be detected under the corresponding relation filter function of the inquired station caption;
secondly, calculating the similarity between the image of the area to be detected and the corresponding station caption based on the two-dimensional correlation plane, and judging whether the calculation of the similarity between the image of the area to be detected and all the station captions corresponding to the acquired station caption information is finished or not; if yes, executing step S300, otherwise, acquiring the relation filter function of the next station caption in the station caption library, and executing the first step.
When the similarity judgment is performed, accurate identification is carried out with the relation filter function. The similarity is calculated from the correlation plane in peak-to-sidelobe form:
$S = \dfrac{\text{peak} - \text{mean}}{\sigma}$
where peak is the peak value in the two-dimensional correlation plane, and mean and σ are, respectively, the average value and standard deviation in the effective area around the peak;
the two-dimensional correlation plane is obtained by correlating the traversal image with the filter:
$y(x,y) = \sum_{u}\sum_{v} h(u,v)\,l(x+u,\,y+v)$
where y(x, y) is the two-dimensional correlation plane of the current traversal image, h(x, y) is the two-dimensional filter, and l(x, y) is the current traversal image.
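In practice the correlation plane can be computed through the FFT; circular boundary handling is an assumption of this sketch (the text does not specify it):

```python
import numpy as np

def correlation_plane(l, h):
    # y = two-dimensional cross-correlation of traversal image l with filter h
    L = np.fft.fft2(l)
    H = np.fft.fft2(h, s=l.shape)
    return np.real(np.fft.ifft2(L * np.conj(H)))
```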
Also taking Hunan Satellite TV (say, the mth station caption) as an example, the two-dimensional correlation plane y(x, y) of the traversal image l under the relation filter h of the current station caption A (Hunan Satellite TV) is first calculated.
Then the similarity $S_1$ between the current region and station caption A (Hunan Satellite TV) is calculated. Because the output plane y is a two-dimensional correlation plane, the correlation energy ratio is used as the similarity matching index; according to the correlation energy principle, the similarity is as given above, where peak is the peak value in the two-dimensional correlation plane (the maximum of y), peak = max(y), located at position (x, y), and mean and σ are, respectively, the average value and standard deviation of the effective area around the peak, e.g. the part of region G not covered by region R in fig. 3. Region G is a square region centered on the peak with side length g;
region R is a square region centered on the peak with side length r, and g is larger than r. The effective area is region G minus region R. The side length of region R in fig. 3 is generally a fraction of that of region G, in this example one quarter (length and width are each one quarter of those of G), so if region G is 20 × 20, region R is 5 × 5.
After the similarity of the mth station caption is obtained, the next station caption B (i.e. the (m+1)th station caption) among the candidate station caption information is processed in the same way, until the similarities of all possible station captions $\{S_1, S_2, S_3, \ldots, S_m, \ldots\}$ have been calculated.
Then the similarities $\{S_1, S_2, S_3, \ldots, S_m, \ldots\}$ between the image to be detected and all possible station captions are searched; if, for example, $\max\{S_1, S_2, S_3, \ldots, S_m, \ldots\} = S_m$, the image to be detected contains the mth station caption in the standard template library, i.e. the mth station caption has the maximum similarity and is the station caption contained in the image to be detected, and station caption identification is complete.
Taking the CCTV channels as an example: when the matching degree is calculated in step S100, the station caption information of CCTV-1, CCTV-3 and CCTV-7 is found to be possibly contained in the image of the region to be detected. Then, in step S200, the similarity of the image to be detected under the relation filter functions of CCTV-1, CCTV-3 and CCTV-7 is calculated; if the calculated similarity for CCTV-1 is the largest, the identification result output is CCTV-1.
To improve identification precision, the matching degree can instead be calculated while traversing the whole image of the region to be detected; all matching degrees are then compared to obtain those greater than or equal to the threshold, after which the similarity under the relation filter functions of the corresponding station captions is calculated and the station caption identified. Suppose a region to be detected theoretically requires 100 traversal steps in total, yielding 100 matching degrees, and the maximum is then compared with the threshold. Specifically: for the region traversed by each station caption, several matching degrees are computed and the maximum is kept, so the set of station captions yields a set of maximum matching degrees; these maxima are compared with the threshold to obtain the station caption information whose matching degree is greater than or equal to the threshold; the similarities under the retained station captions are then calculated, the maximum similarity obtained, and the identification result output. For example, with 100 station captions in total, the invention first stores the 100 station caption feature vectors, obtains the feature vector of the current image to be detected, and compares it with the 100 stored feature vectors to obtain 100 matching degrees; if 20 of them are greater than or equal to the threshold $T_m$, the relation filter functions of those 20 station captions are used for the next stage of calculation, i.e. the image is likely to contain one of those 20 station captions and the relation filter functions must be used for the final determination.
Based on the station caption identification method, the invention also correspondingly provides a station caption identification system which comprises an image extraction module 101, a storage module 102, a processing module 103 and an identification module 104, wherein the image extraction module 101, the storage module 102 and the identification module 104 are all connected with the processing module 103.
The image extraction module 101 is configured to extract an image of a region to be detected. The storage module 102 is configured to store the feature vectors and relation filter functions of the station captions. The processing module 103 is configured to obtain the feature vector of the image to be detected in the image of the region to be detected, calculate the matching degree between that feature vector and each pre-stored station caption feature vector, select the matching degrees greater than or equal to a preset value, and obtain the station caption information corresponding to the selected matching degrees; and to search the corresponding relation filter function according to the obtained station caption information and calculate the similarity between the image to be detected and the obtained station caption according to the relation filter function. The identification module 104 is configured to obtain the maximum similarity, search for the station caption information corresponding to the maximum similarity, and output the identification result.
In this embodiment, the processing module 103 is further configured to calculate a feature vector and a relationship filter function of each station caption, so as to perform matching degree and similarity calculation when identifying the station caption. Since the functions of the modules in the station caption identifying system have been described in detail in the station caption identifying method, the details are not described here.
In summary, the invention scales the image to be detected to a uniform size and obtains its feature vector, calculates the matching degree between this feature vector and that of each station caption, and narrows the station caption identification range according to the matching degree; it then obtains the filter function of each candidate station caption, calculates the similarity between the image to be detected and the station caption corresponding to that filter function, and finds the station caption information with the maximum similarity. Station caption information can thus be identified accurately and quickly against a variety of complex backgrounds, providing effective technical support for automatic video searching, recording, analysis and retrieval in multimedia technology.
It should be understood that equivalents and modifications of the technical solution and inventive concept thereof may occur to those skilled in the art, and all such modifications and alterations should fall within the scope of the appended claims.
Claims (10)
1. A station caption identification method is characterized by comprising the following steps:
A. extracting an image of a region to be detected, acquiring a feature vector of the image to be detected in the image of the region to be detected, calculating the matching degree of the feature vector with each pre-stored station caption feature vector, selecting the matching degrees greater than or equal to a preset value, and acquiring the station caption information corresponding to the selected matching degrees;
B. searching the corresponding relation filter function according to the acquired station caption information, and calculating the similarity between the image to be detected and the acquired station caption according to the relation filter function;
C. and acquiring the maximum similarity, searching station caption information corresponding to the maximum similarity, and outputting an identification result.
2. The station caption identifying method according to claim 1, wherein before the step a, the station caption identifying method further comprises:
and A0, collecting a positive sample of each station caption, manufacturing a standard template of each station caption, and obtaining a feature vector and a relation filter function of each station caption.
3. The station caption identification method according to claim 2, wherein the step a specifically includes:
a1, extracting an image of a region to be detected containing a station caption;
a2, selecting a standard template of a station caption from a pre-stored station caption library to traverse an area of the image of the area to be detected, and calculating a feature vector of the current area;
a3, comparing the calculated characteristic vector with the characteristic vector of the selected station caption, and calculating the matching degree; when the matching degree is greater than or equal to a preset threshold value, executing the step A4, otherwise, traversing to the next region, and continuously calculating the feature vector of the current region until all regions of the image of the region to be detected are traversed;
and A4, traversing an area of the image of the area to be detected by the standard template of the next station caption, calculating a feature vector of the currently obtained area, comparing the calculated feature vector with the feature vector of the next station caption, and calculating the matching degree until all pre-stored standard templates of the station caption traverse the image of the area to be detected, so as to obtain the station caption information corresponding to the matching degree which is greater than or equal to the preset value.
4. The station caption identification method according to claim 1, wherein the step B specifically includes:
b1, inquiring a station caption relation filter function corresponding to the acquired station caption information, and calculating and outputting a two-dimensional correlation plane of the image to be detected under the filter function corresponding to the inquired station caption;
b2, calculating the similarity between the image to be detected and the corresponding station caption based on the two-dimensional correlation plane, and judging whether the similarity between the image to be detected and all the station captions corresponding to the acquired station caption information has been calculated; if so, step C is performed, otherwise return to step B1.
5. The station caption identifying method according to claim 2, wherein before the feature vector of each station caption is obtained, feature vectors of all positive samples of each station caption are obtained; when obtaining the feature vectors of all positive samples of the current station caption, the method specifically includes:
a01, if the pixel value x (i, j) of the current pixel point (i, j) in a positive sample x of the current station caption is larger than or equal to half of the sum of the average value of the pixel values of all the pixel points on the abscissa of the pixel point and the average value of the pixel values of all the pixel points on the ordinate of the pixel point, setting the current pixel value x (i, j) as 1, otherwise, setting the current pixel value x (i, j) as 0;
a02, converting the two-dimensional image of size w × q, obtained by normalizing the positive sample of the station caption and the standard template, into a one-dimensional array to obtain the feature vector $p_k$ of the current positive sample, the feature vector $p_k$ having m rows and one column; wherein w is the length of the two-dimensional image, q is the width of the two-dimensional image, and m = w × q;
a03, obtaining the next positive sample of the current station caption, repeating the steps A01 and A02 until the feature vectors of all the positive samples of the current station caption are completed.
6. The station caption identifying method of claim 5, wherein in the step A03, the feature vector of the current station caption is obtained by calculating according to the following formula:
$p_f(i) = 0$ when $\sum_{j=1}^{n} P(i,j) < \sum_{j=1}^{n}\left(1 - P(i,j)\right)$

$p_f(i) = 1$ when $\sum_{j=1}^{n} P(i,j) \ge \sum_{j=1}^{n}\left(1 - P(i,j)\right)$
where j denotes the jth sample, i the ith feature value, and P(i, j) the ith feature value of the jth positive sample; $p_f$ is the feature vector of the current station caption, an m × 1 column vector, and $p_f(i)$ is the ith feature value of $p_f$.
7. The station caption identifying method of claim 2, wherein in step a0, the filter of the current station caption has a relational filter function of:
$h = D^{-1}X\left(X^{+}D^{-1}X\right)^{-1}u$
where D is the diagonal matrix of the average power spectrum of the positive samples (the standard MACE definition), $^{+}$ denotes the conjugate transpose of a complex vector, $X_k$ is the Fourier transform of $x_k$, $x_k$ denotes the kth positive sample, and $X_k^{*}$ is the conjugate transpose of $X_k$;
furthermore, the relation filter function must also satisfy the following constraint at the origin, namely:
$X^{+}h = du$
where X is a matrix of dimension d × N, u is the selected specific output limit, and N is the number of relation filter functions.
8. The station caption identification method according to claim 3, wherein a calculation formula of the matching degree of the feature vector of the image of the region to be detected and the feature vector of each station caption is:
wherein p(i) is the ith feature value of the image of the region to be detected, $p_f(i)$ is the ith feature value of the current station caption, $d = d_1 \times d_2$ is the size of a positive sample, $d_1$ denotes the length of the positive sample, and $d_2$ its width.
9. The station caption identifying method according to claim 1, wherein in the step B, the calculation formula of the similarity is:
wherein peak is the peak value in the two-dimensional correlation plane, mean and sigma are the average value and standard deviation in the effective area around the peak value respectively;
the two-dimensional correlation plane is obtained by the following formula:
wherein h (x, y) is a two-dimensional filter, and l (x, y) is the current traversal image.
10. A station caption identification system, comprising:
the image extraction module is used for extracting an image of the area to be detected;
the storage module is used for storing the characteristic vectors and the relational filter functions of the station captions;
the processing module is used for acquiring the feature vector of the image to be detected in the image of the region to be detected, calculating the matching degree between the feature vector of the image to be detected and each pre-stored station caption feature vector, selecting the matching degrees greater than or equal to a preset value, and acquiring the station caption information corresponding to the selected matching degrees; and for searching the corresponding relation filter function according to the obtained station caption information and calculating the similarity between the image to be detected and the obtained station caption according to the relation filter function;
and the identification module is used for acquiring the maximum similarity, searching the station caption information corresponding to the maximum similarity and outputting an identification result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310200339.1A CN103530655B (en) | 2013-05-24 | 2013-05-24 | A kind of TV station symbol recognition method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310200339.1A CN103530655B (en) | 2013-05-24 | 2013-05-24 | A kind of TV station symbol recognition method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103530655A true CN103530655A (en) | 2014-01-22 |
CN103530655B CN103530655B (en) | 2017-09-15 |
Family
ID=49932651
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310200339.1A Expired - Fee Related CN103530655B (en) | 2013-05-24 | 2013-05-24 | A kind of TV station symbol recognition method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103530655B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104023249A (en) * | 2014-06-12 | 2014-09-03 | 腾讯科技(深圳)有限公司 | Method and device of identifying television channel |
CN104537376A (en) * | 2014-11-25 | 2015-04-22 | 深圳创维数字技术有限公司 | A method, a relevant device, and a system for identifying a station caption |
CN104768029A (en) * | 2015-03-20 | 2015-07-08 | 深圳市同洲电子股份有限公司 | Station caption detection method and digital television terminal |
WO2017120794A1 (en) * | 2016-01-13 | 2017-07-20 | 北京大学深圳研究生院 | Image matching method and apparatus |
CN108226907A (en) * | 2017-12-11 | 2018-06-29 | 武汉万集信息技术有限公司 | For the ranging calibration method and device of Laser Distance Measuring Equipment |
CN109086764A (en) * | 2018-07-25 | 2018-12-25 | 北京达佳互联信息技术有限公司 | Station caption detection method, device and storage medium |
CN109733079A (en) * | 2018-12-28 | 2019-05-10 | 武汉朋谊科技有限公司 | Packing box antiforging printing method |
CN110992300A (en) * | 2018-09-29 | 2020-04-10 | 北京国双科技有限公司 | Image detection method and device |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8260058B2 (en) * | 2007-03-13 | 2012-09-04 | Nikon Corporation | Template matching device, camera with template matching device, and program for allowing computer to carry out template matching |
CN101950366A (en) * | 2010-09-10 | 2011-01-19 | 北京大学 | Method for detecting and identifying station logo |
CN103020650A (en) * | 2012-11-23 | 2013-04-03 | Tcl集团股份有限公司 | Station caption identifying method and device |
Non-Patent Citations (1)
Title |
---|
徐军等: "数字图像处理技术在图像识别上的应用", 《有线电视技术》 * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104023249A (en) * | 2014-06-12 | 2014-09-03 | 腾讯科技(深圳)有限公司 | Method and device of identifying television channel |
US10405052B2 (en) | 2014-06-12 | 2019-09-03 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for identifying television channel information |
US9980009B2 (en) | 2014-06-12 | 2018-05-22 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for identifying television channel information |
CN104023249B (en) * | 2014-06-12 | 2015-10-21 | 腾讯科技(深圳)有限公司 | Television channel recognition methods and device |
WO2015188670A1 (en) * | 2014-06-12 | 2015-12-17 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for identifying television channel information |
CN104537376B (en) * | 2014-11-25 | 2018-04-27 | 深圳创维数字技术有限公司 | One kind identification platform calibration method and relevant device, system |
CN104537376A (en) * | 2014-11-25 | 2015-04-22 | 深圳创维数字技术有限公司 | A method, a relevant device, and a system for identifying a station caption |
CN104768029A (en) * | 2015-03-20 | 2015-07-08 | 深圳市同洲电子股份有限公司 | Station caption detection method and digital television terminal |
WO2017120794A1 (en) * | 2016-01-13 | 2017-07-20 | 北京大学深圳研究生院 | Image matching method and apparatus |
CN108226907A (en) * | 2017-12-11 | 2018-06-29 | 武汉万集信息技术有限公司 | For the ranging calibration method and device of Laser Distance Measuring Equipment |
CN108226907B (en) * | 2017-12-11 | 2020-05-19 | 武汉万集信息技术有限公司 | Ranging calibration method and device for laser ranging equipment |
CN109086764A (en) * | 2018-07-25 | 2018-12-25 | 北京达佳互联信息技术有限公司 | Station caption detection method, device and storage medium |
CN110992300A (en) * | 2018-09-29 | 2020-04-10 | 北京国双科技有限公司 | Image detection method and device |
CN110992300B (en) * | 2018-09-29 | 2023-04-07 | 北京国双科技有限公司 | Image detection method and device |
CN109733079A (en) * | 2018-12-28 | 2019-05-10 | 武汉朋谊科技有限公司 | Packing box antiforging printing method |
Also Published As
Publication number | Publication date |
---|---|
CN103530655B (en) | 2017-09-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170915 |