CN101833664A - Video image character detecting method based on sparse expression - Google Patents


Info

Publication number
CN101833664A
Authority
CN
China
Prior art keywords
image
character
edge
video
gray level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 201010151779
Other languages
Chinese (zh)
Inventor
王春恒 (Wang Chunheng)
李心洁 (Li Xinjie)
程刚 (Cheng Gang)
张荣国 (Zhang Rongguo)
张阳 (Zhang Yang)
肖柏华 (Xiao Baihua)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN 201010151779 priority Critical patent/CN101833664A/en
Publication of CN101833664A publication Critical patent/CN101833664A/en
Pending legal-status Critical Current

Abstract

The invention provides a video image character detection method based on sparse representation, comprising the following steps. S1: resample the video sequence to obtain color video images, convert them to gray-level images, and apply a multi-scale transform to obtain multi-scale gray-level images. S2: apply an improved Sobel operator and a morphological closing operation to the multi-scale gray-level images to obtain edge images, filter the edge images by edge density, and obtain candidate character regions through connected-component analysis and rule-based analysis. S3: compute vertical and horizontal projections of the candidate character regions and cut the projection profiles to obtain candidate text lines; cut each candidate line into small regions with a sliding window, extract edge features from each small region, classify each small region with a sparse-representation-based classifier to decide whether it is a character region, and judge each candidate line from the per-region results to obtain and output the final text-line regions.

Description

Video image character detecting method based on sparse expression
Technical field
The invention belongs to the field of image understanding and image retrieval, and specifically relates to a fast and accurate video text detection method and its system implementation.
Background technology
With the rapid development of multimedia technology and the Internet, the amount of multimedia information is growing explosively. Besides images and text, more and more databases also contain video. Video is the most common form of multimedia information and can be obtained in many ways, such as television and the Internet. This has drawn the interest of researchers worldwide to the problem of retrieving information of interest from large volumes of video. In video understanding and retrieval, text provides rich semantic information and is a key component: for example, captions in news programs, scores in sports broadcasts, and product names and manufacturers in advertisements. Many current video databases are indexed and retrieved through text annotations produced by manually annotating each picture. Manual text annotation is slow and tedious, so effective computer algorithms are needed to annotate video images automatically. With such algorithms, indexing and retrieval can be performed on features extracted directly from video images.
Text in video images falls into two classes: caption text and scene text. Caption text is added to the video image in post-production to help the viewer understand the content; such text therefore has high contrast and even illumination. Because caption text is added through deliberate post-production, it usually carries important information about the video content. Scene text appears as part of the video scene and is captured together with it; most scene text appears incidentally, along with the objects in the scene, for example text on road signs, shop names, text on characters' clothing, and text on billboards.
Existing methods for detecting text in video images fall roughly into four classes: edge-based methods, which exploit the strong contrast and rich edge information that text usually has against its background; connected-component-based methods, which exploit the fact that characters appear in rows and columns; corner-based methods, which exploit the abundance of corner points in text regions relative to the background; and texture-based methods, which slide a fixed-size window over the image and extract the mean, second-order central moment, and third-order central moment of each window as features.
Because characters in video vary in size, font, and color, traditional methods suffer from low efficiency, high computational complexity, and limited accuracy. The method presented here first performs a fast coarse detection on the video image using edge density, and then verifies the resulting candidate character regions with a sparse-representation-based classification method. Experimental results show that this method overcomes the shortcomings of the classic methods.
Summary of the invention
In view of the deficiencies of the prior art, the present invention aims to locate text regions under complex backgrounds effectively, accurately, and quickly with a coarse-to-fine detection method. To this end, a video image character detection method based on sparse representation is proposed.
To achieve the above goal, the technical scheme of the sparse-representation-based video image character detection method of the present invention is as follows. The method comprises the steps of video sequence preprocessing, coarse detection of video image character regions, and fine detection of video image characters. The concrete steps are:
Step S1, video sequence preprocessing: resample the video sequence to obtain color video images; convert the color video images to gray-level images; apply a multi-scale transform to the gray-level images to obtain multi-scale gray-level images;
Step S2, coarse detection of video image character regions: first apply the improved Sobel operator and a morphological closing operation to the multi-scale gray-level images to obtain edge images; then filter the edge images by edge density; finally obtain candidate character regions through connected-component analysis and rule-based analysis;
Step S3, fine detection of video image characters: first compute vertical and horizontal projections of the candidate character regions obtained by coarse detection and cut the projection profiles to obtain candidate text lines; then cut each candidate line into small regions with a sliding window and extract edge features from each small region; then classify each small region with a sparse-representation (Sparse Representation) based classification method to decide whether it is a character region; finally judge each candidate line from the per-region results to obtain and output the final text-line regions.
The effects of the invention are as follows. Compared with existing methods, the present invention can locate character regions quickly, with higher recall and precision, and can be applied in video classification and retrieval systems. The system adopts a coarse-to-fine multi-scale text detection framework: in the coarse stage, a fast edge-density filtering method removes most non-text regions; in the fine stage, a sparse-representation-based classification method distinguishes text regions from non-text regions, achieving higher precision; and the multi-scale processing detects characters of different sizes. The method therefore improves the accuracy of text region detection while remaining unaffected by font size, illumination, and similar factors, without sacrificing speed or recall.
Description of drawings
Fig. 1 is the framework diagram of the detection algorithm of the present invention.
Embodiment
The present invention is described in further detail below in conjunction with the drawings and a specific embodiment.
As shown in Fig. 1, the coarse-to-fine video image character detection method based on sparse representation of the present invention comprises the following steps:
1. Video sequence preprocessing.
(1) Video sequence resampling:
Statistically, text in a video image persists for at least tens of consecutive frames. Since two adjacent frames differ very little, running the same algorithm on both produces nearly identical results, so processing every frame independently is computationally inefficient. Therefore, while preserving the accuracy and performance of text detection and extraction, we resample the video sequence, taking 1 frame out of every 10. This improves the efficiency of the system severalfold without affecting the accuracy of the sampling.
(2) Converting the color image to a gray-level image:
The input color image is first converted to a gray-level image by formula (1):
f_g(x, y) = 0.3 R(x, y) + 0.59 G(x, y) + 0.11 B(x, y)    (1)
In formula (1), R(x, y), G(x, y), B(x, y) are the R, G, B components of the input color image, (x, y) are the pixel coordinates, and f_g(x, y) is the converted gray-level image.
Because the characters in a video image are not uniform in size, the gray-level image is transformed to multiple scales to detect characters of different sizes: the original image is decomposed into images of different resolutions, text detection is run at each resolution level, the results are mapped back into the original image, and the characters detected at different scales are merged. Small characters are detected on the higher-resolution sub-images, while larger characters are detected on the lower-resolution sub-images. Finally the results are integrated.
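As an illustrative sketch (not part of the original disclosure), formula (1) and the multi-scale decomposition can be written in Python/NumPy as follows; nearest-neighbour resampling and the scale set (0.3, 0.5, 0.7, 1.0) taken from the embodiment below are assumptions, since the patent does not fix the scaling method:

```python
import numpy as np

def to_gray(rgb):
    """Formula (1): f_g = 0.3*R + 0.59*G + 0.11*B."""
    return 0.3 * rgb[..., 0] + 0.59 * rgb[..., 1] + 0.11 * rgb[..., 2]

def pyramid(gray, scales=(0.3, 0.5, 0.7, 1.0)):
    """Multi-scale set built by nearest-neighbour resampling (an assumed
    stand-in for the patent's unspecified multi-scale transform)."""
    h, w = gray.shape
    out = {}
    for s in scales:
        ys = (np.arange(max(1, int(h * s))) / s).astype(int)
        xs = (np.arange(max(1, int(w * s))) / s).astype(int)
        out[s] = gray[np.ix_(ys, xs)]
    return out
```

Detection is then run on each entry of the pyramid, and boxes found at scale s are mapped back to the original image by dividing their coordinates by s.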
2. Coarse detection of video image character regions.
(1) Video image edge detection:
The multi-scale gray-level images obtained in step 1 above are processed with the improved Sobel operator edge detection of formula (2). Concretely, the image edges are computed with the operators of the four directions in Table 1 and formula (2). Table 1 is as follows:
[Table 1, giving the Sobel operators for the four directions, is provided only as the image Figure GSA00000087357100041 in the original publication.]
E(x, y) = max(|S_H|, |S_V|, |S_LD|, |S_RD|) + k × |S_⊥MAX|    (2)
In formula (2), max selects the maximum value; S_H, S_V, S_LD, S_RD are the Sobel edge strengths in the horizontal, vertical, left-diagonal, and right-diagonal directions respectively; S_⊥MAX is the gradient in the direction perpendicular to the direction of the greatest gradient; k ∈ (0, 1) is a fixed coefficient, here k = 0.5; and E(x, y) is the edge strength at coordinates (x, y). Because the computed E(x, y) may exceed 255, the values of E(x, y) are linearly rescaled into [0, 255] after the computation.
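A sketch of the improved Sobel operator of formula (2), under stated assumptions: the patent's Table 1 survives only as an image, so the four directional kernels below are the standard Sobel masks and their 45-degree rotations, and S_⊥MAX is approximated as the response of the direction perpendicular to the strongest one (H↔V, LD↔RD):

```python
import numpy as np

# Assumed directional kernels (the patent's Table 1 is not available as text).
KERNELS = {
    "H":  np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], float),
    "V":  np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float),
    "LD": np.array([[-2, -1, 0], [-1, 0, 1], [0, 1, 2]], float),
    "RD": np.array([[0, -1, -2], [1, 0, -1], [2, 1, 0]], float),
}
PERP = {"H": "V", "V": "H", "LD": "RD", "RD": "LD"}
ORDER = ("H", "V", "LD", "RD")

def filter3(img, k):
    """Zero-padded 3x3 correlation."""
    p = np.pad(img.astype(float), 1)
    out = np.zeros(img.shape)
    for dy in range(3):
        for dx in range(3):
            out += k[dy, dx] * p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def improved_sobel(gray, k=0.5):
    """Formula (2): max of the four |S| responses plus k times the response of
    the perpendicular direction, then linearly rescaled into [0, 255]."""
    resp = {d: np.abs(filter3(gray, kern)) for d, kern in KERNELS.items()}
    stack = np.stack([resp[d] for d in ORDER])
    best = stack.argmax(axis=0)
    perp = np.choose(best, [resp[PERP[d]] for d in ORDER])
    e = stack.max(axis=0) + k * perp
    return e * (255.0 / e.max()) if e.max() > 0 else e
```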
(2) Morphological closing:
Because the edge-detected image contains noise, some character strokes are broken, leaving many small gaps and isolated points that would hinder the subsequent connected-component analysis. The isolated points must therefore be removed and the small gaps bridged. A morphological closing operation is applied to the edge image obtained by edge detection, which effectively removes the small gaps in the image.
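A minimal sketch of the closing step, assuming a 3×3 structuring element and a binarization threshold of 128 (neither is specified in the patent):

```python
import numpy as np

def _dilate3(b):
    """Dilation with a 3x3 structuring element."""
    p = np.pad(b, 1, constant_values=False)
    out = np.zeros_like(b)
    for dy in range(3):
        for dx in range(3):
            out |= p[dy:dy + b.shape[0], dx:dx + b.shape[1]]
    return out

def _erode3(b):
    """Erosion with a 3x3 structuring element (border treated as background)."""
    p = np.pad(b, 1, constant_values=False)
    out = np.ones_like(b)
    for dy in range(3):
        for dx in range(3):
            out &= p[dy:dy + b.shape[0], dx:dx + b.shape[1]]
    return out

def close_edges(edge, thresh=128):
    """Morphological close (dilate, then erode) of the binarized edge map;
    it bridges one-pixel stroke gaps before connected-component analysis."""
    return _erode3(_dilate3(edge >= thresh))
```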
(3) Edge density filtering:
Edge density filtering sets the edge strength of a pixel to zero when the edge density in the fixed M × N window centered on that pixel falls below a threshold, and leaves it unchanged when the density exceeds the threshold. Because of the stroke structure of characters, the edge density of character regions is high relative to that of background regions: within a fixed M × N window centered on a pixel, the number of edge pixels in a window inside a text region exceeds the number in a window inside a background (non-text) region.
The edge image obtained in step (2) is filtered by edge density. A new image FE of the same size as the original is created, with all of its pixels set to zero. Formula (3) computes, for each pixel (i, j) of the edge image, the edge sum EW(i, j) in the M × N window centered at (i, j). If EW(i, j) exceeds an empirical threshold T, T ∈ (0, S_MN), where S_MN is the area of the M × N window, the pixels in the window are copied to the corresponding positions of FE. The result is the edge density image FE.
EW(i, j) = Σ_{x=i−M/2}^{i+M/2} Σ_{y=j−N/2}^{j+N/2} E(x, y)    (3)
where E(x, y) is the edge strength at coordinates (x, y), the window size is M × N, and EW(i, j) is the edge density value in the fixed M × N window centered at (i, j).
To accelerate the computation of formula (3), we use formula (6). First, IE(x, y) is obtained by iterating formulas (4) and (5); IE(x, y) is the sum of all pixel values above and to the left of (x, y), i.e. IE(x, y) = Σ_{i≤x} Σ_{j≤y} E(i, j), where E(i, j) is the edge strength at coordinates (i, j). The edge density EW(i, j) is then computed by formula (6).
s(x, y) = s(x, y−1) + E(x, y)    (4)
IE(x, y) = IE(x−1, y) + s(x, y)    (5)
In formula (4), E(x, y) is the edge strength at coordinates (x, y), and s(x, y) is the accumulated edge strength of the points (x, 0), (x, 1), …, (x, y−1), (x, y). In formula (5), IE(x, y) is the sum of s(0, y), s(1, y), …, s(x−1, y), s(x, y). With the initial values s(x, −1) = 0 and IE(−1, y) = 0, iterating formulas (4) and (5) computes all values IE(x, y) in a single pass over the image, after which the edge density at any point can be computed quickly by formula (6),
EW(i, j) = IE(i+M/2, j+N/2) + IE(i−M/2, j−N/2) − (IE(i+M/2, j−N/2) + IE(i−M/2, j+N/2))    (6)
where EW(i, j) is the edge density value of the fixed M × N window centered at (i, j), and IE(x, y) is obtained by iterating formulas (4) and (5).
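The integral-image computation of formulas (4)-(6) can be sketched as follows. Note two stated deviations: this simplified variant keeps or zeroes individual pixels rather than copying whole windows into FE as the patent describes, and the default threshold T is an assumption (the patent constrains T only to the interval (0, S_MN)):

```python
import numpy as np

def window_sums(E, M, N):
    """EW(i, j): sum of E over the M x N window centred at (i, j), clipped at
    the image border, using the integral image of formulas (4)-(6)."""
    h, w = E.shape
    # One cumulative pass; the leading zero row/column stands in for IE(-1, .).
    IE = np.pad(E.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))
    i, j = np.arange(h), np.arange(w)
    y0 = np.clip(i - M // 2, 0, h); y1 = np.clip(i + M // 2 + 1, 0, h)
    x0 = np.clip(j - N // 2, 0, w); x1 = np.clip(j + N // 2 + 1, 0, w)
    return (IE[np.ix_(y1, x1)] - IE[np.ix_(y0, x1)]
            - IE[np.ix_(y1, x0)] + IE[np.ix_(y0, x0)])

def edge_density_filter(E, M=29, N=19, T=None):
    """Zero out pixels whose window edge sum does not exceed T."""
    if T is None:
        T = 0.05 * 255 * M * N  # assumed default; the patent gives no value
    return np.where(window_sums(E, M, N) > T, E, 0)
```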
(4) Connected-component analysis:
8-neighborhood connected-component labeling is applied to the image obtained in step (3), marking every region of connected pixels, i.e. every connected component.
(5) Rule-based analysis:
The connected-component analysis of step (4) yields many connected components. Geometric properties of each component, namely its size, area, aspect ratio, and edge-pixel ratio, are used to judge whether it is a character region or a non-text region, and non-text regions are discarded.
The remaining connected components are merged according to the area of their intersections, until no more components can be merged. The position and size of each text component are analyzed, and text components in the same row or same column are combined into candidate character regions.
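Steps (4) and (5) can be sketched together: flood-fill 8-neighbourhood labelling followed by simple geometric rules. The area and aspect-ratio thresholds below are illustrative; the patent names the properties but not their values:

```python
import numpy as np
from collections import deque

def label8(binary):
    """8-neighbourhood connected-component labelling by flood fill."""
    h, w = binary.shape
    labels = np.zeros((h, w), int)
    n = 0
    for si in range(h):
        for sj in range(w):
            if binary[si, sj] and labels[si, sj] == 0:
                n += 1
                labels[si, sj] = n
                q = deque([(si, sj)])
                while q:
                    i, j = q.popleft()
                    for di in (-1, 0, 1):
                        for dj in (-1, 0, 1):
                            a, b = i + di, j + dj
                            if 0 <= a < h and 0 <= b < w and binary[a, b] and labels[a, b] == 0:
                                labels[a, b] = n
                                q.append((a, b))
    return labels, n

def component_boxes(binary, min_area=12, max_aspect=20.0):
    """Bounding boxes (y, x, h, w) of components that pass the geometric rules."""
    labels, n = label8(binary)
    boxes = []
    for k in range(1, n + 1):
        ys, xs = np.nonzero(labels == k)
        hgt = int(ys.max() - ys.min() + 1)
        wid = int(xs.max() - xs.min() + 1)
        if hgt * wid >= min_area and max(hgt, wid) <= max_aspect * min(hgt, wid):
            boxes.append((int(ys.min()), int(xs.min()), hgt, wid))
    return boxes
```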
3. The candidate character regions obtained at the different scales are merged according to their geometric relationships: if the intersection of two candidate character regions exceeds a certain ratio, the two character regions are merged into one.
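A sketch of the cross-scale merge, assuming boxes given as (y, x, h, w) and an intersection threshold of 0.5 of the smaller box's area (the patent says only "greater than a certain ratio"):

```python
def merge_boxes(boxes, ratio=0.5):
    """Greedily merge boxes whose intersection exceeds `ratio` of the smaller
    box's area; repeat until no merge applies."""
    boxes = list(boxes)
    changed = True
    while changed:
        changed = False
        for a in range(len(boxes)):
            for b in range(a + 1, len(boxes)):
                ya, xa, ha, wa = boxes[a]
                yb, xb, hb, wb = boxes[b]
                ih = min(ya + ha, yb + hb) - max(ya, yb)
                iw = min(xa + wa, xb + wb) - max(xa, xb)
                inter = max(0, ih) * max(0, iw)
                if inter > ratio * min(ha * wa, hb * wb):
                    y0, x0 = min(ya, yb), min(xa, xb)
                    y1 = max(ya + ha, yb + hb)
                    x1 = max(xa + wa, xb + wb)
                    boxes[a] = (y0, x0, y1 - y0, x1 - x0)  # union box
                    del boxes[b]
                    changed = True
                    break
            if changed:
                break
    return boxes
```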
4. Fine detection of character regions:
(1) The candidate character regions obtained in step 3 are cut: vertical and horizontal projections of each candidate character region are computed, and the region is cut according to the projection profiles, thereby locating the candidate text lines in the picture.
(2) Verification of the candidate text lines located in step (1): correctly judged text lines are kept, and wrongly judged text lines are filtered out. Each text line is cut into small regions by a sliding window, and features are extracted from each small region. The small regions are classified with a sparse-representation (Sparse Representation) based classification method, which consists of a training process and a judging process:
The training process is performed in advance. A large number of positive and negative samples of character regions are chosen and trained with the K-SVD algorithm (k-means clustering combined with singular value decomposition), yielding a positive dictionary D_P and a negative dictionary D_N; the dictionary is D = {D_P, D_N};
In the judging process, the output is denoted Z(w). Each character region w detected in step (1) is judged by its reconstruction errors under the positive and negative dictionaries: if the reconstruction error under the positive dictionary is smaller, the region is judged a correct character region and the output is +1; otherwise, if the reconstruction error under the positive dictionary is larger than that under the negative dictionary, the region is judged a wrongly detected character region and the output is −1. Formula (7) is then used to judge each text line, filtering out wrongly judged lines and keeping correct ones. In formula (7), R denotes a text line, w denotes an N × N window, Z(w) denotes the judgment of the sparse-representation-based classification method on the image region in window w, d_w denotes the distance from the center of window w to the center of text line R, σ_0 ∈ (0, +∞) is a variable, and C(R) is the classification result of text line R: if C(R) is greater than zero, R is a correct text line; otherwise R is a wrongly judged text line and is filtered out.
In the present embodiment, the detailed process is as follows.
Training process: The text lines from step (1) are normalized to sample height H, and Canny edge detection is applied. A sliding window of size N × N with step k cuts each text line, and each N × N image block is converted into a vector y ∈ R^(N²). The positive dictionary D_P is trained with the K-SVD algorithm; background regions are chosen as negative samples to train the negative dictionary D_N. The positive and negative dictionaries are merged into D = {D_P, D_N}.
Judging process: Samples are normalized to height H as in training, and a sliding window w of size N × N with step k cuts each text line; each N × N image block is converted into a vector y ∈ R^(N²). The sparse coefficients x = {x_P, x_N} are obtained by the Matching Pursuit algorithm, and the errors E_P = ||y − D_P x_P||₂ and E_N = ||y − D_N x_N||₂ are computed. If E_P > E_N, the test sample y is a negative sample, the region in the corresponding window is a wrongly detected character region, and the output is −1; if E_P ≤ E_N, y is a positive sample, the region in the corresponding window is a correct character region, and the output is +1. The outputs form the sequence Z(w). For a text line R, windows closer to the line center contribute more to the judgment that R is text, so formula (7) is used to judge the line, filtering out wrongly judged lines and keeping correct ones. In the formula, R denotes a text line, w denotes an N × N window, Z(w) denotes the sparse-representation classifier's judgment on the image region in window w, d_w is the distance from the center of window w to the center of text line R, σ_0 ∈ (0, +∞) is a variable, and C(R) is the classification result of candidate line R: if C(R) is greater than zero, line R is judged a correct text line and is kept and output; otherwise, if C(R) is less than zero, line R is a wrongly judged text line and is filtered out;
C(R) = Σ_{w⊆R} Z(w) · (1 / (√(2π) σ_0)) exp(−d_w² / (2σ_0²))    (7)
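A sketch of the judging process and the line vote of formula (7), assuming the dictionaries D_P and D_N are already trained (the patent trains them with K-SVD, which is not reproduced here) and σ_0 = 8 as an arbitrary setting:

```python
import numpy as np

def matching_pursuit(y, D, n_iter=5):
    """Greedy matching pursuit: approximate y ~ D @ x, picking at each step the
    unit-norm column of D most correlated with the residual."""
    r = y.astype(float).copy()
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        corr = D.T @ r
        k = np.abs(corr).argmax()
        x[k] += corr[k]
        r -= corr[k] * D[:, k]
    return x

def classify(y, Dp, Dn, n_iter=5):
    """+1 (text) if the positive dictionary reconstructs y with error no larger
    than the negative one, else -1, matching E_P <= E_N in the judging process."""
    ep = np.linalg.norm(y - Dp @ matching_pursuit(y, Dp, n_iter))
    en = np.linalg.norm(y - Dn @ matching_pursuit(y, Dn, n_iter))
    return 1 if ep <= en else -1

def line_score(z, d, sigma=8.0):
    """Formula (7): Gaussian-weighted vote of window labels z at distances d
    from the line centre; the line is kept when C(R) > 0."""
    z, d = np.asarray(z, float), np.asarray(d, float)
    w = np.exp(-d ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
    return float((z * w).sum())
```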
Below, the method is implemented in C++ on a personal computer under Windows XP, using object-oriented methods and software engineering standards. We tested on a segment of Chinese news video with resolution 480 × 360. The video sequence is resampled, taking 1 frame out of every 10; the resulting video images are converted to gray-level images by formula (1); the gray-level images are then scaled by factors of 0.3, 0.5, 0.7, and 1 by the multi-scale transform, giving the multi-scale gray-level images. Sobel operator edge detection is applied to the multi-scale gray-level images by formula (2) to obtain edge images, and the edge values are normalized to [0, 255]. A morphological closing operation is then applied to the edge images, followed by the fast edge density filter with window size 29 × 19. Connected-component analysis of the density-filtered images yields the connected components, which are filtered and merged using the geometric rules to obtain candidate character regions. The text blocks obtained at different scales are merged according to their geometric relationships, and fine detection is then applied to the candidate character regions: vertical and horizontal projections cut them into candidate text lines. The candidate lines are normalized to a height of 16 pixels; a sliding window of size 16 × 16 with step 8 is chosen; Canny edge detection is applied to the image in each window to obtain the edge strengths, giving a 256-dimensional feature vector. Using the trained dictionaries D_P and D_N, the Matching Pursuit (MP) algorithm computes the positive-dictionary coefficients x_P and negative-dictionary coefficients x_N, and the errors E_P = ||y − D_P x_P||₂ and E_N = ||y − D_N x_N||₂ are computed. If E_P > E_N, the test sample y is a negative sample, the region in the corresponding window is a wrongly detected character region, and the output is −1; if E_P ≤ E_N, y is a positive sample, the region in the corresponding window is a correct character region, and the output is +1. The outputs form the sequence Z(w), and the text lines are judged by formula (7): if C(R) is greater than zero, text line R is judged a correct text line; otherwise it is a wrongly judged line and is filtered out. Finally, the correctly judged text-line regions are output.
Experimental result
Table 2: Text detection experimental results based on sparse representation. [The table itself is provided only as the image Figure GSA00000087357100081 in the original publication.]
In summary, the present invention fully balances the performance and speed of video image character detection; it locates text regions quickly and accurately, is unaffected by font size and language, and has strong generality. It can provide useful support for the classification, retrieval, and similar processing of video images.

Claims (5)

1. A video image character detection method based on sparse representation, characterized in that the method comprises the steps of video sequence preprocessing, coarse detection of video image character regions, and fine detection of video image characters, the concrete steps being:
Step S1, video sequence preprocessing: resample the video sequence to obtain color video images; convert the color video images to gray-level images; apply a multi-scale transform to the gray-level images to obtain multi-scale gray-level images;
Step S2, coarse detection of video image character regions: first apply the improved Sobel operator and a morphological closing operation to the multi-scale gray-level images to obtain edge images; then filter the edge images by edge density; finally obtain candidate character regions through connected-component analysis and rule-based analysis;
Step S3, fine detection of video image characters: first compute vertical and horizontal projections of the candidate character regions obtained by coarse detection and cut the projection profiles to obtain candidate text lines; then cut each candidate line into small regions with a sliding window and extract edge features from each small region; then classify each small region with a sparse-representation-based classification method to decide whether it is a character region; finally judge each candidate line from the per-region results to obtain and output the final text-line regions.
2. The video image character detection method of claim 1, characterized in that the edge detection uses the improved Sobel algorithm E(x, y) = max(|S_H|, |S_V|, |S_LD|, |S_RD|) + k × |S_⊥MAX|, where E(x, y) is the edge strength at gray-level image coordinates (x, y); the Sobel edge strengths in the four directions of the gray-level image are the horizontal S_H, vertical S_V, left-diagonal S_LD, and right-diagonal S_RD; max selects the maximum of the Sobel edge strengths; S_⊥MAX is the gradient in the direction perpendicular to the direction of the greatest gradient of the gray-level image; and k ∈ (0, 1).
3. The video image character detection method of claim 1, characterized in that the edge density is the sum of edge values within a fixed window of size M × N centered on a pixel of the edge image, computed according to the formulas:
s(x, y) = s(x, y−1) + E(x, y),
IE(x, y) = IE(x−1, y) + s(x, y),
where s(x, y) is the accumulated edge strength of the points (x, 0), (x, 1), …, (x, y−1), (x, y); E(x, y) is the edge strength at coordinates (x, y); and IE(x, y) is the sum of s(0, y), s(1, y), …, s(x−1, y), s(x, y). Iterating the formulas for s(x, y) and IE(x, y) with initial values s(x, −1) = 0 and IE(−1, y) = 0 computes all values IE(x, y) in a single pass, after which the formula
EW(i, j) = IE(i+M/2, j+N/2) + IE(i−M/2, j−N/2) − (IE(i+M/2, j−N/2) + IE(i−M/2, j+N/2))
computes the edge density value at any point of the edge image.
4. The video image character detection method of claim 1, characterized in that each small region is classified with the sparse-representation-based classification method, which comprises a training step and a judging step, as follows:
Training step: positive and negative samples of small regions are chosen in advance and trained to obtain a positive dictionary and a negative dictionary;
Judging step: a small region is judged by its reconstruction errors under the positive and negative dictionaries: if the reconstruction error under the positive dictionary is smaller than that under the negative dictionary, the region is judged a correct character region; otherwise, if the reconstruction error under the positive dictionary is larger than that under the negative dictionary, the region is judged a wrongly detected character region.
5. The video image character detection method of claim 1, characterized in that the candidate character lines are judged using the formula
C(R) = Σ_{w⊆R} Z(w) · (1 / (√(2π) σ_0)) exp(−d_w² / (2σ_0²)),
filtering out wrongly judged text lines and keeping correct ones, where R denotes a text line, w denotes an N × N window, Z(w) denotes the judgment of the sparse-representation-based classification method on the image region in window w, d_w is the distance from the center of window w to the center of text line R, σ_0 ∈ (0, +∞) is a variable, and C(R) is the classification result of candidate line R: if C(R) is greater than zero, line R is a correct text line and is kept and its region output; otherwise, if C(R) is less than zero, line R is a wrongly judged text line and is filtered out.
CN 201010151779 2010-04-21 2010-04-21 Video image character detecting method based on sparse expression Pending CN101833664A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010151779 CN101833664A (en) 2010-04-21 2010-04-21 Video image character detecting method based on sparse expression


Publications (1)

Publication Number Publication Date
CN101833664A true CN101833664A (en) 2010-09-15

Family

ID=42717727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010151779 Pending CN101833664A (en) 2010-04-21 2010-04-21 Video image character detecting method based on sparse expression

Country Status (1)

Country Link
CN (1) CN101833664A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101448100A (en) * 2008-12-26 2009-06-03 西安交通大学 Method for extracting video captions quickly and accurately
CN101453575A (en) * 2007-12-05 2009-06-10 中国科学院计算技术研究所 Video subtitle information extracting method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhao Ming et al., "Sparse Representation Classification for Image Text Detection," 2009 Second International Symposium on Computational Intelligence and Design, Changsha, China, 12-14 December 2009, pp. 76-79. *

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102208023B (en) * 2011-01-23 2013-05-08 浙江大学 Method for recognizing and designing video captions based on edge information and distribution entropy
CN102208023A (en) * 2011-01-23 2011-10-05 浙江大学 Method for recognizing and designing video captions based on edge information and distribution entropy
CN102306280A (en) * 2011-07-12 2012-01-04 央视国际网络有限公司 Method and device for detecting video scores
CN102306279A (en) * 2011-07-12 2012-01-04 央视国际网络有限公司 Method for identifying video scores and device
CN102306280B (en) * 2011-07-12 2014-04-02 央视国际网络有限公司 Method and device for detecting video scores
CN103020618B (en) * 2011-12-19 2016-03-16 北京捷成世纪数码科技有限公司 The detection method of video image character and system
CN103020618A (en) * 2011-12-19 2013-04-03 北京捷成世纪科技股份有限公司 Detection method and detection system for video image text
CN102750540A (en) * 2012-06-12 2012-10-24 大连理工大学 Morphological filtering enhancement-based maximally stable extremal region (MSER) video text detection method
CN102750540B (en) * 2012-06-12 2015-03-11 大连理工大学 Morphological filtering enhancement-based maximally stable extremal region (MSER) video text detection method
CN102799879B (en) * 2012-07-12 2014-04-02 中国科学技术大学 Method for identifying multi-language multi-font characters from natural scene image
CN102799879A (en) * 2012-07-12 2012-11-28 中国科学技术大学 Method for identifying multi-language multi-font characters from natural scene image
CN102831402A (en) * 2012-08-09 2012-12-19 西北工业大学 Sparse coding and visual saliency-based method for detecting airport through infrared remote sensing image
CN102831402B (en) * 2012-08-09 2015-04-08 西北工业大学 Sparse coding and visual saliency-based method for detecting airport through infrared remote sensing image
CN103632159A (en) * 2012-08-23 2014-03-12 阿里巴巴集团控股有限公司 Method and system for training classifier and detecting text area in image
CN103632159B (en) * 2012-08-23 2017-05-03 阿里巴巴集团控股有限公司 Method and system for training classifier and detecting text area in image
CN103473769A (en) * 2013-09-05 2013-12-25 东华大学 Fabric flaw detection method based on singular value decomposition
CN103473769B (en) * 2013-09-05 2016-01-06 东华大学 A kind of fabric defects detection method based on svd
CN103839062A (en) * 2014-03-11 2014-06-04 东方网力科技股份有限公司 Image character positioning method and device
CN103839062B (en) * 2014-03-11 2017-08-08 东方网力科技股份有限公司 A kind of pictograph localization method and device
CN107016392A (en) * 2016-01-27 2017-08-04 四川效率源信息安全技术股份有限公司 A kind of method of text border in removal picture
CN107230200A (en) * 2017-05-15 2017-10-03 东南大学 A kind of method for extracting rotor coil contour feature
CN107302718B (en) * 2017-08-17 2019-12-10 河南科技大学 video subtitle area positioning method based on angular point detection
CN107302718A (en) * 2017-08-17 2017-10-27 河南科技大学 A kind of video caption area positioning method based on Corner Detection
CN107688788A (en) * 2017-08-31 2018-02-13 平安科技(深圳)有限公司 Document charts abstracting method, electronic equipment and computer-readable recording medium
CN107688788B (en) * 2017-08-31 2021-01-08 平安科技(深圳)有限公司 Document chart extraction method, electronic device and computer readable storage medium
CN108256518A (en) * 2017-11-30 2018-07-06 北京元心科技有限公司 Detection method and detection device for character region
CN108256518B (en) * 2017-11-30 2021-07-06 北京元心科技有限公司 Character area detection method and device
CN109359644A (en) * 2018-08-28 2019-02-19 东软集团股份有限公司 Character image uniformity comparison method, apparatus, storage medium and electronic equipment
CN109299682A (en) * 2018-09-13 2019-02-01 北京字节跳动网络技术有限公司 Video text detection method, device and computer readable storage medium
CN109460768A (en) * 2018-11-15 2019-03-12 东北大学 A kind of text detection and minimizing technology for histopathology micro-image
CN109460768B (en) * 2018-11-15 2021-09-21 东北大学 Text detection and removal method for histopathology microscopic image
CN110059647A (en) * 2019-04-23 2019-07-26 杭州智趣智能信息技术有限公司 A kind of file classification method, system and associated component
CN113297875A (en) * 2020-02-21 2021-08-24 华为技术有限公司 Video character tracking method and electronic equipment
CN113297875B (en) * 2020-02-21 2023-09-29 华为技术有限公司 Video text tracking method and electronic equipment
CN112668468A (en) * 2020-12-28 2021-04-16 北京翰立教育科技有限公司 Photographing evaluation method and device
CN114782950A (en) * 2022-03-30 2022-07-22 慧之安信息技术股份有限公司 2D image text detection method based on Chinese character stroke characteristics
CN114782950B (en) * 2022-03-30 2022-10-21 慧之安信息技术股份有限公司 2D image text detection method based on Chinese character stroke characteristics
CN116152818A (en) * 2023-02-16 2023-05-23 中国工业互联网研究院 Method and system for improving identification accuracy of text lines of rotation image

Similar Documents

Publication Publication Date Title
CN101833664A (en) Video image character detecting method based on sparse expression
KR101856120B1 (en) Discovery of merchants from images
Huang et al. A new building extraction postprocessing framework for high-spatial-resolution remote-sensing imagery
Zhang et al. MCnet: Multiple context information segmentation network of no-service rail surface defects
CN105608456B (en) A kind of multi-direction Method for text detection based on full convolutional network
CN107133955B (en) A kind of collaboration conspicuousness detection method combined at many levels
CN106610969A (en) Multimodal information-based video content auditing system and method
CN108875600A (en) A kind of information of vehicles detection and tracking method, apparatus and computer storage medium based on YOLO
CN106408030B (en) SAR image classification method based on middle layer semantic attribute and convolutional neural networks
CN103824079B (en) Multi-level mode sub block division-based image classification method
CN104978567B (en) Vehicle checking method based on scene classification
CN104166841A (en) Rapid detection identification method for specified pedestrian or vehicle in video monitoring network
CN105574063A (en) Image retrieval method based on visual saliency
CN111967313B (en) Unmanned aerial vehicle image annotation method assisted by deep learning target detection algorithm
CN102968637A (en) Complicated background image and character division method
CN107480607B (en) Method for detecting and positioning standing face in intelligent recording and broadcasting system
CN105608454A (en) Text structure part detection neural network based text detection method and system
CN104751153B (en) A kind of method and device of identification scene word
CN106055653A (en) Video synopsis object retrieval method based on image semantic annotation
CN101719142A (en) Method for detecting picture characters by sparse representation based on classifying dictionary
CN103473545A (en) Text-image similarity-degree measurement method based on multiple features
CN103530638A (en) Method for matching pedestrians under multiple cameras
CN105912739B (en) A kind of similar pictures searching system and its method
Wu et al. Recognition of Student Classroom Behaviors Based on Moving Target Detection.
CN106156691A (en) The processing method of complex background image and device thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20100915