CN106682669A - Image processing method and mobile terminal - Google Patents
- Publication number: CN106682669A (application CN201611161577.6A)
- Authority: CN (China)
- Prior art keywords: character, image, width, target, target image
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
- G06V20/635—Overlay text, e.g. embedded captions in a TV program
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
Abstract
An embodiment of the invention provides an image processing method and a mobile terminal. The method comprises the steps of: acquiring a target image; performing text detection on the target image to obtain at least one text region image; performing character segmentation on the at least one text region image to obtain P character regions, where P is a positive integer; recognizing the P character regions by means of a target classifier to obtain Q characters and the width of each of the Q characters, where Q is a positive integer smaller than P; determining a target character width according to the widths of the Q characters; and recognizing the P character regions with a sliding block of the target character width to obtain the timestamp of the target image. With the image processing method and the mobile terminal, the timestamp of an image can be quickly extracted while the mobile terminal is in a standalone state.
Description
Technical field
The present invention relates to the technical field of image processing, and in particular to an image processing method and a mobile terminal.
Background technology
In security and surveillance work, scenes of interest in a monitoring video usually need to be labelled with a time so that they can be reviewed later and the time difference between two events can be calculated. At present, labelling the time of a scene of interest is often done manually, by reading the date marked on the monitoring video and entering it by hand, which is cumbersome.
In the prior art, a character region is not difficult to locate because caption characters are clear. Text region detection is usually based on mature methods such as corner detection, edge detection, connected components, or texture feature extraction, but these do not meet practical requirements on mobile terminals (such as mobile phones and tablet computers). In addition, text backgrounds are complex and varied, which also adversely affects character segmentation and recognition. Under normal circumstances, the text region must first be located, the characters in the text region are then segmented, and the segmented characters are finally recognized by a machine learning method. At present, a mobile terminal in a standalone state cannot quickly extract the timestamp in an image.
Summary of the invention
Embodiments of the present invention provide an image processing method and a mobile terminal, so as to quickly extract the timestamp in an image.
A first aspect of the embodiments of the present invention provides an image processing method, including:
acquiring a target image;
performing text detection on the target image to obtain at least one text region image;
performing character segmentation on the at least one text region image to obtain P character regions, where P is a positive integer;
recognizing the P character regions using a target classifier to obtain Q characters and the width of each of the Q characters, where Q is a positive integer smaller than P;
determining a target character width according to the widths of the Q characters;
recognizing the P character regions with a sliding block of the target character width to obtain the timestamp of the target image.
Optionally, acquiring the target image includes:
acquiring an image to be processed;
performing Gaussian smoothing on the image to be processed using a preset template to obtain the target image.
Optionally, performing text detection on the target image to obtain the at least one text region image includes:
calculating the squared horizontal differences of the target image to obtain multiple difference square values;
calculating the sum of the multiple difference square values;
determining a target threshold according to the difference square sum;
detecting the target image using a preset sliding window to obtain M candidate boxes, where M is an integer greater than 1;
performing horizontal projection on the M candidate boxes to obtain M projection matrices;
taking the regions of the candidate boxes corresponding to the N projection matrices, among the M projection matrices, that exceed the target threshold as the at least one text region image, where N is a positive integer smaller than M.
Optionally, performing text detection on the target image to obtain the at least one text region image includes:
determining the integral image of the target image;
determining a mask image according to the integral image;
numbering the connected regions in the mask image to obtain K numbers, where K is a positive integer;
determining the set of maximum values and the set of minimum values among the K numbered regions;
determining the at least one text region image according to the maximum value set and the minimum value set.
Optionally, determining the target character width according to the widths of the Q characters includes:
taking the character width that occurs most frequently among the widths of the Q characters as the target character width.
A second aspect of the embodiments of the present invention provides a mobile terminal, including:
an acquiring unit, configured to acquire a target image;
a detection unit, configured to perform text detection on the target image to obtain at least one text region image;
a segmentation unit, configured to perform character segmentation on the at least one text region image to obtain P character regions, where P is a positive integer;
a recognition unit, configured to recognize the P character regions using a target classifier to obtain Q characters and the width of each of the Q characters, where Q is a positive integer smaller than P;
a determining unit, configured to determine a target character width according to the widths of the Q characters;
the recognition unit being further configured to:
recognize the P character regions with a sliding block of the target character width to obtain the timestamp of the target image.
Optionally, the acquiring unit includes:
an acquiring module, configured to acquire an image to be processed;
a processing module, configured to perform Gaussian smoothing on the image to be processed using a preset template to obtain the target image.
Optionally, the detection unit includes:
a calculation module, configured to calculate the squared horizontal differences of the target image to obtain multiple difference square values;
the calculation module being further configured to calculate the sum of the multiple difference square values;
a first determining module, configured to determine a target threshold according to the difference square sum;
a detection module, configured to detect the target image using a preset sliding window to obtain M candidate boxes, where M is an integer greater than 1;
a projection module, configured to perform horizontal projection on the M candidate boxes to obtain M projection matrices;
a second determining module, configured to take the regions of the candidate boxes corresponding to the N projection matrices, among the M projection matrices, that exceed the target threshold as the at least one text region image, where N is a positive integer smaller than M.
Optionally, the detection unit includes:
a third determining module, configured to determine the integral image of the target image;
the third determining module being further configured to determine a mask image according to the integral image;
a numbering module, configured to number the connected regions in the mask image to obtain K numbers, where K is a positive integer;
a fourth determining module, configured to determine the set of maximum values and the set of minimum values among the K numbered regions;
the fourth determining module being further configured to determine the at least one text region image according to the maximum value set and the minimum value set.
Optionally, the determining unit is specifically configured to:
take the character width that occurs most frequently among the widths of the Q characters as the target character width.
Implementing the embodiments of the present invention has the following beneficial effects:
according to the embodiments of the present invention, a target image is acquired; text detection is performed on the target image to obtain at least one text region image; character segmentation is performed on the at least one text region image to obtain P character regions, where P is a positive integer; the P character regions are recognized using a target classifier to obtain Q characters and the width of each of the Q characters, where Q is a positive integer smaller than P; a target character width is determined according to the widths of the Q characters; and the P character regions are recognized with a sliding block of the target character width to obtain the timestamp of the target image. In this way, a mobile terminal in a standalone state can quickly extract the timestamp of an image.
Description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art may derive other drawings from these drawings without creative effort.
Fig. 1 is a schematic flowchart of a first embodiment of an image processing method provided by an embodiment of the present invention;
Fig. 1a is a schematic illustration of a timestamp provided by an embodiment of the present invention;
Fig. 1b is a schematic illustration of a smoothing template provided by an embodiment of the present invention;
Fig. 2a is a schematic structural diagram of a first embodiment of a mobile terminal provided by an embodiment of the present invention;
Fig. 2b is a schematic structural diagram of the acquiring unit of the mobile terminal depicted in Fig. 2a;
Fig. 2c is a schematic structural diagram of the detection unit of the mobile terminal depicted in Fig. 2a;
Fig. 2d is another schematic structural diagram of the detection unit of the mobile terminal depicted in Fig. 2a;
Fig. 3 is a schematic structural diagram of a second embodiment of a mobile terminal provided by an embodiment of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The terms "first", "second", "third" and "fourth" in the specification, claims and accompanying drawings are used to distinguish different objects rather than to describe a particular order. In addition, the terms "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that contains a series of steps or units is not limited to the listed steps or units, but optionally further includes steps or units that are not listed, or optionally further includes other steps or units inherent to the process, method, product or device.
Reference herein to "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment can be included in at least one embodiment of the present invention. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor are they separate or alternative embodiments mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
The mobile terminal described in the embodiments of the present invention may include a smartphone (such as an Android phone, an iOS phone or a Windows Phone), a tablet computer, a palmtop computer, a notebook computer, a mobile internet device (MID) or a wearable device. The above is merely an example, not an exhaustive list, and the mobile terminal is not limited thereto.
Deep learning, a new frontier in machine learning research, has achieved great success in image recognition, speech recognition and natural language processing in recent years. Deep learning trains on data by building multilayer neural network models; it can learn useful features and reach very high recognition accuracy by learning from a large number of samples. However, when multiple attributes need to be recognized at the same time, existing deep learning methods often treat each attribute independently and train one model per attribute, which considerably increases complexity. Therefore, how to relate the attributes to one another and design a single model that can recognize multiple attributes has become a problem to be solved.
Referring to Fig. 1, which is a schematic flowchart of a first embodiment of an image processing method provided by an embodiment of the present invention, the image processing method described in this embodiment comprises the following steps:
101. Acquire a target image.
The target image may be an image containing a timestamp, as shown in Fig. 1a.
Optionally, step 101 of acquiring the target image may include the following steps:
11) acquiring an image to be processed;
12) performing Gaussian smoothing on the image to be processed using a preset template to obtain the target image.
The image to be processed in step 11 may be a frame of a video file, or any other image. Gaussian smoothing may be performed on the image to be processed with a standard Gaussian smoothing algorithm; alternatively, the preset template may be used.
For example, when photographing or recording video with a mobile terminal (such as a mobile phone), moiré fringes may appear in the resulting image or video frame, so smoothing the image to be processed can reduce this interference. Gaussian smoothing is a common method for this, but it is time-consuming. The preset template therefore uses the smoothing template shown in Fig. 1b; compared with smoothing templates in the prior art, the Gaussian smoothing of the embodiment of the present invention avoids multiplication by floating-point numbers in the actual calculation, which speeds up the smoothing. The preset template m may be expressed as follows:
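The template values themselves are carried in a figure that is not reproduced in this text. Purely as an illustrative sketch (the patent's actual preset template may differ), a common integer approximation of a 3×3 Gaussian kernel whose weights sum to a power of two shows how floating-point multiplication can be avoided: all weights are small integers and the final division is a right shift.

```python
def smooth_pixel(win):
    """Apply a 3x3 integer Gaussian-like template to one 3x3 window.

    The assumed weights [[1,2,1],[2,4,2],[1,2,1]] sum to 16, so the
    normalisation is a right shift -- no floating-point multiply."""
    w = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
    acc = 0
    for i in range(3):
        for j in range(3):
            acc += w[i][j] * win[i][j]
    return acc >> 4  # divide by 16


def smooth_image(img):
    """Smooth the interior of a 2-D list of grey values (borders kept)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = [row[x - 1:x + 2] for row in img[y - 1:y + 2]]
            out[y][x] = smooth_pixel(win)
    return out
```

Because every operation is integer addition and shifting, such a template can run noticeably faster than a floating-point convolution on a mobile CPU.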
102. Perform text detection on the target image to obtain at least one text region image.
Optionally, step 102 of performing text detection on the target image to obtain the at least one text region image may include the following steps:
21) calculating the squared horizontal differences of the target image to obtain multiple difference square values;
22) calculating the sum of the multiple difference square values;
23) determining a target threshold according to the difference square sum;
24) detecting the target image using a preset sliding window to obtain M candidate boxes, where M is an integer greater than 1;
25) performing horizontal projection on the M candidate boxes to obtain M projection matrices;
26) taking the regions of the candidate boxes corresponding to the N projection matrices, among the M projection matrices, that exceed the target threshold as the at least one text region image, where N is a positive integer smaller than M.
Specifically, suppose the target image is I, with width w and height h. The text region image in the embodiment of the present invention can be understood as a region of interest (ROI). The basic steps may be as follows:
1. Calculate the horizontal squared difference d: except for the first pixel of each row, every pixel yields a corresponding value (as above, the target image is denoted I). Calculate the sum s of d.
2. Obtain from the sum a target threshold T = λ × s used to decide whether a region is an ROI, where λ is an empirical value.
3. Set a width w1 = w × λ2, where λ2 takes a value in (0, 1), for example between 0.1 and 0.2.
4. Set a sliding window of height h and width w1, and slide it over the target image I with a preset step.
5. Project the matrix inside the sliding window in the horizontal direction to obtain an array of length h, (a1, a2, a3, ..., ah). If a(i) > T × w1, the i-th row in the sliding window is considered part of the ROI, where i ranges over 0 to h.
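Steps 1–5 can be sketched as below. This is an illustration only, not the patent's implementation: the function name and the numpy formulation are invented, the values of λ and λ2 are placeholders (the patent leaves them as empirical values), and the window is simplified to a single leftmost position rather than a full slide.

```python
import numpy as np


def detect_text_rows(img, lam=1e-4, lam2=0.5):
    """Flag candidate text rows via horizontal difference squares.

    img: 2-D grey image array. lam and lam2 are placeholder empirical
    values. Returns the indices of rows whose projection inside one
    window of width w1 exceeds the target threshold T * w1."""
    d = np.diff(img.astype(np.int64), axis=1) ** 2  # per-pixel difference squares
    s = int(d.sum())                                # difference square sum
    T = lam * s                                     # target threshold
    w1 = max(1, int(img.shape[1] * lam2))           # window width w1 = w * lam2
    a = d[:, :w1].sum(axis=1)                       # horizontal projection per row
    return [i for i, ai in enumerate(a) if ai > T * w1]
```

Rows with strong horizontal intensity changes (as text typically produces) pass the threshold, while flat background rows do not.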
In the above process, the projection must be calculated, which is an accumulation operation performed at every position of the sliding window; in addition, a mask image of the region of interest must be obtained and the start and end coordinates calculated, which means the connected regions must be numbered. For a large image, these operations are therefore relatively time-consuming.
Optionally, step 102 of performing text detection on the target image to obtain the at least one text region image may alternatively include the following steps:
a) determining the integral image of the target image;
b) determining a mask image according to the integral image;
c) numbering the connected regions in the mask image to obtain K numbers, where K is a positive integer;
d) determining the set of maximum values and the set of minimum values among the K numbered regions;
e) determining the at least one text region image according to the maximum value set and the minimum value set.
Specifically, suppose the target image is I, with width w and height h; the text region image can again be understood as a region of interest (ROI). A simple example is given below:
1. Set a mask whose width is ⌊w/4⌋ and whose height is ⌊h/4⌋.
2. For the target image I, calculate a horizontal integral image Is.
3. When the sliding window computes the projection, the projected size of each row can be calculated as:
ay = Is(x, y) − Is(x, y − w)
4. For each 4 × 4 region of the target image, if its value exceeds the target threshold, set the corresponding coordinate of the mask image to 1; a mask coordinate is one quarter of the corresponding coordinate of the original image, rounded down. As can be seen, a mask image only one quarter the size can still mark the text region candidates well.
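The quarter-resolution mask of step 4 might be sketched as follows. This is an assumption-laden illustration: the patent's integral-image formula is only partially reproduced above, so a standard 2-D integral image is used here, and `quarter_mask` and its threshold argument are invented names.

```python
import numpy as np


def quarter_mask(edge, thresh):
    """Build a quarter-resolution mask: cell (i, j) covers a 4x4 block
    of the edge-strength map and is set to 1 when the block sum exceeds
    thresh. An integral image makes each block sum O(1)."""
    h, w = edge.shape
    hs, ws = h // 4, w // 4
    ii = np.cumsum(np.cumsum(edge.astype(np.int64), axis=0), axis=1)
    ii = np.pad(ii, ((1, 0), (1, 0)))  # zero row/column: no edge cases
    mask = np.zeros((hs, ws), dtype=np.uint8)
    for i in range(hs):
        for j in range(ws):
            y0, x0 = 4 * i, 4 * j
            block = (ii[y0 + 4, x0 + 4] - ii[y0, x0 + 4]
                     - ii[y0 + 4, x0] + ii[y0, x0])
            if block > thresh:
                mask[i, j] = 1
    return mask
```

Working at one quarter of the resolution reduces the number of cells to examine by a factor of 16, which is where the speed-up over the sliding-window variant comes from.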
Thus, after the mask image is obtained, the starting coordinates and the length and width of each region of interest can be returned. The steps of this method are as follows:
1. Number each connected region of the mask using the 4-connected-region method.
2. Find the maximum and minimum coordinate values of each numbered connected region; the set of all maximum values forms the maximum value set and the set of all minimum values forms the minimum value set. Then multiply by the factor by which the mask was reduced (i.e., enlarge by 4) to obtain the corresponding regions of the target image.
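The two steps above might be sketched like this. It is illustrative only: the patent names the 4-connected-region method but not an implementation, so the breadth-first labelling and the `mask_regions` helper are assumptions.

```python
from collections import deque


def mask_regions(mask):
    """Label 4-connected regions of a 0/1 mask and return, for each
    region, its bounding box scaled back up by 4 to full-image
    coordinates: (x_min, y_min, x_max, y_max)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                q = deque([(y, x)])
                seen[y][x] = True
                ys, xs = [y], [x]
                while q:  # breadth-first flood fill over 4-neighbours
                    cy, cx = q.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            ys.append(ny)
                            xs.append(nx)
                            q.append((ny, nx))
                # multiply back by the factor the mask was reduced by (4)
                boxes.append((min(xs) * 4, min(ys) * 4,
                              (max(xs) + 1) * 4, (max(ys) + 1) * 4))
    return boxes
```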
Thus the whole process of obtaining the ROI regions is very fast in practical applications; this part causes almost no perceptible stutter.
103. Perform character segmentation on the at least one text region image to obtain P character regions, where P is a positive integer.
After the text region images are extracted, some regions are interference and some are genuine text regions. The text regions themselves also vary: some are explanatory Chinese text, and some are scene text that occurs naturally, such as billboards and licence plates. Since only the date needs to be recognized, and date formats vary — some are written as "XXXX year XX month XX day", some as XXXX-XX-XX, and some as XXXX/XX/XX — template-based segmentation is difficult, and some additional operations are needed.
104. Recognize the P character regions using a target classifier to obtain Q characters and the width of each of the Q characters, where Q is a positive integer smaller than P.
105. Determine a target character width according to the widths of the Q characters.
Optionally, step 105 of determining the target character width according to the widths of the Q characters may include the following step:
taking the character width that occurs most frequently among the widths of the Q characters as the target character width.
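The rule above is a simple mode over the observed widths; it can be sketched in a couple of lines (the helper name `target_width` is invented for illustration):

```python
from collections import Counter


def target_width(widths):
    """Pick the most frequently occurring character width as the
    target character width."""
    return Counter(widths).most_common(1)[0][0]
```

Taking the mode rather than the mean makes the estimate robust against the few mis-segmented regions whose widths are outliers.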
106. Recognize the P character regions with a sliding block of the target character width to obtain the timestamp of the target image.
The P character regions can be recognized using the target classifier, thereby obtaining Q characters and the character width corresponding to each of the Q characters, where Q is a positive integer smaller than P.
Optionally, before step 101, the method may include the following steps:
110) determining a positive sample set and a negative sample set;
120) training on the positive sample set and the negative sample set using a convolutional neural network to obtain the target classifier.
The target classifier in step 104 recognizes the digits of the date and is mainly based on a convolutional neural network (CNN). Specifically, the positive sample set can be acquired as follows:
Step 1: obtain digits with complex backgrounds — randomly collect video frames with rich scenes from the network, then use a captioning method to add purely numeric captions at arbitrary positions;
Step 2: crop the digits out of the video frames of step 1 — the more the better.
To reduce the workload, an SVM classifier can first be trained with a small number of digits to classify these connected regions, which are then saved under the corresponding class; during manual review, misclassified samples are simply moved to the correct class. After many digits with complex backgrounds have been obtained, they are used for training.
In addition, a certain number of non-digit clutter samples must also be selected as the negative sample set. The CNN layer parameters are then set; different parameters are tried experimentally to find a good number of layers and the convolution size of each layer, and the target classifier is trained. This target classifier is used directly at test time. After the target classifier is obtained, the operation on each image frame is as follows:
1. Take, in turn, each of the at least one text region images (ROIs) obtained in step 103 and denote it R1. If the length and width of R1 meet preset conditions, proceed to the next step; otherwise take the next ROI.
2. Binarize R1 to obtain R2. To overcome uneven illumination, the region can be divided into many small blocks; Otsu's method finds a segmentation threshold for each block, and each block is then binarized.
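Step 2's block-wise binarization might look as follows. This is a sketch under assumptions: the patent does not give the block size (16 is a placeholder), and the threshold routine is a textbook formulation of Otsu's method, not the patent's own code.

```python
import numpy as np


def otsu_threshold(block):
    """Otsu's method on a grey block: pick the threshold maximising
    the between-class variance of the two resulting pixel classes."""
    hist = np.bincount(block.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    mean_all = (np.arange(256) * hist).sum() / total
    best_t, best_var = 0, -1.0
    w0 = cum = 0.0
    for t in range(256):
        w0 += hist[t]            # pixels in class 0 (<= t)
        if w0 == 0 or w0 == total:
            continue
        cum += t * hist[t]
        m0 = cum / w0
        m1 = (mean_all * total - cum) / (total - w0)
        var = w0 * (total - w0) * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize_blocks(img, bs=16):
    """Binarize each bs x bs tile with its own Otsu threshold, so a
    brightness gradient across the ROI does not ruin the result."""
    out = np.zeros_like(img)
    for y in range(0, img.shape[0], bs):
        for x in range(0, img.shape[1], bs):
            tile = img[y:y + bs, x:x + bs]
            out[y:y + bs, x:x + bs] = (tile > otsu_threshold(tile)).astype(img.dtype)
    return out
```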
3. Label the connected regions of R2 using the 4-connected-region method.
4. Based on the starting point of the current ROI, let the minimum and maximum horizontal and vertical coordinates of each connected region of R2 be xmin, ymin, xmax, ymax. Crop from R1, one by one, the region D enclosed by the minimum and maximum coordinates of each connected region; its width and height are (md, nd). Resize D to (m, n), convert it to greyscale, and input it to the target classifier to obtain a recognition result r. If r is a digit, record its md and xmin.
5. If the ROI yields several digit recognition results, the ROI is probably a date field. Build a histogram of all the md values of this set.
6. The most frequently counted value is then the likely width. Take the most common width and average it with the widths in the adjacent histogram bins; the average is taken as the likely digit width of this ROI.
7. For the connected regions of R1 that are wider than this value, recognize them again with a sliding block of this width. If a digit is recognized, record the result and coordinate and move the sliding block by one full width; otherwise move it by a smaller step.
8. Finally, sort all results of this ROI by horizontal coordinate; if overlapping coordinates occur, the result of step 7 overrides the result of step 4.
The output for each image frame is usually some digits, but not all of them are necessarily recognized correctly; clutter may be recognized as a digit, or one digit may be misread as another. The numeric strings of multiple images are therefore recorded.
Optionally, when processing video, the following steps may also be performed after step 106:
1. First find likely year fields, such as "20" or "19".
2. For each image frame, line up the digits following the likely year field. A threshold for the number of digits, such as 14, can be preset or input by the user; it represents the number of time characters.
3. If the digit count following every frame is smaller than the preset threshold, recognition fails and new video frames are acquired.
4. If more than one frame has a digit count exceeding the preset threshold, voting can begin: drop the last digit of each frame's string (the seconds digit keeps changing), count the values of each position across the frames, and take the most frequent value as the true prediction.
5. If this fails (for example, the votes are tied), the next video frame is re-acquired.
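The voting of steps 1–5 might be sketched as follows. The tie handling and the dropped trailing seconds digit follow the description above, but `vote_digits` itself is an illustrative assumption, not the patent's implementation.

```python
from collections import Counter


def vote_digits(frames, min_len=14):
    """Majority-vote each digit position across several frames'
    numeric strings. The final position (fast-changing seconds digit)
    is dropped; frames shorter than min_len count as failed
    recognitions. Returns None when no frame qualifies or when a
    position is tied, signalling that more frames are needed."""
    usable = [f for f in frames if len(f) >= min_len]
    if not usable:
        return None
    n = min(len(f) for f in usable) - 1  # ignore the last digit
    out = []
    for i in range(n):
        ranked = Counter(f[i] for f in usable).most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            return None                  # tie: voting failed
        out.append(ranked[0][0])
    return "".join(out)
```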
As can be seen that by the embodiment of the present invention, obtaining target image, text detection is carried out to target image, obtain to
A few text filed image, at least one text filed image Character segmentation is carried out, and obtains P character zone, and P is just whole
Number, is identified using object classifiers to P character zone, obtains the width of each character in Q character and Q character, Q
It is the positive integer less than P, target character width is determined according to the width of Q character, with the slide block of target character width to P word
Symbol region is identified, and obtains the timestamp of target image.Thus, mobile terminal can rapidly extract figure under unit state
The timestamp of picture.
Consistent with the above, the following describes a device for implementing the above image processing method:
Referring to Fig. 2a, which is a schematic structural diagram of a first embodiment of a mobile terminal provided by an embodiment of the present invention, the mobile terminal described in this embodiment includes an acquiring unit 201, a detection unit 202, a segmentation unit 203, a recognition unit 204 and a determining unit 205, as follows:
the acquiring unit 201 is configured to acquire a target image;
the detection unit 202 is configured to perform text detection on the target image to obtain at least one text region image;
the segmentation unit 203 is configured to perform character segmentation on the at least one text region image to obtain P character regions, where P is a positive integer;
the recognition unit 204 is configured to recognize the P character regions using a target classifier to obtain Q characters and the width of each of the Q characters, where Q is a positive integer smaller than P;
the determining unit 205 is configured to determine a target character width according to the widths of the Q characters;
the recognition unit 204 is further configured to:
recognize the P character regions with a sliding block of the target character width to obtain the timestamp of the target image.
Optionally, as shown in Fig. 2b, which is a detailed structure of the acquiring unit 201 of the mobile terminal depicted in Fig. 2a, the acquiring unit 201 may include an acquiring module 2011 and a processing module 2012, as follows:
the acquiring module 2011 is configured to acquire an image to be processed;
the processing module 2012 is configured to perform Gaussian smoothing on the image to be processed using a preset template to obtain the target image.
Optionally, as shown in Fig. 2c, which is a detailed structure of the detection unit 202 of the mobile terminal depicted in Fig. 2a, the detection unit 202 may include a calculation module 2021, a first determining module 2022, a detection module 2023, a projection module 2024 and a second determining module 2025, as follows:
the calculation module 2021 is configured to calculate the squared horizontal differences of the target image to obtain multiple difference square values;
the calculation module 2021 is further configured to calculate the sum of the multiple difference square values;
the first determining module 2022 is configured to determine a target threshold according to the difference square sum;
the detection module 2023 is configured to detect the target image using a preset sliding window to obtain M candidate boxes, where M is an integer greater than 1;
the projection module 2024 is configured to perform horizontal projection on the M candidate boxes to obtain M projection matrices;
the second determining module 2025 is configured to take the regions of the candidate boxes corresponding to the N projection matrices, among the M projection matrices, that exceed the target threshold as the at least one text region image, where N is a positive integer smaller than M.
Optionally, as shown in Fig. 2d, which is another detailed refinement of the detection unit 202 of the mobile terminal of Fig. 2a, the detection unit 202 may include a third determining module 2026, a numbering module 2027 and a fourth determining module 2028, specifically as follows:
the third determining module 2026 is configured to determine the integral image of the target image;
the third determining module 2026 is further specifically configured to:
determine a mask image according to the integral image;
the numbering module 2027 is configured to number the connected regions in the mask image, to obtain K numbers, K being a positive integer;
the fourth determining module 2028 is configured to determine the maximum set and the minimum set among the K numbers;
the fourth determining module 2028 is further specifically configured to:
determine the at least one text region image according to the maximum set and the minimum set.
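The numbering step of Fig. 2d amounts to connected-component labeling of the mask image, with each component's minimum and maximum coordinates bounding a candidate text region. The sketch below assumes a binary mask and 4-connectivity, neither of which is fixed by the embodiment, and represents the per-component min/max sets as `[min_y, min_x, max_y, max_x]` bounds.

```python
def label_components(mask):
    """Number 4-connected foreground regions in a binary mask and record,
    per label, the min/max row/column bounds of the region."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    bounds = {}  # label -> [min_y, min_x, max_y, max_x]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not labels[sy][sx]:
                next_label += 1
                labels[sy][sx] = next_label
                bounds[next_label] = [sy, sx, sy, sx]
                stack = [(sy, sx)]
                while stack:  # flood-fill the component
                    y, x = stack.pop()
                    b = bounds[next_label]
                    b[0] = min(b[0], y); b[1] = min(b[1], x)
                    b[2] = max(b[2], y); b[3] = max(b[3], x)
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = next_label
                            stack.append((ny, nx))
    return labels, bounds
```

Each bounding box can then be cropped from the target image to serve as one candidate text region.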
Optionally, the determining unit 205 is specifically configured to:
take, among the widths of the Q characters, the character width with the largest occurrence count as the target character width.
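In other words, the target character width is the mode of the observed widths. A minimal sketch (tie-breaking by first occurrence is an assumption the text does not settle):

```python
from collections import Counter

def target_character_width(widths):
    """Return the width that occurs most often among the Q recognized
    characters; ties resolve to the first-seen width (Counter preserves
    insertion order for equal counts)."""
    return Counter(widths).most_common(1)[0][0]
```

Using the modal width makes the later sliding block robust to a few mis-segmented characters whose widths are outliers.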
It can be seen that, with the mobile terminal described in the embodiments of the present invention, a target image can be obtained; text detection is performed on the target image to obtain at least one text region image; character segmentation is performed on the at least one text region image to obtain P character regions, P being a positive integer; the P character regions are identified using a target classifier to obtain Q characters and the width of each of the Q characters, Q being a positive integer less than P; a target character width is determined according to the widths of the Q characters; and the P character regions are identified with a sliding block of the target character width to obtain the timestamp of the target image. Thus, the mobile terminal can rapidly extract the timestamp of an image while in a stand-alone state.
Consistent with the above, refer to Fig. 3, which is a schematic structural diagram of a second embodiment of a mobile terminal provided by an embodiment of the present invention. The mobile terminal described in this embodiment includes: at least one input device 1000; at least one output device 2000; at least one processor 3000, such as a CPU; and a memory 4000. The input device 1000, the output device 2000, the processor 3000 and the memory 4000 are connected by a bus 5000.
The input device 1000 may specifically be a touch panel, a physical button or a mouse.
The output device 2000 may specifically be a display screen.
The memory 4000 may be a high-speed RAM memory or a non-volatile memory, such as a magnetic disk memory. The memory 4000 is used to store a set of program code; the input device 1000, the output device 2000 and the processor 3000 are used to call the program code stored in the memory 4000 to perform the following operations:
The processor 3000 is configured to:
obtain a target image;
perform text detection on the target image, to obtain at least one text region image;
perform character segmentation on the at least one text region image, to obtain P character regions, P being a positive integer;
identify the P character regions using a target classifier, to obtain Q characters and the width of each of the Q characters, Q being a positive integer less than P;
determine a target character width according to the widths of the Q characters;
identify the P character regions with a sliding block of the target character width, to obtain the timestamp of the target image.
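The final sliding-block step above can be sketched as follows. The `classify` callable is a hypothetical stand-in for the trained target classifier (assumed to return a character for a window, or `None` when nothing is recognized); the fallback of shifting the block by one pixel on a miss is likewise an assumption, since the text does not specify the stepping rule.

```python
def read_timestamp(line_width, char_width, classify):
    """Step a sliding block of the target character width across a text
    line of the given pixel width, collecting recognized characters.
    `classify(left, right)` is an assumed interface returning a character
    for the window [left, right) or None."""
    chars = []
    left = 0
    while left + char_width <= line_width:
        ch = classify(left, left + char_width)
        if ch is not None:
            chars.append(ch)
            left += char_width  # consume one full character
        else:
            left += 1           # assumed: shift one pixel and retry
    return "".join(chars)
```

Because the block width equals the modal character width, a digit-dominated timestamp line is consumed in whole-character steps once the block is aligned.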
Optionally, the processor 3000 obtaining the target image includes:
obtaining an image to be processed;
performing Gaussian smoothing on the image to be processed using a preset template, to obtain the target image.
Optionally, the processor 3000 performing text detection on the target image to obtain the at least one text region image includes:
calculating squared differences in the horizontal direction of the target image, to obtain multiple squared difference values;
calculating the sum of the multiple squared difference values;
determining a target threshold according to the sum of squared differences;
detecting the target image using a preset sliding window, to obtain M candidate boxes, M being an integer greater than 1;
performing horizontal projection on the M candidate boxes, to obtain M projection matrices;
taking the regions of the candidate boxes corresponding to the N projection matrices, among the M projection matrices, that exceed the target threshold as the at least one text region image, N being a positive integer less than M.
Optionally, the processor 3000 performing text detection on the target image to obtain the at least one text region image includes:
determining the integral image of the target image;
determining a mask image according to the integral image;
numbering the connected regions in the mask image, to obtain K numbers, K being a positive integer;
determining the maximum set and the minimum set among the K numbers;
determining the at least one text region image according to the maximum set and the minimum set.
Optionally, the processor 3000 determining the target character width according to the widths of the Q characters includes:
taking, among the widths of the Q characters, the character width with the largest occurrence count as the target character width.
An embodiment of the present invention also provides a computer storage medium, wherein the computer storage medium may store a program which, when executed, performs some or all of the steps of any image processing method described in the above method embodiments.
Although the present invention has been described herein in combination with various embodiments, those skilled in the art, in practicing the claimed invention, can understand and effect other variations of the disclosed embodiments by studying the drawings, the disclosure and the appended claims. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill several functions recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
It will be understood by those skilled in the art that the embodiments of the present invention may be provided as a method, an apparatus (device) or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk memories, CD-ROMs, optical memories and the like) containing computer-usable program code. The computer program may be stored/distributed on a suitable medium, supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.
The present invention is described with reference to flowcharts and/or block diagrams of the method, apparatus (device) and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus which realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although the present invention has been described with reference to specific features and embodiments thereof, it is evident that various modifications and combinations can be made thereto without departing from the spirit and scope of the invention. Accordingly, the specification and drawings are merely exemplary illustrations of the invention as defined by the appended claims, and are deemed to cover any and all modifications, variations, combinations or equivalents within the scope of the invention. Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.
Claims (10)
1. An image processing method, characterized by comprising:
obtaining a target image;
performing text detection on the target image, to obtain at least one text region image;
performing character segmentation on the at least one text region image, to obtain P character regions, P being a positive integer;
identifying the P character regions using a target classifier, to obtain Q characters and the width of each of the Q characters, Q being a positive integer less than P;
determining a target character width according to the widths of the Q characters;
identifying the P character regions with a sliding block of the target character width, to obtain the timestamp of the target image.
2. The method according to claim 1, characterized in that the obtaining a target image comprises:
obtaining an image to be processed;
performing Gaussian smoothing on the image to be processed using a preset template, to obtain the target image.
3. The method according to claim 1, characterized in that the performing text detection on the target image to obtain at least one text region image comprises:
calculating squared differences in the horizontal direction of the target image, to obtain multiple squared difference values;
calculating the sum of the multiple squared difference values;
determining a target threshold according to the sum of squared differences;
detecting the target image using a preset sliding window, to obtain M candidate boxes, M being an integer greater than 1;
performing horizontal projection on the M candidate boxes, to obtain M projection matrices;
taking the regions of the candidate boxes corresponding to the N projection matrices, among the M projection matrices, that exceed the target threshold as the at least one text region image, N being a positive integer less than M.
4. The method according to any one of claims 1 to 3, characterized in that the performing text detection on the target image to obtain at least one text region image comprises:
determining the integral image of the target image;
determining a mask image according to the integral image;
numbering the connected regions in the mask image, to obtain K numbers, K being a positive integer;
determining the maximum set and the minimum set among the K numbers;
determining the at least one text region image according to the maximum set and the minimum set.
5. The method according to any one of claims 1 to 3, characterized in that the determining a target character width according to the widths of the Q characters comprises:
taking, among the widths of the Q characters, the character width with the largest occurrence count as the target character width.
6. A mobile terminal, characterized by comprising:
an acquiring unit, configured to obtain a target image;
a detection unit, configured to perform text detection on the target image, to obtain at least one text region image;
a segmentation unit, configured to perform character segmentation on the at least one text region image, to obtain P character regions, P being a positive integer;
a recognition unit, configured to identify the P character regions using a target classifier, to obtain Q characters and the width of each of the Q characters, Q being a positive integer less than P;
a determining unit, configured to determine a target character width according to the widths of the Q characters;
the recognition unit being further specifically configured to:
identify the P character regions with a sliding block of the target character width, to obtain the timestamp of the target image.
7. The mobile terminal according to claim 6, characterized in that the acquiring unit comprises:
an acquisition module, configured to obtain an image to be processed;
a processing module, configured to perform Gaussian smoothing on the image to be processed using a preset template, to obtain the target image.
8. The mobile terminal according to claim 6, characterized in that the detection unit comprises:
a computing module, configured to calculate squared differences in the horizontal direction of the target image, to obtain multiple squared difference values;
the computing module being further specifically configured to:
calculate the sum of the multiple squared difference values;
a first determining module, configured to determine a target threshold according to the sum of squared differences;
a detection module, configured to detect the target image using a preset sliding window, to obtain M candidate boxes, M being an integer greater than 1;
a projection module, configured to perform horizontal projection on the M candidate boxes, to obtain M projection matrices;
a second determining module, configured to take the regions of the candidate boxes corresponding to the N projection matrices, among the M projection matrices, that exceed the target threshold as the at least one text region image, N being a positive integer less than M.
9. The mobile terminal according to any one of claims 6 to 8, characterized in that the detection unit comprises:
a third determining module, configured to determine the integral image of the target image;
the third determining module being further specifically configured to:
determine a mask image according to the integral image;
a numbering module, configured to number the connected regions in the mask image, to obtain K numbers, K being a positive integer;
a fourth determining module, configured to determine the maximum set and the minimum set among the K numbers;
the fourth determining module being further specifically configured to:
determine the at least one text region image according to the maximum set and the minimum set.
10. The mobile terminal according to any one of claims 6 to 8, characterized in that the determining unit is specifically configured to:
take, among the widths of the Q characters, the character width with the largest occurrence count as the target character width.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611161577.6A CN106682669A (en) | 2016-12-15 | 2016-12-15 | Image processing method and mobile terminal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611161577.6A CN106682669A (en) | 2016-12-15 | 2016-12-15 | Image processing method and mobile terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106682669A true CN106682669A (en) | 2017-05-17 |
Family
ID=58869038
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611161577.6A Pending CN106682669A (en) | 2016-12-15 | 2016-12-15 | Image processing method and mobile terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106682669A (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102982331A (en) * | 2012-12-05 | 2013-03-20 | 曙光信息产业(北京)有限公司 | Method for identifying character in image |
CN106156767A (en) * | 2016-03-02 | 2016-11-23 | 平安科技(深圳)有限公司 | Driving license effect duration extraction method, server and terminal |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111460198A (en) * | 2019-01-18 | 2020-07-28 | 阿里巴巴集团控股有限公司 | Method and device for auditing picture timestamp |
CN111460198B (en) * | 2019-01-18 | 2023-06-20 | 阿里巴巴集团控股有限公司 | Picture timestamp auditing method and device |
CN111652204A (en) * | 2020-06-03 | 2020-09-11 | 广东小天才科技有限公司 | Method and device for selecting target text area, electronic equipment and storage medium |
CN112418109A (en) * | 2020-11-26 | 2021-02-26 | 复旦大学附属中山医院 | Image processing method and device |
CN112668573A (en) * | 2020-12-25 | 2021-04-16 | 平安科技(深圳)有限公司 | Target detection position reliability determination method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3916627A1 (en) | Living body detection method based on facial recognition, and electronic device and storage medium | |
CN108108732B (en) | Character recognition system and character recognition method thereof | |
US8750573B2 (en) | Hand gesture detection | |
US8792722B2 (en) | Hand gesture detection | |
CN106650740B (en) | A kind of licence plate recognition method and terminal | |
CN110517246B (en) | Image processing method and device, electronic equipment and storage medium | |
CN109583449A (en) | Character identifying method and Related product | |
CN106845331B (en) | A kind of image processing method and terminal | |
CN106650615B (en) | A kind of image processing method and terminal | |
CN106295502B (en) | A kind of method for detecting human face and device | |
CN111476284A (en) | Image recognition model training method, image recognition model training device, image recognition method, image recognition device and electronic equipment | |
CN104952083B (en) | A kind of saliency detection method based on the modeling of conspicuousness target background | |
CN112001932B (en) | Face recognition method, device, computer equipment and storage medium | |
CN106682669A (en) | Image processing method and mobile terminal | |
EP4047509A1 (en) | Facial parsing method and related devices | |
CN110287862B (en) | Anti-candid detection method based on deep learning | |
CN106650670A (en) | Method and device for detection of living body face video | |
CN108710893A (en) | A kind of digital image cameras source model sorting technique of feature based fusion | |
CN111368682A (en) | Method and system for detecting and identifying station caption based on faster RCNN | |
CN111160107B (en) | Dynamic region detection method based on feature matching | |
CN111062854A (en) | Method, device, terminal and storage medium for detecting watermark | |
CN113111880A (en) | Certificate image correction method and device, electronic equipment and storage medium | |
CN106295620A (en) | Hair style recognition methods and hair style identification device | |
CN114005019B (en) | Method for identifying flip image and related equipment thereof | |
CN113255557B (en) | Deep learning-based video crowd emotion analysis method and system |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20170517 |