CN108062547A - Character detecting method and device - Google Patents
Character detecting method and device
- Publication number: CN108062547A
- Application number: CN201711332870.9A
- Authority
- CN
- China
- Prior art keywords
- candidate region
- probability
- word
- mask
- word candidate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
- G06V10/243—Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
Abstract
The disclosure relates to a text detection method and device. The method includes: extracting text candidate regions from an image to obtain multiple text candidate regions; computing the text probability of each of the multiple text candidate regions, and computing the mask map of the text region of each candidate region whose text probability meets a probability threshold; and computing the minimum bounding box of the mask map to obtain the text detection result. Because the disclosure extracts multiple text regions by way of candidate-region extraction, computes the text-region mask maps of the candidate regions whose probability meets the threshold, and the mask marks exactly where the text region lies, the detection result obtained from the minimum bounding box of the mask map locates the text accurately whether or not the text is tilted. The text detection method provided by the embodiments of the disclosure can therefore handle text in various layouts and effectively improves text recall.
Description
Technical field
This disclosure relates to the field of computers, and in particular to a text detection method and device.
Background technology
Text detection refers to locating the position of text in an image.
In the related art, text detection is generally implemented with object detection methods: several parallel rectangular boxes are used to detect whether the objects inside the boxes are text. However, this approach performs poorly on irregular text, such as text laid out at an angle.
Summary of the invention
To overcome the problems in the related art, the disclosure provides a text detection method and device.
According to a first aspect of the embodiments of the disclosure, a text detection method is provided, including: extracting text candidate regions from an image to obtain multiple text candidate regions; computing the text probabilities of the multiple text candidate regions, and computing the text-region mask maps of the candidate regions whose text probability meets a probability threshold; and computing the minimum bounding box of the mask map to obtain the text detection result.
According to a possible implementation of the first aspect, computing the text probabilities of the multiple text candidate regions and computing the text-region mask maps of the candidate regions whose text probability meets the probability threshold includes: using a convolutional neural network to compute the text probabilities of the multiple text candidate regions and the text-region mask maps of those candidate regions whose text probability meets the probability threshold.
According to a possible implementation of the first aspect, computing the text-region mask maps of the candidate regions whose text probability meets the probability threshold includes: selecting, from the multiple text candidate regions, the candidate regions whose text probability meets the probability threshold; and computing the text-region mask maps for the selected candidate regions.
According to a possible implementation of the first aspect, computing the text-region mask maps of the candidate regions whose text probability meets the probability threshold includes: computing the text-region mask maps of the multiple text candidate regions; and selecting, from these mask maps, the mask maps of the candidate regions whose probability meets the probability threshold.
According to a possible implementation of the first aspect, computing the text-region mask maps of the multiple text candidate regions includes: performing feature extraction on the image with a first convolutional neural network to obtain a feature map, where the first convolutional neural network has been trained for image feature extraction; mapping each of the multiple text candidate regions into the feature map to obtain the feature region corresponding to each candidate region; mapping the corresponding feature regions to feature vectors of fixed size; and feeding the fixed-size feature vectors into a second convolutional neural network to compute the text probability and the text-region mask map of each candidate region, where the second convolutional neural network has been trained to predict text probabilities and mask maps.
According to a second aspect of the embodiments of the disclosure, a text detection device is provided, including: a region extraction module configured to extract text candidate regions from an image to obtain multiple text candidate regions; a first computing module configured to compute the text probabilities of the multiple text candidate regions obtained by the region extraction module, and to compute the text-region mask maps of the candidate regions whose text probability meets the probability threshold; and a second computing module configured to compute the minimum bounding box of the mask maps computed by the first computing module to obtain the text detection result.
According to a possible implementation of the second aspect, the first computing module is configured to use a convolutional neural network to compute the text probabilities of the multiple text candidate regions and the text-region mask maps of the candidate regions whose text probability meets the probability threshold.
According to a possible implementation of the second aspect, the first computing module includes: a probability computing submodule configured to compute the text probabilities of the multiple text candidate regions; a probability screening submodule configured to select, from the multiple text candidate regions, the candidate regions whose text probability meets the probability threshold; and a mask-map computing submodule configured to compute the text-region mask maps for the selected candidate regions.
According to a possible implementation of the second aspect, the first computing module includes: a probability-and-mask-map computing submodule configured to compute the text probability and the text-region mask map of each of the multiple text candidate regions; and a probability-and-mask-map screening submodule configured to select, from the text-region mask maps of the multiple candidate regions, the mask maps of the candidate regions whose probability meets the probability threshold.
According to a possible implementation of the second aspect, the probability-and-mask-map computing submodule includes: a first convolution computing submodule configured to perform feature extraction on the image with a first convolutional neural network to obtain a feature map, where the first convolutional neural network has been trained for image feature extraction; a feature-region mapping submodule configured to map each of the multiple text candidate regions into the feature map to obtain the corresponding feature regions; a vector mapping submodule configured to map the feature regions corresponding to the multiple text candidate regions to feature vectors of fixed size; and a second convolution computing submodule configured to feed the fixed-size feature vectors into a second convolutional neural network and compute the text probability and the text-region mask map of each candidate region, where the second convolutional neural network has been trained to predict text probabilities and mask maps.
According to a third aspect of the embodiments of the disclosure, a text detection device is provided, including: a processor; and a memory for storing processor-executable instructions; where the processor is configured to: extract text candidate regions from an image to obtain multiple text candidate regions; compute the text probabilities of the multiple text candidate regions, and compute the text-region mask maps of the candidate regions whose text probability meets the probability threshold; and compute the minimum bounding box of the mask map to obtain the text detection result.
According to a fourth aspect of the embodiments of the disclosure, a computer-readable storage medium is provided, having computer program instructions stored thereon, which, when executed by a processor, implement the steps of the method of any possible implementation of the first aspect of the embodiments of the disclosure.
The technical solutions provided by the embodiments of the disclosure may include the following beneficial effects: the disclosure extracts multiple text regions by way of candidate-region extraction and computes the text-region mask maps of the candidate regions whose text probability meets the probability threshold, and the mask marks exactly where the text region lies. Therefore, whether or not the text is tilted, the text detection result obtained from the minimum bounding box of the mask map locates the text accurately. The text detection method provided by the embodiments of the disclosure can thus handle text in various layouts and effectively improves text recall.
It should be understood that the above general description and the following detailed description are exemplary and explanatory only and do not limit the disclosure.
Description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flowchart of a text detection method according to an exemplary embodiment.
Fig. 2 is a schematic diagram of an image according to an exemplary embodiment.
Fig. 3 is a flowchart of a text detection method according to another exemplary embodiment.
Fig. 4 is a block diagram of a text detection device according to an exemplary embodiment.
Fig. 5 is a block diagram of a text detection device according to another exemplary embodiment.
Fig. 6 is a block diagram of a text detection device according to yet another exemplary embodiment.
Fig. 7 is a block diagram of a text detection device according to an exemplary embodiment.
Detailed description
Exemplary embodiments will be described in detail here, with examples illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the disclosure as detailed in the appended claims.
Fig. 1 is a flowchart of a text detection method according to an exemplary embodiment. As shown in Fig. 1, the text detection method may include the following steps.
In step 110, text candidate regions are extracted from the image to obtain multiple text candidate regions.
For example, the number n of text candidate regions can range from hundreds to thousands. Common candidate-region extraction methods include Selective Search, RPN, and so on. Each candidate region is described by parameters such as its horizontal and vertical coordinates, height, and width.
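The candidate-region representation above can be sketched as follows. Since the patent does not prescribe a particular extractor, this toy sliding-window enumerator is a hypothetical stand-in for Selective Search or an RPN; the function name, box sizes, and stride are illustrative assumptions only.

```python
import numpy as np

def sliding_window_proposals(img_h, img_w, sizes=((32, 64), (64, 32)), stride=32):
    """Toy stand-in for Selective Search / RPN: enumerate candidate boxes,
    each described by its (x, y, w, h) parameters as in the text above."""
    boxes = []
    for w, h in sizes:
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                boxes.append((x, y, w, h))
    return np.array(boxes)

proposals = sliding_window_proposals(128, 128)
# each row is one candidate region (x, y, w, h)
```

In practice a learned proposal network produces far fewer, better-placed boxes; the fixed grid here only illustrates the (x, y, w, h) parameterization.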
In step 120, the text probabilities of the multiple text candidate regions are computed, and the text-region mask maps of the candidate regions whose text probability meets the probability threshold are computed.
For example, in one possible implementation, a convolutional neural network can be used to compute the text probabilities of the multiple text candidate regions and the text-region mask maps of the candidate regions whose text probability meets the probability threshold. In this implementation, the strengths of convolutional neural networks can be exploited to achieve fast and accurate text detection.
It should be noted that the disclosure does not limit the order in which text probabilities and mask maps are computed.
For example, in one possible implementation, the text probabilities of the multiple text candidate regions can be computed first, the candidate regions whose text probability meets the probability threshold are selected from them, and the text-region mask maps are then computed only for the selected candidate regions. This implementation reduces the amount of mask-map computation.
As another example, in another possible implementation, the text probability and the mask map of each candidate region can be computed in no particular order; after the text-region mask maps of the multiple candidate regions are obtained, the mask maps of the candidate regions whose probability meets the probability threshold are selected from them. This implementation can compute text probabilities and mask maps simultaneously and detects faster.
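The threshold screening just described might look like this in a minimal NumPy sketch; the function name and the 0.5 threshold are illustrative assumptions, not values from the patent.

```python
import numpy as np

def select_by_probability(probs, masks, threshold=0.5):
    """Keep only the mask maps of candidate regions whose text probability
    meets the probability threshold."""
    keep = probs >= threshold
    return probs[keep], masks[keep]

probs = np.array([0.9, 0.2, 0.7])
masks = np.zeros((3, 8, 8))          # one 8x8 mask map per candidate region
kept_probs, kept_masks = select_by_probability(probs, masks)
# kept_probs keeps the candidates scoring 0.9 and 0.7
```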
In step 130, the minimum bounding box of the mask map is computed to obtain the text detection result.
For example, the minimum bounding rectangle of the mask map can be computed. In the image shown in Fig. 2, the minimum bounding rectangle 210 of the mask map corresponding to the "STOP" text region is the text detection result; it accurately frames the position of the text.
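A minimal sketch of computing a bounding box from a mask map, assuming a binary NumPy mask. For tilted text, the patent's minimum bounding rectangle would be a rotated rectangle (e.g. OpenCV's cv2.minAreaRect on the mask contour); this sketch keeps to an axis-aligned box in pure NumPy.

```python
import numpy as np

def bounding_box(mask):
    """Axis-aligned minimum bounding box (x, y, w, h) of the nonzero
    pixels of a mask map."""
    ys, xs = np.nonzero(mask)
    x0, y0 = xs.min(), ys.min()
    x1, y1 = xs.max(), ys.max()
    return (int(x0), int(y0), int(x1 - x0 + 1), int(y1 - y0 + 1))

mask = np.zeros((10, 10))
mask[3:7, 2:9] = 1                    # text pixels in the mask map
# bounding_box(mask) frames exactly the nonzero region
```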
As it can be seen that the disclosure extracts multiple character areas in a manner that word candidate region is extracted, word is calculated
Probability meets the mask figure of its character area of the word candidate region of probability threshold value requirement, and the shielding of mask figure is exactly literal field
Domain is in situ, and therefore, no matter whether word tilts, and obtaining text detection result according to the external border of minimum of mask figure can be accurate
Corresponding word position, therefore, the character detecting method that the embodiment of the present disclosure provides can tackle the word of various typesettings, effectively
The recall rate for improving word.
Fig. 3 is a flowchart of a text detection method according to another exemplary embodiment. As shown in Fig. 3, the text detection method may include the following steps.
In step 310, text candidate regions are extracted from the image to obtain multiple text candidate regions.
In step 320, feature extraction is performed on the image with a first convolutional neural network to obtain a feature map, where the first convolutional neural network has been trained for image feature extraction.
In step 330, each of the multiple text candidate regions is mapped into the feature map to obtain the feature region corresponding to each candidate region.
A feature map, that is, a feature matrix, describes the high-level semantic information of the image, such as what is in the image and where it is.
For example, a text candidate region r = (x, y, w, h) is mapped into the feature map F_c, and the feature region corresponding to r in F_c is obtained as r_c = (x_c, y_c, w_c, h_c) = (s_c·x, s_c·y, s_c·w, s_c·h), where s_c is the scaling factor from the input image size to the feature map size.
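The coordinate mapping above follows directly from the formula; the example scale of 1/16 (a typical backbone downsampling factor) is an assumption for illustration.

```python
def map_to_feature_map(region, scale):
    """Map a candidate region (x, y, w, h) in image coordinates into
    feature-map coordinates with the image-to-feature-map scale s_c."""
    x, y, w, h = region
    return (scale * x, scale * y, scale * w, scale * h)

# e.g. a backbone that downsamples 16x gives scale s_c = 1/16
rc = map_to_feature_map((64, 32, 128, 64), 1 / 16)
# rc is the feature region r_c = (s_c*x, s_c*y, s_c*w, s_c*h)
```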
In step 340, the feature regions corresponding to the multiple text candidate regions are mapped to feature vectors of fixed size.
For example, a max-pooling operation can be applied to the feature region r_c, mapping it to a feature vector f_c of fixed length.
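A single-channel NumPy sketch of this max-pooling step, assuming a 2x2 output grid and a region at least that large; real ROI-pooling implementations pool every channel of the feature region the same way.

```python
import numpy as np

def roi_max_pool(feature_region, out_h=2, out_w=2):
    """Max-pool a variable-size feature region into a fixed (out_h, out_w)
    grid, then flatten to a fixed-length vector f_c. Assumes the region is
    at least out_h x out_w."""
    h, w = feature_region.shape
    pooled = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            ys = slice(i * h // out_h, (i + 1) * h // out_h)
            xs = slice(j * w // out_w, (j + 1) * w // out_w)
            pooled[i, j] = feature_region[ys, xs].max()
    return pooled.ravel()

region = np.arange(24.0).reshape(4, 6)   # a 4x6 feature region
vec = roi_max_pool(region)
# vec has fixed length out_h * out_w regardless of the region's size
```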
In step 350, the fixed-size feature vectors corresponding to the multiple text candidate regions are fed into a second convolutional neural network, and the text probability and the text-region mask map of each candidate region are computed, where the second convolutional neural network has been trained to predict text probabilities and mask maps.
In step 360, the mask maps of the candidate regions whose probability meets the probability threshold are selected from the text-region mask maps of the multiple text candidate regions.
For example, the text probabilities of the multiple text candidate regions are filtered with a preset probability threshold and non-maximum suppression is applied; the mask maps of the candidate regions that remain are the mask maps of the selected candidate regions.
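The non-maximum suppression mentioned above can be sketched as a greedy pass over axis-aligned boxes; the 0.5 IoU threshold is an illustrative assumption, and the patent does not fix a particular NMS variant.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over (x, y, w, h) boxes; returns the
    indices of the candidates that survive, highest score first."""
    x0, y0 = boxes[:, 0], boxes[:, 1]
    x1, y1 = x0 + boxes[:, 2], y0 + boxes[:, 3]
    order = scores.argsort()[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        # intersection of the top box with all remaining boxes
        xx0 = np.maximum(x0[i], x0[order[1:]])
        yy0 = np.maximum(y0[i], y0[order[1:]])
        xx1 = np.minimum(x1[i], x1[order[1:]])
        yy1 = np.minimum(y1[i], y1[order[1:]])
        inter = np.maximum(0, xx1 - xx0) * np.maximum(0, yy1 - yy0)
        area_i = (x1[i] - x0[i]) * (y1[i] - y0[i])
        area_o = (x1[order[1:]] - x0[order[1:]]) * (y1[order[1:]] - y0[order[1:]])
        iou = inter / (area_i + area_o - inter)
        order = order[1:][iou <= iou_thresh]  # drop heavily overlapping boxes
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [50, 50, 10, 10]])
scores = np.array([0.9, 0.8, 0.7])
# the second box overlaps the first heavily and is suppressed
```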
In step 370, the minimum bounding rectangle of the mask map is computed to obtain the text detection result.
As it can be seen that since the present embodiment using convolutional neural networks calculates the mask figure of word probability and character area simultaneously,
Text detection is obtained according to the minimum enclosed rectangle of mask figure as a result, therefore, fast and accurately word institute can be detected with rectangle
In position.
Fig. 4 is a block diagram of a text detection device 400 according to an exemplary embodiment. As shown in Fig. 4, the text detection device 400 may include a region extraction module 410, a first computing module 420, and a second computing module 430.
The region extraction module 410 may be configured to extract text candidate regions from an image to obtain multiple text candidate regions.
The first computing module 420 may be configured to compute the text probabilities of the multiple text candidate regions obtained by the region extraction module 410, and to compute the text-region mask maps of the candidate regions whose text probability meets the probability threshold.
The second computing module 430 may be configured to compute the minimum bounding box of the mask maps computed by the first computing module 420 to obtain the text detection result.
As it can be seen that the disclosure region extraction module 410 by word candidate region extract in a manner of to multiple character areas
It extracts, the first computing module 420 calculates word probability and meets its character area of the word candidate region of probability threshold value requirement
Mask figure, and the shielding of mask figure is exactly that character area is in situ, and therefore, no matter whether word tilts, the second computing module 430
Word position, therefore, the disclosure can accurately be corresponded to by obtaining text detection result according to the external border of minimum of mask figure
The character detecting method that embodiment provides can tackle the word of various typesettings, effectively raise the recall rate of word.
In one possible implementation, the first computing module 420 may be configured to use a convolutional neural network to compute the text probabilities of the multiple text candidate regions and the text-region mask maps of the candidate regions whose text probability meets the probability threshold. In this implementation, the strengths of convolutional neural networks can be exploited to achieve fast and accurate text detection.
It should be noted that the disclosure does not limit the order in which text probabilities and mask maps are computed.
For example, as shown in Fig. 5, a block diagram of a text detection device 500 according to another exemplary embodiment, the first computing module 420 may include a probability computing submodule 421, a probability screening submodule 422, and a mask-map computing submodule 423.
The probability computing submodule 421 may be configured to compute the text probabilities of the multiple text candidate regions.
The probability screening submodule 422 may be configured to select, from the multiple text candidate regions, the candidate regions whose text probability meets the probability threshold.
The mask-map computing submodule 423 may be configured to compute the text-region mask maps for the selected text candidate regions.
Because this implementation computes the text-region mask maps only for the selected candidate regions, the amount of mask-map computation can be reduced.
As another example, as shown in Fig. 6, a block diagram of a text detection device 600 according to yet another exemplary embodiment, the first computing module 420 may include a probability-and-mask-map computing submodule 424 and a probability-and-mask-map screening submodule 425.
The probability-and-mask-map computing submodule 424 may be configured to compute the text probability and the text-region mask map of each of the multiple text candidate regions.
The probability-and-mask-map screening submodule 425 may be configured to select, from the text-region mask maps of the multiple candidate regions, the mask maps of the candidate regions whose probability meets the probability threshold.
This implementation can compute text probabilities and mask maps simultaneously and detects faster.
In the implementation that uses convolutional neural networks, the probability-and-mask-map computing submodule 424 may include a first convolution computing submodule, a feature-region mapping submodule, a vector mapping submodule, and a second convolution computing submodule.
The first convolution computing submodule may be configured to perform feature extraction on the image with the first convolutional neural network to obtain a feature map, where the first convolutional neural network has been trained for image feature extraction.
The feature-region mapping submodule may be configured to map each of the multiple text candidate regions into the feature map to obtain the corresponding feature regions.
The vector mapping submodule may be configured to map the feature regions corresponding to the multiple text candidate regions to feature vectors of fixed size.
The second convolution computing submodule may be configured to feed the fixed-size feature vectors corresponding to the multiple text candidate regions into the second convolutional neural network and compute the text probability and the text-region mask map of each candidate region, where the second convolutional neural network has been trained to predict text probabilities and mask maps.
As it can be seen that since the present embodiment using convolutional neural networks calculates the mask figure of word probability and character area simultaneously,
Therefore, word position can fast and accurately be detected.
Regarding the devices in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the related method and will not be elaborated here.
The disclosure also provides a computer-readable storage medium having computer program instructions stored thereon, which, when executed by a processor, implement the steps of the text detection method provided by the disclosure.
Fig. 7 is a block diagram of a text detection device 700 according to an exemplary embodiment. For example, the device 700 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to Fig. 7, the device 700 may include one or more of the following components: a processing component 702, a memory 704, a power component 706, a multimedia component 708, an audio component 710, an input/output (I/O) interface 712, a sensor component 714, and a communication component 716.
The processing component 702 typically controls the overall operation of the device 700, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 702 may include one or more processors 720 to execute instructions to complete all or part of the steps of the text detection method described above. In addition, the processing component 702 may include one or more modules to facilitate interaction between the processing component 702 and other components. For example, the processing component 702 may include a multimedia module to facilitate interaction between the multimedia component 708 and the processing component 702.
The memory 704 is configured to store various types of data to support operation of the device 700. Examples of such data include instructions for any application or method operated on the device 700, contact data, phonebook data, messages, pictures, video, and so on. The memory 704 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
The power component 706 provides power to the various components of the device 700. The power component 706 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 700.
The multimedia component 708 includes a screen that provides an output interface between the device 700 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 708 includes a front camera and/or a rear camera. When the device 700 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each of the front and rear cameras may be a fixed optical lens system or have focus and optical zoom capability.
The audio component 710 is configured to output and/or input audio signals. For example, the audio component 710 includes a microphone (MIC) configured to receive external audio signals when the device 700 is in an operation mode, such as a call mode, a recording mode, or a speech recognition mode. The received audio signal may be further stored in the memory 704 or transmitted via the communication component 716. In some embodiments, the audio component 710 further includes a speaker for outputting audio signals.
The I/O interface 712 provides an interface between the processing component 702 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 714 includes one or more sensors to provide status assessments of various aspects of the device 700. For example, the sensor component 714 may detect an open/closed state of the device 700 and the relative positioning of components, e.g., the display and keypad of the device 700; the sensor component 714 may also detect a change in position of the device 700 or a component of the device 700, the presence or absence of user contact with the device 700, the orientation or acceleration/deceleration of the device 700, and a change in temperature of the device 700. The sensor component 714 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 714 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 714 may also include an accelerometer, a gyroscope, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 716 is configured to facilitate wired or wireless communication between the device 700 and other devices. The device 700 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 716 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 716 further includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In exemplary embodiments, the device 700 may be implemented with one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above character detection method.
In exemplary embodiments, there is also provided a non-transitory computer-readable storage medium including instructions, such as the memory 704 including instructions, executable by the processor 720 of the device 700 to perform the above character detection method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Other embodiments of the disclosure will be readily apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This application is intended to cover any variations, uses, or adaptations of the disclosure following its general principles and including such departures from the present disclosure as come within common knowledge or customary technical means in the art. The specification and embodiments are to be considered exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
Claims (12)
1. A character detection method, comprising:
performing text candidate region extraction on an image to obtain a plurality of text candidate regions;
calculating text probabilities of the plurality of text candidate regions, and calculating a mask map of the text region of each text candidate region whose text probability meets a probability threshold requirement;
calculating a minimum bounding box of the mask map to obtain a text detection result.
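The final step of claim 1, taking the minimum bounding box of a mask map, can be sketched as follows. This is an illustrative sketch rather than the patented implementation: it computes an axis-aligned box from a binary mask with NumPy, and the function name and mask encoding are assumptions; a rotated minimum rectangle (e.g. OpenCV's `cv2.minAreaRect`) could be substituted for skewed text.

```python
import numpy as np

def min_bounding_box(mask):
    """Axis-aligned minimum bounding box of a binary text-region mask.

    Hypothetical sketch of the step that turns a mask map into a
    detection rectangle. Returns (x_min, y_min, x_max, y_max) over the
    nonzero pixels, or None if the mask is empty.
    """
    ys, xs = np.nonzero(mask)  # row and column indices of text pixels
    if ys.size == 0:
        return None
    return (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))

# Example: a mask whose text pixels occupy rows 2-4 and columns 3-7
mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:5, 3:8] = 1
print(min_bounding_box(mask))  # (3, 2, 7, 4)
```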
2. The character detection method according to claim 1, wherein calculating the text probabilities of the plurality of text candidate regions, and calculating the mask map of the text region of each text candidate region whose text probability meets the probability threshold requirement, comprises:
calculating, using a convolutional neural network, the text probabilities of the plurality of text candidate regions and the mask map of the text region of each text candidate region whose text probability meets the probability threshold requirement.
3. The character detection method according to claim 1 or 2, wherein calculating the mask map of the text region of each text candidate region whose text probability meets the probability threshold requirement comprises:
screening out, from the plurality of text candidate regions, the text candidate regions whose text probabilities meet the probability threshold requirement;
calculating the mask maps of the text regions of the screened-out text candidate regions.
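The ordering in claim 3, screening candidates by text probability before computing any masks, avoids mask computation for rejected candidates. A minimal sketch, with the function name, region representation, and threshold value all assumed for illustration:

```python
def filter_candidates(regions, probs, threshold=0.5):
    """Keep only candidate regions whose text probability meets the
    threshold; masks would then be computed only for the survivors.
    (Illustrative names; the threshold value is an assumption.)"""
    return [r for r, p in zip(regions, probs) if p >= threshold]

# Three hypothetical candidates with their text probabilities
kept = filter_candidates(["r1", "r2", "r3"], [0.9, 0.3, 0.7])
print(kept)  # ['r1', 'r3']
```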
4. The character detection method according to claim 1 or 2, wherein calculating the mask map of the text region of each text candidate region whose text probability meets the probability threshold requirement comprises:
calculating mask maps of the text regions of the plurality of text candidate regions;
screening out, from the mask maps of the text regions of the plurality of text candidate regions, the mask maps of the text candidate regions whose probabilities meet the probability threshold requirement.
5. The character detection method according to claim 4, wherein calculating the mask maps of the text regions of the plurality of text candidate regions comprises:
performing feature extraction on the image using a first convolutional neural network to obtain a feature map, wherein the first convolutional neural network is a convolutional neural network that has completed image feature extraction training;
mapping the plurality of text candidate regions respectively onto the feature map to obtain feature areas respectively corresponding to the plurality of text candidate regions;
mapping the feature areas corresponding to the plurality of text candidate regions into feature vectors of a fixed size;
inputting the fixed-size feature vectors corresponding to the plurality of text candidate regions into a second convolutional neural network to calculate the text probability and the mask map of the text region corresponding to each of the plurality of text candidate regions, wherein the second convolutional neural network is a convolutional neural network that has completed text probability and mask map training.
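The pipeline of claim 5 resembles ROI pooling in region-based detectors: candidate boxes are projected onto a shared feature map (scaled by the network stride), and each projected region is pooled to a fixed size before a second network predicts probability and mask. A hedged NumPy sketch of the projection-and-pooling step, with the stride, output size, function names, and a single-channel feature map all assumed; coordinates are assumed to fall inside the image:

```python
import numpy as np

def roi_to_fixed_feature(feature_map, box, out_size=7, stride=16):
    """Project one text candidate box onto a feature map and max-pool it
    to a fixed out_size x out_size grid (ROI-pooling style sketch).

    feature_map: (H, W) array from the first CNN, whose total stride
    relative to the input image is `stride` (assumed value).
    box: (x1, y1, x2, y2) candidate region in image coordinates.
    """
    x1, y1, x2, y2 = [int(round(c / stride)) for c in box]
    x2, y2 = max(x2, x1 + 1), max(y2, y1 + 1)  # keep at least one cell
    region = feature_map[y1:y2, x1:x2]
    h, w = region.shape
    pooled = np.zeros((out_size, out_size))
    for i in range(out_size):  # max-pool each of the out_size^2 bins
        for j in range(out_size):
            r0 = i * h // out_size
            r1 = max((i + 1) * h // out_size, r0 + 1)
            c0 = j * w // out_size
            c1 = max((j + 1) * w // out_size, c0 + 1)
            pooled[i, j] = region[r0:r1, c0:c1].max()
    return pooled

fmap = np.random.rand(32, 32)                         # e.g. 512x512 image at stride 16
vec = roi_to_fixed_feature(fmap, (40, 40, 200, 120))  # one candidate box
print(vec.shape)  # (7, 7)
```

In the patent's terms, the second convolutional neural network would then consume these fixed-size features to emit each candidate's text probability and mask map.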
6. A text detection device, comprising:
a region extraction module, configured to perform text candidate region extraction on an image to obtain a plurality of text candidate regions;
a first calculation module, configured to calculate text probabilities of the plurality of text candidate regions obtained by the region extraction module, and to calculate a mask map of the text region of each text candidate region whose text probability meets a probability threshold requirement;
a second calculation module, configured to calculate a minimum bounding box of the mask map calculated by the first calculation module to obtain a text detection result.
7. The text detection device according to claim 6, wherein the first calculation module is configured to calculate, using a convolutional neural network, the text probabilities of the plurality of text candidate regions and the mask map of the text region of each text candidate region whose text probability meets the probability threshold requirement.
8. The text detection device according to claim 6 or 7, wherein the first calculation module comprises:
a probability calculation submodule, configured to calculate the text probabilities of the plurality of text candidate regions;
a probability screening submodule, configured to screen out, from the plurality of text candidate regions, the text candidate regions whose text probabilities meet the probability threshold requirement;
a mask map calculation submodule, configured to calculate the mask maps of the text regions of the screened-out text candidate regions.
9. The text detection device according to claim 6 or 7, wherein the first calculation module comprises:
a probability and mask map calculation submodule, configured to calculate the text probabilities of the plurality of text candidate regions and the mask maps of their text regions;
a probability and mask map screening submodule, configured to screen out, from the mask maps of the text regions of the plurality of text candidate regions, the mask maps of the text candidate regions whose probabilities meet the probability threshold requirement.
10. The text detection device according to claim 9, wherein the probability and mask map calculation submodule comprises:
a first convolution calculation submodule, configured to perform feature extraction on the image using a first convolutional neural network to obtain a feature map, wherein the first convolutional neural network is a convolutional neural network that has completed image feature extraction training;
a feature area mapping submodule, configured to map the plurality of text candidate regions respectively onto the feature map to obtain the feature areas corresponding to the plurality of text candidate regions;
a vector mapping submodule, configured to map the feature areas corresponding to the plurality of text candidate regions into feature vectors of a fixed size;
a second convolution calculation submodule, configured to input the fixed-size feature vectors corresponding to the plurality of text candidate regions into a second convolutional neural network to calculate the text probability and the mask map of the text region corresponding to each of the plurality of text candidate regions, wherein the second convolutional neural network is a convolutional neural network that has completed text probability and mask map training.
11. A text detection device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
perform text candidate region extraction on an image to obtain a plurality of text candidate regions;
calculate text probabilities of the plurality of text candidate regions, and calculate a mask map of the text region of each text candidate region whose text probability meets a probability threshold requirement;
calculate a minimum bounding box of the mask map to obtain a text detection result.
12. A computer-readable storage medium having computer program instructions stored thereon, wherein the program instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711332870.9A CN108062547B (en) | 2017-12-13 | 2017-12-13 | Character detection method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108062547A true CN108062547A (en) | 2018-05-22 |
CN108062547B CN108062547B (en) | 2021-03-09 |
Family
ID=62138536
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711332870.9A Active CN108062547B (en) | 2017-12-13 | 2017-12-13 | Character detection method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108062547B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109840483A (en) * | 2019-01-11 | 2019-06-04 | 深圳大学 | A kind of method and device of landslide fissure detection and identification |
CN109858432A (en) * | 2019-01-28 | 2019-06-07 | 北京市商汤科技开发有限公司 | Method and device, the computer equipment of text information in a kind of detection image |
CN110348522A (en) * | 2019-07-12 | 2019-10-18 | 创新奇智(青岛)科技有限公司 | A kind of image detection recognition methods and system, electronic equipment, image classification network optimized approach and system |
CN110569708A (en) * | 2019-06-28 | 2019-12-13 | 北京市商汤科技开发有限公司 | Text detection method and device, electronic equipment and storage medium |
CN111259878A (en) * | 2018-11-30 | 2020-06-09 | 中移(杭州)信息技术有限公司 | Method and equipment for detecting text |
CN111680690A (en) * | 2020-04-26 | 2020-09-18 | 泰康保险集团股份有限公司 | Character recognition method and device |
CN112560857A (en) * | 2021-02-20 | 2021-03-26 | 鹏城实验室 | Character area boundary detection method, equipment, storage medium and device |
CN112733857A (en) * | 2021-01-08 | 2021-04-30 | 北京匠数科技有限公司 | Image character detection model training method and device for automatically segmenting character area |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106355573A (en) * | 2016-08-24 | 2017-01-25 | 北京小米移动软件有限公司 | Target object positioning method and device in pictures |
CN106384098A (en) * | 2016-09-23 | 2017-02-08 | 北京小米移动软件有限公司 | Image-based head posture detection method, device and terminal |
CN106599939A (en) * | 2016-12-30 | 2017-04-26 | 深圳市唯特视科技有限公司 | Real-time target detection method based on region convolutional neural network |
CN106780612A (en) * | 2016-12-29 | 2017-05-31 | 浙江大华技术股份有限公司 | Object detecting method and device in a kind of image |
CN106803071A (en) * | 2016-12-29 | 2017-06-06 | 浙江大华技术股份有限公司 | Object detecting method and device in a kind of image |
US20170220904A1 (en) * | 2015-04-02 | 2017-08-03 | Tencent Technology (Shenzhen) Company Limited | Training method and apparatus for convolutional neural network model |
US20170300786A1 (en) * | 2015-10-01 | 2017-10-19 | Intelli-Vision | Methods and systems for accurately recognizing vehicle license plates |
CN107273870A (en) * | 2017-07-07 | 2017-10-20 | 郑州航空工业管理学院 | The pedestrian position detection method of integrating context information under a kind of monitoring scene |
CN107346420A (en) * | 2017-06-19 | 2017-11-14 | 中国科学院信息工程研究所 | Text detection localization method under a kind of natural scene based on deep learning |
Non-Patent Citations (1)
Title |
---|
CHEN_H: "A Brief History of CNNs in Image Segmentation: From R-CNN to Mask R-CNN", WeChat Official Account: CODERPAI *
Also Published As
Publication number | Publication date |
---|---|
CN108062547B (en) | 2021-03-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||