CN109902768A - The processing of the output result of optical character recognition technology - Google Patents

Processing of the output result of optical character recognition technology

Info

Publication number
CN109902768A
Authority
CN
China
Prior art keywords
output result
row
confidence level
text
threshold value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910346265.XA
Other languages
Chinese (zh)
Other versions
CN109902768B (en
Inventor
胡东鑫
蔡海蛟
冯歆鹏
周骥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhao Ming Electronic Technology Co Ltd
Original Assignee
Shanghai Zhao Ming Electronic Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zhao Ming Electronic Technology Co Ltd
Priority to CN201910346265.XA
Publication of CN109902768A
Application granted
Publication of CN109902768B
Legal status: Active
Anticipated expiration

Landscapes

  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

The disclosure provides a method for processing the output result of optical character recognition, together with a chip circuit, a reading aid device, and a storage medium. The method comprises: calculating an average confidence of the output result from the word confidence of each recognized character; comparing the average confidence with a first threshold; and discarding the output result if the average confidence is less than or equal to the first threshold. The output result comprises a plurality of recognized characters in at least one row, and each recognized character has a word confidence indicating the credibility of the corresponding recognition result.

Description

The processing of the output result of optical character recognition technology
Technical field
The present disclosure relates to image processing, and in particular to processing the output result of optical character recognition technology.
Background
Optical character recognition (OCR) technology on the market has become increasingly mature, with considerable improvements in both recognition speed and accuracy. This has given rise to many text recognition applications, such as text readers for people with amblyopia and document scanning tools on smart mobile devices.
The accuracy of text recognition depends largely on the clarity of the image: only when a certain clarity is reached can the accuracy of the recognition result really be guaranteed. On smart mobile devices, however, images are usually captured with the device's camera, and the shooting angle and position both affect image clarity, which lowers the accuracy of text recognition.
In addition, some smart hardware devices omit power-hungry components such as a display screen in order to reduce power consumption. Without a display, the user cannot see the result of the current shot and therefore cannot tell whether the captured image is clear, which in turn affects the accuracy of text recognition on the image. Returning such a recognition result directly to the user may present wrong text information, which is very unfriendly to the user.
Summary of the invention
One object of the present disclosure is to provide a method for processing the output result of optical character recognition technology, as well as a chip circuit, a reading aid device, and a storage medium.
According to one aspect of the disclosure, a method for processing the output result of optical character recognition technology is provided, comprising: calculating the average confidence of the output result from the word confidence of each recognized character; and comparing the average confidence with a first threshold and discarding the output result if the average confidence is less than or equal to the first threshold. The output result comprises a plurality of recognized characters in at least one row, and each recognized character has a word confidence indicating the credibility of the corresponding recognition result.
According to another aspect of the disclosure, a chip circuit for processing the output result of optical character recognition technology is provided, comprising circuit units configured to execute the steps of the above method.
According to another aspect of the disclosure, a reading aid device is provided, comprising: a sensor configured to obtain an image containing text content; the aforementioned chip circuit, which further includes a circuit unit configured to perform text recognition on the image to obtain an output result containing text and a circuit unit configured to convert the processed output result into audio information; and an audio output device configured to output the audio information.
According to another aspect of the disclosure, a reading aid device is also provided, comprising: a memory for storing an image containing text content; the aforementioned chip circuit, which further includes a circuit unit configured to perform text recognition on the image to obtain an output result containing text and a circuit unit configured to convert the processed output result into audio information; and an audio output device configured to output the audio information.
According to another aspect of the disclosure, a computer-readable storage medium is provided that stores a program including instructions which, when executed by a processor of an electronic device, cause the electronic device to execute the steps of the above method.
Further features and advantages of the disclosure will become apparent from the exemplary embodiments described below with reference to the accompanying drawings.
Brief description of the drawings
The drawings schematically illustrate embodiments and form part of the specification; together with the written description, they serve to explain exemplary implementations of the embodiments. The embodiments shown are for illustration only and do not limit the scope of the claims. Throughout the drawings, identical reference numerals refer to similar but not necessarily identical elements.
Fig. 1 shows a flow chart of processing an OCR output result according to the first embodiment;
Fig. 2 shows a flow chart of processing an OCR output result according to the second embodiment;
Fig. 3 shows a more detailed flow chart of the step in Fig. 2 that processes the OCR output result row by row;
Fig. 4 shows a flow chart of processing an OCR output result according to the third embodiment;
Fig. 5 shows a more detailed flow chart of the step in Fig. 4 that processes the OCR output result row by row;
Fig. 6 shows a structural block diagram of a reading aid device according to an exemplary embodiment of the disclosure;
Fig. 7 shows a structural block diagram of an exemplary computing device that can be applied to the exemplary embodiments.
Detailed description
In the present disclosure, unless otherwise stated, the use of the terms "first", "second", and so on to describe various elements is not intended to limit the positional, temporal, or importance relationship of those elements; such terms are used only to distinguish one element from another. In some examples, the first element and the second element may refer to the same instance of the element, while in other cases, based on the context, they may refer to different instances.
Fig. 1 shows a flow chart of processing an OCR output result according to the first embodiment.
In step S101, an image containing text content is obtained. The image may be captured, for example, by a camera or a device with a photographing function (such as a mobile phone, tablet computer, or wearable device), or it may be a previously stored image. The photographed text may appear on different surfaces, such as books, newspapers, screens, menus, signs, and product labels.
In step S102, the text in the image is recognized by an existing OCR technique, and an output result containing the recognized text is generated. The output result is usually editable, in formats including but not limited to Word, Excel, and TXT documents.
Those skilled in the art will understand that existing OCR techniques can recognize and extract text from PDFs or images (image formats include, for example, JPG, BMP, TIFF, and GIF). Some OCR techniques (such as closing OCR, Adobe) provide, in addition to the recognition result, a text confidence for each character in the result. Here, text confidence can be understood as the degree of confidence that the OCR engine assigns to its own recognition result. Confidence generally ranges from 0 to 1.0: the closer the value is to 0, the less credible the corresponding recognition result, and the closer it is to 1, the more credible.
According to the first embodiment, in step S102 each recognized character is assigned a confidence value, namely its word confidence. Word confidence generally ranges from 0 to 1.0: the closer the value is to 0, the less credible the corresponding recognition result, and the closer it is to 1, the more credible.
It should be noted that the processing method of the present disclosure applies not only to Chinese characters recognized by OCR but also to words formed from letters, or to words in any language, provided that the OCR technology provider can recognize the corresponding word and assign a word confidence to each recognized word. In addition, when a confidence is compared with a threshold, the comparison operators are not limited to those mentioned in the embodiments. For example, instead of "greater than or equal to" and "less than or equal to", using "greater than" and "less than" respectively with suitably chosen thresholds is also feasible and does not depart from the scope of the present disclosure.
In step S103, the average confidence of the entire output result is calculated from the word confidence of each recognized character. In general, the output result may include a plurality of characters arranged in at least one row. The word confidences of all characters in the output result are summed and divided by the total number of recognized characters to obtain the average confidence of the entire output result.
Alternatively, the word confidences of the characters in each row of the output result may first be averaged to obtain the row confidence of that row, and the row confidences may then be averaged to obtain the average confidence of the entire output result. Like word confidence, row confidence and average confidence range from 0 to 1.0: the closer the value is to 0, the less credible the corresponding recognition result, and the closer it is to 1, the more credible.
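The two ways of computing the average confidence in step S103 can be sketched as follows. This is only an illustrative sketch: the representation of an output result as a list of rows of (character, word confidence) pairs and all function names are assumptions, not part of the disclosure.

```python
from statistics import mean

# Hypothetical representation (an assumption for illustration):
# an output result is a list of rows, each row a list of
# (character, word_confidence) pairs.

def average_confidence_per_character(output):
    """Sum all word confidences and divide by the number of characters."""
    confidences = [conf for row in output for _, conf in row]
    return sum(confidences) / len(confidences)

def row_confidence(row):
    """Row confidence: mean word confidence of the characters in one row."""
    return mean(conf for _, conf in row)

def average_confidence_per_row(output):
    """Alternative: average the row confidences of all rows."""
    return mean(row_confidence(row) for row in output)

sample = [
    [("O", 0.9), ("C", 0.9), ("R", 0.9), ("!", 0.9)],  # long, clear row
    [("?", 0.4)],                                       # short, blurred row
]

print(average_confidence_per_character(sample))  # about 0.80
print(average_confidence_per_row(sample))        # about 0.65
```

When rows differ in length, the two variants generally give different values, because averaging row confidences weights a short row as heavily as a long one. Both sketches assume a non-empty output result with non-empty rows.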
In step S104, the calculated average confidence is compared with a first threshold, and whether to discard the output result is decided according to the comparison. The first threshold may, for example, be set to 0.5. If the average confidence is less than or equal to 0.5, the output result is judged to be blurred and the process proceeds to step S105, where the output result is discarded; for this reason the threshold is also referred to as the fuzziness. Conversely, if the average confidence is greater than 0.5, the process proceeds to step S106 and the output result is retained. The first threshold (fuzziness) may also be configured to other values, such as 0.65.
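Steps S104 to S106 amount to a single comparison. The sketch below illustrates it under stated assumptions: the output result is modeled as a list of rows of (character, word confidence) pairs, the function name is hypothetical, and returning None stands in for "discard".

```python
FIRST_THRESHOLD = 0.5  # exemplary value from the text; configurable

def keep_or_discard(output, first_threshold=FIRST_THRESHOLD):
    """Return the output result if it is clear enough, otherwise None.

    `output` is assumed to be a list of rows, each row a list of
    (character, word_confidence) pairs (hypothetical representation).
    """
    confidences = [conf for row in output for _, conf in row]
    if not confidences:
        # Assumption: an empty result is treated like a discarded one.
        return None
    average = sum(confidences) / len(confidences)
    if average <= first_threshold:
        return None   # S105: judged blurred, discard
    return output     # S106: retain

blurred = [[("a", 0.3), ("b", 0.5)]]
clear = [[("a", 0.9), ("b", 0.8)]]
print(keep_or_discard(blurred))  # None
print(keep_or_discard(clear) is clear)  # True
```

Raising `first_threshold` (for example to 0.65, as mentioned in the text) makes the check stricter, so more output results are discarded.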
An output result is usually judged to be blurred because the captured image is unclear, owing to shaking, lighting, position, or similar factors while the user photographs the text. When an output result is discarded, the user can choose to re-shoot a clearer image of the same text and run OCR on it again.
This first embodiment uses the word confidence of each character to calculate the average confidence of the entire output result and compares it with a preset fuzziness. The clarity of the output result can thus be judged as a whole, quickly and simply, which provides a basis for deciding whether to retain it. For example, when the output result is used for voice broadcast, broadcasting a defective output result may impair the user's understanding of the text content because of too many misrecognitions, which is unfriendly to the user and also wastes computing resources. According to this first embodiment, if the output result is determined to be blurred (that is, the average confidence is less than or equal to the first threshold), it is discarded directly. This avoids broadcasting such an output result, which would confuse the user and lead to a poor user experience.
Fig. 2 shows a flow chart of processing an OCR output result according to the second embodiment. The second embodiment uses a second threshold to further judge, row by row, an OCR output result whose average confidence is greater than the first threshold, in order to determine whether each row is to be used.
In the second embodiment, the steps for calculating the average confidence of the output result are the same as steps S101 to S103 of the first embodiment, so they are neither shown in the figure nor described again.
In step S204, the calculated average confidence is compared with the first threshold, and whether to discard the output result is decided according to the comparison. The first threshold may, for example, be set to 0.5. If the average confidence is less than or equal to 0.5, the output result is judged to be blurred and the process proceeds to step S205, where the output result is discarded. Conversely, if the average confidence is greater than 0.5, the process proceeds to step S206.
In step S206, the row confidence of each row in the OCR output result is compared with the second threshold, and whether to discard the recognized text of that row is decided according to the comparison, where the row confidence is the average of the word confidences of all characters in the row. The specific value of the second threshold is configurable; for ease of description it is set here, as an example, to 0.60.
Fig. 3 describes in detail the operation of step S206 in Fig. 2, which processes the OCR output result row by row.
In step S300, if the row confidence of a row is determined to be less than or equal to the second threshold, the process proceeds to step S301 and the text of that row is deleted from the OCR output result; conversely, if the row confidence is greater than the second threshold, the process proceeds to step S302 and the text of that row is retained. Steps S303 and S304 then determine whether every row of the output result has been traversed. If not, the operation returns to S300 and the judgment is applied to the next row, until all rows have been processed.
As described for the first embodiment, the first threshold is used to judge whether the OCR output result as a whole is clear. However, an OCR output result judged clear as a whole may still contain parts with recognition errors. The second embodiment sets a second threshold to further judge, row by row, an OCR output result whose average confidence is greater than the first threshold, in order to decide whether to retain the output of each row. The second embodiment therefore further improves the accuracy of the processed OCR output result.
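The row-by-row pass of step S206 (steps S300 to S304 in Fig. 3) can be sketched as a simple filter. The data representation (a list of rows of (character, word confidence) pairs) and the function name are assumptions made for illustration, and the sketch assumes the output result has already passed the first-threshold check.

```python
SECOND_THRESHOLD = 0.60  # exemplary value from the text; configurable

def filter_rows(output, second_threshold=SECOND_THRESHOLD):
    """Keep only rows whose row confidence exceeds the second threshold.

    `output` is assumed to be a list of non-empty rows, each row a list
    of (character, word_confidence) pairs (hypothetical representation).
    """
    kept = []
    for row in output:                                # S303/S304: visit every row
        row_conf = sum(conf for _, conf in row) / len(row)
        if row_conf > second_threshold:               # S302: retain the row
            kept.append(row)
        # otherwise S301: the row is deleted (simply not copied)
    return kept

result = filter_rows([
    [("A", 0.9), ("B", 0.8)],   # row confidence about 0.85: retained
    [("C", 0.6)],               # row confidence 0.6: deleted (<= threshold)
    [("D", 0.2), ("E", 0.5)],   # row confidence about 0.35: deleted
])
print(result)  # [[('A', 0.9), ('B', 0.8)]]
```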
Fig. 4 shows a flow chart of processing an OCR output result according to the third embodiment. The third embodiment uses two thresholds (a third threshold and a fourth threshold) to further judge, row by row, an OCR output result whose average confidence is greater than the first threshold, in order to determine whether each row is to be used.
In the third embodiment, the steps for calculating the average confidence of the output result are the same as steps S101 to S103 of the first embodiment, so they are neither shown in the figure nor described again.
In step S404, the calculated average confidence is compared with the first threshold, and whether to discard the output result is decided according to the comparison. The first threshold may, for example, be set to 0.5. If the average confidence is less than or equal to 0.5, the output result is judged to be blurred and the process proceeds to step S405, where the output result is discarded. Conversely, if the average confidence is greater than 0.5, the process proceeds to step S406.
In step S406, the row confidence of each row in the OCR output result is compared with the third threshold and the fourth threshold, and how to process the recognized text of that row is decided according to the comparison, where the row confidence is the average of the word confidences of all characters in the row. The specific values of the third and fourth thresholds are configurable; for ease of description they are set here, as examples, to 0.60 and 0.80 respectively.
Fig. 5 describes in detail the operation of step S406 in Fig. 4, which processes the OCR output result row by row.
In step S500, if the row confidence of a row is determined to be less than or equal to 0.6 (the third threshold), the process proceeds to step S501 and the text of that row is deleted from the OCR output result; otherwise, the process proceeds to step S502, which determines whether the row confidence is greater than or equal to 0.8 (the fourth threshold). If so, the process proceeds to step S503 and the text of that row is retained; otherwise, that is, if the row confidence is greater than 0.6 and less than 0.8, the process proceeds to step S504, where the word confidence of each character in the row is compared with a fifth threshold and whether to delete the corresponding character is decided according to the comparison.
Specifically, in step S504, if the word confidence of a character is less than or equal to the fifth threshold, the recognized character is deleted. Conversely, if the word confidence is greater than the fifth threshold, the character is retained. The specific value of the fifth threshold is configurable; for ease of description it is set here, as an example, to 0.78.
Steps S505 and S506 then determine whether every row has been traversed. If not, the operation returns to S500 and the judgment is applied to the next row, until all rows have been processed.
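The three-way decision of steps S500 to S506 can be sketched as below. The data representation (a list of rows of (character, word confidence) pairs) and the function name are hypothetical. The text does not say what happens to a row whose characters are all deleted in step S504; dropping such a row is an assumption of this sketch.

```python
THIRD_THRESHOLD = 0.60   # exemplary values from the text; all configurable
FOURTH_THRESHOLD = 0.80
FIFTH_THRESHOLD = 0.78

def process_rows(output, third=THIRD_THRESHOLD, fourth=FOURTH_THRESHOLD,
                 fifth=FIFTH_THRESHOLD):
    """Delete, keep, or filter each row according to its row confidence.

    `output` is assumed to be a list of non-empty rows, each row a list
    of (character, word_confidence) pairs (hypothetical representation).
    """
    processed = []
    for row in output:                                   # S505/S506: every row
        row_conf = sum(conf for _, conf in row) / len(row)
        if row_conf <= third:                            # S501: delete the row
            continue
        if row_conf >= fourth:                           # S503: retain the row
            processed.append(row)
            continue
        # S504: row confidence lies between the two thresholds, so keep
        # only characters whose word confidence exceeds the fifth threshold.
        kept = [(ch, conf) for ch, conf in row if conf > fifth]
        if kept:  # assumption: a row left empty is dropped
            processed.append(kept)
    return processed

result = process_rows([
    [("a", 0.5), ("b", 0.6)],               # row confidence about 0.55: deleted
    [("c", 0.9), ("d", 0.85)],              # row confidence about 0.875: retained
    [("e", 0.9), ("f", 0.9), (",", 0.3)],   # about 0.70: blurred comma removed
])
print(result)
# [[('c', 0.9), ('d', 0.85)], [('e', 0.9), ('f', 0.9)]]
```

In the third sample row, a single low-confidence punctuation mark pulls the row confidence down to about 0.70; the per-character pass removes only that mark instead of the whole row.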
Compared with the second embodiment, which uses a single second threshold, setting two thresholds on the row confidence can further improve the accuracy of the OCR output result. Specifically, when only one second threshold is set, a high value may cause too many rows to be judged blurred and discarded, which harms the completeness and comprehensibility of the whole output result; conversely, a low value may allow defective rows that impair the accuracy of voice broadcast to be used directly. The third embodiment effectively solves this problem by setting two different thresholds.
In addition, by setting the fifth threshold, the third embodiment can delete misrecognized characters in a targeted way. Such errors are caused, for example, by blurred punctuation marks. A single blurred punctuation mark can lower the row confidence of the entire row and thus cause the output of the whole row to be discarded. With the fifth threshold, the blurred punctuation mark can be deleted effectively and the remaining text output accurately, without affecting semantic understanding or reading. Such errors may also be caused, for example, by stains or by text located at an edge (font distortion). Locating and deleting these errors effectively reduces the disturbance that misrecognized results cause the user during voice broadcast.
In each embodiment of the disclosure, the processed OCR output result is provided to a corresponding application program for further processing, such as voice broadcast or word processing.
One aspect of the disclosure may include a reading aid device. Fig. 6 is a structural block diagram of a reading aid device according to an exemplary embodiment of the disclosure.
As shown in Fig. 6, the reading aid device 600 may include: a sensor 601 (which may be implemented, for example, as a webcam, a camera, etc.), which may be configured to obtain the aforementioned image (the image may be, for example, a still image or a video image, and may contain text); and a chip circuit 603, which may include circuit units configured to execute the steps of any of the foregoing methods. The chip circuit may also include a circuit unit configured to perform text recognition on the image to obtain an output result containing text, and a circuit unit configured to convert the processed output result into audio information. The circuit unit that performs text recognition may use, for example, any text recognition (such as optical character recognition, OCR) software or circuit, and the circuit unit that converts the processed output result into audio information may use, for example, any text-to-speech software or circuit. These circuit units may be implemented, for example, by an ASIC chip or an FPGA chip. The reading aid device 600 may also include an audio output device 605 (such as a loudspeaker, an earphone, or a vibrator), configured to output the audio information (such as voice data).
In addition, as an alternative or supplement to the sensor 601, the reading aid device 600 may also include a memory for storing an image containing text content.
The reading aid device may be implemented as a wearable device, for example as a device worn in the form of glasses, a head-mounted device (such as a helmet or cap), a device worn on the ear, an accessory attachable to glasses (for example to the frame or temples), or an accessory attachable to a cap.
With the reading aid device, a visually impaired user can "read" conventional reading matter (such as books and magazines) in a posture similar to that of a reader with normal vision. During such "reading", the reading aid device automatically processes the captured images containing text according to the methods of the foregoing embodiments and outputs the result through an output device such as a loudspeaker, earphone, or vibrator for the user to listen to.
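The flow through the reading aid device (capture, recognition, confidence-based processing, text-to-speech, audio output) can be sketched as a small pipeline. Everything here is hypothetical: `recognize`, `synthesize`, and `play` are placeholder callables standing in for whatever OCR, text-to-speech, and audio components a device actually uses, and only the first-threshold check is shown.

```python
def read_aloud(image, recognize, synthesize, play, first_threshold=0.5):
    """Run one capture through recognition, filtering, and audio output.

    recognize(image) is assumed to return rows of (character,
    word_confidence) pairs; synthesize(text) returns audio data;
    play(audio) outputs it. All three are hypothetical placeholders.
    Returns True if something was broadcast, False if it was discarded.
    """
    output = recognize(image)
    confidences = [conf for row in output for _, conf in row]
    if not confidences:
        return False  # assumption: nothing recognized, nothing to play
    if sum(confidences) / len(confidences) <= first_threshold:
        return False  # blurred result: discard instead of broadcasting
    text = "\n".join("".join(ch for ch, _ in row) for row in output)
    play(synthesize(text))
    return True

# Usage with stand-in components (no real OCR or speech engine involved).
played = []
ok = read_aloud(
    image=None,
    recognize=lambda img: [[("h", 0.9), ("i", 0.9)]],
    synthesize=lambda text: text.upper(),  # pretend this is audio data
    play=played.append,
)
print(ok, played)  # True ['HI']
```

Passing the components in as callables keeps the sketch independent of any particular OCR or speech library; a False return value could be used to prompt the user to re-shoot the image.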
One aspect of the disclosure may include a computer-readable storage medium storing a program, the program including instructions that, when executed by a processor of an electronic device, cause the electronic device to execute any of the foregoing methods.
Fig. 7 shows a computing device 2000 for implementing the methods or processes of the disclosure; it is an example of a hardware device that can be applied to aspects of the disclosure. The computing device 2000 may be any machine configured to perform processing and/or calculation, and may be, but is not limited to, a wearable device, a tablet computer, a smartphone, or any combination thereof. The reading aid device according to the disclosure may be implemented wholly or at least partially by the computing device 2000 or a similar device or system.
The computing device 2000 may include elements connected to or communicating with a bus 2002 (possibly via one or more interfaces). For example, the computing device 2000 may include the bus 2002, one or more processors 2004, one or more input devices 2006, and one or more output devices 2008. The one or more processors 2004 may be any type of processor and may include, but are not limited to, one or more general-purpose processors and/or one or more special-purpose processors (such as dedicated processing chips). The input device 2006 may be any type of device that can input information to the computing device 2000, including but not limited to a camera. The output device 2008 may be any type of device that can present information, including but not limited to an audio output device such as an earphone, a loudspeaker, or a bone conduction vibrator, or a display. The computing device 2000 may also include, or be connected to, a non-transitory storage device 2010. The non-transitory storage device may be any storage device that is non-transitory and can store data, and may include but is not limited to a disk drive, an optical storage device, solid-state memory, a floppy disk, a flexible disk, a hard disk, magnetic tape or any other magnetic medium, an optical disc or any other optical medium, ROM (read-only memory), RAM (random access memory), cache memory and/or any other memory chip or cartridge, and/or any other medium from which a computer can read data, instructions, and/or code. The non-transitory storage device 2010 may be detachable from an interface. The non-transitory storage device 2010 may hold data, programs (including instructions), and/or code for implementing the above methods and steps. The computing device 2000 may also include a communication device 2012. The communication device 2012 may be any type of device or system that enables communication with external devices and/or with a network, and may include but is not limited to a wireless communication device and/or chipset, such as a Bluetooth device, an 802.11 device, a WiFi device, a WiMax device, a cellular communication device, and/or the like.
The computing device 2000 may also include a working memory 2014, which may be any type of working memory that can store programs (including instructions) and/or data useful for the work of the processor 2004, and may include but is not limited to random access memory and/or read-only memory devices.
Software elements (programs) may be located in the working memory 2014, including but not limited to an operating system 2016, one or more applications 2018, drivers, and/or other data and code. Instructions for executing the above methods and steps may be included in the one or more applications 2018.
When the computing device 2000 shown in Fig. 7 is applied to embodiments of the disclosure, the working memory 2014 may store program code for executing the flow charts shown in Figs. 1 to 5 and/or the image containing text content that is to be recognized. The applications 2018 may include an optical character recognition application provided by a third party (such as Adobe), a voice conversion application, an editable word processing application, and so on. The input device 2006 may be a sensor for obtaining the image containing text content. The stored or acquired image containing text content may be processed by the OCR application into an output result containing text. The output device 2008 is, for example, a loudspeaker or earphone used for voice broadcast. The processor 2004 executes the method steps according to aspects of the disclosure in accordance with the program code in the working memory 2014.
It should also be understood that the components of the computing device 2000 may be distributed over a network. For example, some processing may be executed by one processor while other processing is executed by another processor remote from it. Other components of the computing device 2000 may be distributed similarly. The computing device 2000 can thus be understood as a distributed computing system that performs processing at multiple locations.
Although embodiments or examples of the disclosure have been described with reference to the accompanying drawings, it should be understood that the above methods, systems, and devices are merely exemplary embodiments or examples, and that the scope of the invention is not limited by these embodiments or examples but only by the granted claims and their equivalents. Various elements in the embodiments or examples may be omitted or replaced by equivalent elements. Furthermore, the steps may be executed in an order different from that described in the disclosure. Further, the various elements in the embodiments or examples may be combined in various ways. Importantly, as technology evolves, many elements described here may be replaced by equivalent elements that appear after the disclosure.
In a first aspect, a kind of method for handling the output result of optical character recognition technology, wherein the output is tied Fruit includes multiple texts of at least a line identified, and each identified text has word confidence level for indicating corresponding The credibility of recognition result, which comprises according to the word confidence level of each text identified, calculate the output As a result average confidence;The average confidence is compared with first threshold, if the average confidence be less than or Equal to the first threshold, then the output result is abandoned.
Second aspect, method as described in relation to the first aspect, wherein the average confidence is each row in the output result Row confidence level average value, and wherein, the row confidence level of each row, which passes through, is averaging the word confidence level of each text in the row Value obtains.
The third aspect, according to the method for first or second aspect, wherein if the average confidence is greater than described first The row confidence level of each row in the output result is compared by threshold value at least another threshold value respectively, and according to comparing As a result corresponding row is handled.
In a fourth aspect, the method according to the third aspect, wherein the step of comparing the row confidence of each row in the output result with at least one further threshold and processing the corresponding row according to the comparison result comprises: comparing the row confidence of each row in the output result with a second threshold; deleting the output result of any row whose row confidence is less than or equal to the second threshold; and retaining the output result of any row whose row confidence is greater than the second threshold.
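The row-level filtering of the fourth aspect might be sketched in Python as below. The row representation and the name `filter_rows` are assumptions made for illustration; the threshold value is supplied by the caller, and non-empty rows are assumed.

```python
from statistics import fmean


def row_confidence(row):
    """Mean word confidence of the characters in one row."""
    return fmean([conf for _, conf in row])


def filter_rows(rows, second_threshold):
    """Keep rows whose row confidence is strictly greater than the threshold.

    Rows with a row confidence less than or equal to the second threshold
    are deleted from the output result.
    """
    return [row for row in rows if row_confidence(row) > second_threshold]
```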
In a fifth aspect, the method according to the third aspect, wherein the at least one further threshold includes a third threshold and a fourth threshold, the fourth threshold being greater than the third threshold, and the step of processing the corresponding row comprises: deleting the output result of any row whose row confidence is less than or equal to the third threshold; retaining the output result of any row whose row confidence is greater than or equal to the fourth threshold; and, if the row confidence of a row is greater than the third threshold and less than the fourth threshold, comparing the word confidence of each character in that row with a fifth threshold and determining, according to the comparison result, whether to delete the output result of the corresponding character.
In a sixth aspect, the method according to the fifth aspect, wherein if the word confidence of a character is less than the fifth threshold, the output result of that character is deleted; and if the word confidence of a character is greater than or equal to the fifth threshold, the output result of that character is retained.
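The three-way decision of the fifth and sixth aspects can be sketched in Python as follows. The data layout, the name `process_rows`, and the choice to drop a row that has no characters left after per-character filtering are illustrative assumptions rather than details given in the text.

```python
from statistics import fmean


def row_confidence(row):
    """Mean word confidence of the characters in one row."""
    return fmean([conf for _, conf in row])


def process_rows(rows, third_threshold, fourth_threshold, fifth_threshold):
    """Delete, keep, or filter each row character by character."""
    if fourth_threshold <= third_threshold:
        raise ValueError("fourth threshold must exceed third threshold")
    kept = []
    for row in rows:
        confidence = row_confidence(row)
        if confidence <= third_threshold:
            continue  # unreliable row: delete it entirely
        if confidence >= fourth_threshold:
            kept.append(list(row))  # reliable row: retain it unchanged
            continue
        # Intermediate row: retain only characters at or above the fifth
        # threshold; characters below it are deleted.
        filtered = [(ch, conf) for ch, conf in row if conf >= fifth_threshold]
        if filtered:  # assumption: a row left empty is dropped
            kept.append(filtered)
    return kept
```

Only rows in the intermediate band incur per-character comparisons; clearly good and clearly bad rows are settled by a single row-level comparison.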
In a seventh aspect, the method according to any one of the first to sixth aspects, wherein the output result is used for voice broadcast.
In an eighth aspect, a chip circuit for processing an output result of optical character recognition, comprising: a circuit unit configured to perform the steps of the method according to any one of the first to seventh aspects.
In a ninth aspect, a reading aid device, comprising: a sensor configured to acquire an image containing text content; the chip circuit according to the eighth aspect, the chip circuit further comprising a circuit unit configured to perform character recognition on the image to obtain an output result containing characters, and a circuit unit configured to convert the processed output result into audio information; and an audio output device configured to output the audio information.
In a tenth aspect, a reading aid device, comprising: a memory for storing an image containing text content; the chip circuit according to the eighth aspect, the chip circuit further comprising a circuit unit configured to perform character recognition on the image to obtain an output result containing characters, and a circuit unit configured to convert the processed output result into audio information; and an audio output device configured to output the audio information.
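The data flow through the reading aid device of the ninth and tenth aspects can be pictured with a small Python sketch. The four stages are passed in as callables because the text does not specify any concrete recognition, post-processing, speech synthesis, or playback interface; all names here (`read_aloud`, `recognize`, `postprocess`, `synthesize`, `play`) are hypothetical, and joining rows with newlines is likewise an assumption.

```python
def read_aloud(image, recognize, postprocess, synthesize, play):
    """Run image -> characters -> filtered characters -> audio -> output.

    recognize(image) is assumed to return rows of (character, confidence)
    pairs; postprocess(rows) applies the confidence-based filtering and
    returns the rows to keep; synthesize(text) returns audio information;
    play(audio) sends it to the audio output device.
    """
    rows = recognize(image)
    kept = postprocess(rows)
    if not kept:
        return False  # output result discarded: nothing is broadcast
    text = "\n".join("".join(ch for ch, _ in row) for row in kept)
    play(synthesize(text))
    return True
```

The image may come from a sensor or from memory; the rest of the pipeline is the same in both device variants.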
In an eleventh aspect, a computer-readable storage medium storing a program, the program comprising instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the method according to any one of the first to seventh aspects.

Claims (10)

1. A method for processing an output result of optical character recognition, wherein the output result includes a plurality of recognized characters in at least one row, and each recognized character has a word confidence indicating the credibility of the corresponding recognition result, the method comprising:
calculating an average confidence of the output result from the word confidence of each recognized character; and
comparing the average confidence with a first threshold, and discarding the output result if the average confidence is less than or equal to the first threshold.
2. The method according to claim 1, wherein if the average confidence is greater than the first threshold, the row confidence of each row in the output result is compared with at least one further threshold, and the corresponding row is processed according to the comparison result.
3. The method according to claim 2, wherein the step of comparing the row confidence of each row in the output result with at least one further threshold and processing the corresponding row according to the comparison result comprises:
comparing the row confidence of each row in the output result with a second threshold;
deleting the output result of any row whose row confidence is less than or equal to the second threshold; and
retaining the output result of any row whose row confidence is greater than the second threshold.
4. The method according to claim 2, wherein the at least one further threshold includes a third threshold and a fourth threshold, the fourth threshold being greater than the third threshold, and the step of processing the corresponding row comprises:
deleting the output result of any row whose row confidence is less than or equal to the third threshold;
retaining the output result of any row whose row confidence is greater than or equal to the fourth threshold; and
if the row confidence of a row is greater than the third threshold and less than the fourth threshold, comparing the word confidence of each character in that row with a fifth threshold and determining, according to the comparison result, whether to delete the output result of the corresponding character.
5. The method according to claim 4, wherein if the word confidence of a character is less than the fifth threshold, the output result of that character is deleted; and if the word confidence of a character is greater than or equal to the fifth threshold, the output result of that character is retained.
6. The method according to any one of claims 1 to 5, wherein the output result is used for voice broadcast.
7. A chip circuit for processing an output result of optical character recognition, comprising:
a circuit unit configured to perform the steps of the method according to any one of claims 1 to 6.
8. A reading aid device, comprising:
a sensor configured to acquire an image containing text content;
the chip circuit according to claim 7, the chip circuit further comprising:
a circuit unit configured to perform character recognition on the image to obtain an output result containing characters; and
a circuit unit configured to convert the processed output result into audio information; and
an audio output device configured to output the audio information.
9. A reading aid device, comprising:
a memory for storing an image containing text content;
the chip circuit according to claim 7, the chip circuit further comprising:
a circuit unit configured to perform character recognition on the image to obtain an output result containing characters; and
a circuit unit configured to convert the processed output result into audio information; and
an audio output device configured to output the audio information.
10. A computer-readable storage medium storing a program, the program comprising instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the method according to any one of claims 1 to 6.
CN201910346265.XA 2019-04-26 2019-04-26 Processing of output results of optical character recognition techniques Active CN109902768B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910346265.XA CN109902768B (en) 2019-04-26 2019-04-26 Processing of output results of optical character recognition techniques

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910346265.XA CN109902768B (en) 2019-04-26 2019-04-26 Processing of output results of optical character recognition techniques

Publications (2)

Publication Number Publication Date
CN109902768A true CN109902768A (en) 2019-06-18
CN109902768B CN109902768B (en) 2021-06-29

Family

ID=66956495

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910346265.XA Active CN109902768B (en) 2019-04-26 2019-04-26 Processing of output results of optical character recognition techniques

Country Status (1)

Country Link
CN (1) CN109902768B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5526447A (en) * 1993-07-26 1996-06-11 Cognitronics Imaging Systems, Inc. Batched character image processing
CN101256631A (en) * 2007-02-26 2008-09-03 富士通株式会社 Method, apparatus, program and readable storage medium for character recognition
CN102663454A (en) * 2012-04-18 2012-09-12 安徽科大讯飞信息科技股份有限公司 Method and device for evaluating character writing standard degree
CN104008384A (en) * 2013-02-26 2014-08-27 山东新北洋信息技术股份有限公司 Character identification method and character identification apparatus
US20150286888A1 (en) * 2014-04-02 2015-10-08 Benoit Maison Optical Character Recognition System Using Multiple Images and Method of Use
CN104978578A (en) * 2015-04-21 2015-10-14 深圳市前海点通数据有限公司 Mobile phone photo taking text image quality evaluation method
CN107301385A (en) * 2017-06-09 2017-10-27 浙江宇视科技有限公司 One kind blocks licence plate recognition method and device
CN107679074A (en) * 2017-08-25 2018-02-09 百度在线网络技术(北京)有限公司 A kind of Picture Generation Method and equipment
CN108399405A (en) * 2017-02-07 2018-08-14 腾讯科技(上海)有限公司 Business license recognition methods and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DA-HAN WANG ET AL.: "String-level learning of confidence transformation for Chinese handwritten text recognition", Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012) *
WU JIANHUI ET AL.: "Confidence-based complementary ensemble of multiple classifiers for handwritten digit recognition", Computer Engineering and Applications (《计算机工程与应用》) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110991411A (en) * 2019-12-20 2020-04-10 谢骏 Intelligent document structured extraction method suitable for logistics industry
CN111242455A (en) * 2020-01-07 2020-06-05 北京百度网讯科技有限公司 Method and device for evaluating voice function of electronic map, electronic equipment and storage medium
WO2021147219A1 (en) * 2020-01-22 2021-07-29 平安科技(深圳)有限公司 Image-based text recognition method and apparatus, electronic device, and storage medium
CN113723422A (en) * 2021-09-08 2021-11-30 重庆紫光华山智安科技有限公司 License plate information determination method, system, device and medium
CN113723422B (en) * 2021-09-08 2023-10-17 重庆紫光华山智安科技有限公司 License plate information determining method, system, equipment and medium

Also Published As

Publication number Publication date
CN109902768B (en) 2021-06-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant