CN109034148A - Audio reading method and device based on character image recognition - Google Patents

Audio reading method and device based on character image recognition

Info

Publication number
CN109034148A
Authority
CN
China
Prior art keywords
character image
audio
image information
information
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810747552.7A
Other languages
Chinese (zh)
Inventor
岳子煊
李雨晴
霍文奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Mining and Technology CUMT
Xuzhou College of Industrial Technology
Original Assignee
China University of Mining and Technology CUMT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Mining and Technology CUMT
Priority to CN201810747552.7A
Publication of CN109034148A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/142 - Image acquisition using hand-held instruments; Constructional details of the instruments
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/148 - Segmentation of character regions
    • G06V30/153 - Segmentation of character regions using recognition of characters or words
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B - EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B17/00 - Teaching reading
    • G09B17/003 - Teaching reading electrically operated apparatus or devices
    • G09B17/006 - Teaching reading electrically operated apparatus or devices with audible presentation of the material to be studied
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B - EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 - Electrically-operated educational appliances
    • G09B5/06 - Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B5/062 - Combinations of audio and printed presentations, e.g. magnetically striped cards, talking books, magnetic tapes with printed texts thereon

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)

Abstract

The invention discloses an audio reading method and device based on character image recognition, to solve the problem that existing reading aids such as talking pens cannot be widely applied to ordinary text material. The method comprises: acquiring character image information from the text material, using automatic zoom so that each piece of character image information corresponds to the image of at least one character; preprocessing the character image information, recognizing it, and matching it to text information; and matching the matched text information with the character audio information in an audio database and playing that audio in real time. By acquiring character image information from text material and applying image recognition and audio matching, the invention recognizes the text in the image and plays the corresponding audio through a loudspeaker, so that it can assist the reading of ordinary text material and greatly facilitates reading for the user.

Description

Audio reading method and device based on character image recognition
Technical field
The present invention relates to the technical field of intelligent reading devices, and in particular to an audio reading method and device based on character image recognition.
Background
As smart technology continues to develop, more and more intelligent reading aids have appeared. Among them, the talking pen is a high-tech product based on optical image recognition, a new generation of intelligent reading and educational tool that followed the learning machine and the point reader. The hardware and software of a typical talking pen currently comprise a sensor (infrared photosensitive), an MCU, the OID algorithm, and books printed with a special coating that reflects infrared light (the companion books). The core of the talking pen is the OID algorithm. OID, also called a photosensitive pen or optical identification instrument, works by using an instrument with a light source to sense the digital signal on a picture and then respond to the companion book accordingly. OID involves a component called the pen. This is not an ordinary writing pen but an optical sensing and recognition instrument that integrates a number of electronic components. OID uses contact sensing: the pen touches the companion book, perceives the information on it, and responds according to the information received. The companion book carries an underlying code layer that holds all the information that can be sensed. Depending on the needs of the particular companion book, its pictures or text are encoded into codes, and each code is identified by a numeric index. When the pen recognizes a code, it first identifies the code's index and then passes that index to its own chip (the chip holds event-driven programs written in advance to correspond to these pictures; such programs can also be generated by OID software). The pen then responds according to the content stored in the chip. This works much like a computer: when the user gives it a task, it carries out the task according to the program stored in the pen in advance. The most widely used OID function, for example, is speech: clicking content on a picture with the pen produces sound.
However, the talking pen is limited to companion books that contain the code layer and cannot recognize ordinary books. Encoding the companion books during production is costly, so the talking pen cannot be widely applied.
Summary of the invention
The purpose of the present invention is to provide an audio reading method and device based on character image recognition, to solve the problem that existing reading aids such as talking pens cannot be applied to ordinary text material.
To achieve this purpose, the technical solution of the present invention provides an audio reading method based on character image recognition, the method comprising:
acquiring character image information from the text material, using automatic zoom so that each piece of character image information corresponds to the image of at least one character;
preprocessing the character image information, recognizing it, and matching it to text information;
matching the matched text information with the character audio information in an audio database, and playing the character audio information in real time.
Further, preprocessing the character image information comprises:
converting the character image information to grayscale, and thresholding it with an adaptive threshold to increase contrast.
Further, acquiring character image information from the text material, using automatic zoom so that each piece of character image information corresponds to the image of at least one character, comprises:
when the camera is close to and directly facing the text material, zooming automatically according to the font size of the text material so that each picture contains the image of only one character.
Further, when recognition of the character image information fails or matching of the character audio information fails, a warning prompt is issued.
Further, the audio database and the corresponding character image database are updated from a cloud database.
Further, a display screen shows the recognized text information in real time, together with the explanation of that text information sent by the cloud database.
Based on the same inventive concept, the present application also provides an audio reading device based on character image recognition. The device comprises: a device body, a loudspeaker arranged at the upper end of the device body, an automatic zoom camera arranged at the tip of the device body, and a main control circuit board and a lithium battery arranged inside the device body. Integrated on the main control circuit board are a microprocessor and, electrically connected to the microprocessor, an image recognition chip, an audio mixing chip, an audio memory and an image memory. The automatic zoom camera acquires character image information from the text material; the image recognition chip recognizes the character image information by matching it against the image memory; and the microprocessor matches the recognized text information against the audio database in the audio memory and sends the result to the loudspeaker via the audio mixing chip.
Further, a power management chip is also provided on the main control circuit board. The power management chip is connected to the microprocessor, and a charging interface is provided for the power management chip.
Further, a display screen and an auxiliary camera are also provided on the device body, and both are connected to the microprocessor.
Further, a USB data interface, an audio key, a toggle key and an earphone jack are arranged on the two sides of the device body.
Optionally, a wireless communication chip is provided on the main control circuit board for communicating wirelessly with a smart terminal. The wireless communication chip is a WiFi device, a Bluetooth device, a ZigBee device, a radio-frequency communication device, or the like.
The present invention has the following advantages:
The audio reading method and device based on character image recognition provided by the embodiments of the present invention acquire character image information from text material, use image recognition and audio matching to recognize the text in the character image information, and play the corresponding audio through a loudspeaker. They can therefore assist the reading of ordinary text material and greatly facilitate reading for the user.
Brief description of the drawings
Fig. 1 is a flowchart of the audio reading method based on character image recognition provided by an embodiment of the present invention.
Fig. 2 is a left-view structural diagram of the audio reading device based on character image recognition provided by an embodiment of the present invention.
Fig. 3 is a right-view structural diagram of the audio reading device based on character image recognition provided by an embodiment of the present invention.
Fig. 4 is a structural diagram of the audio reading device based on character image recognition provided by an embodiment of the present invention, shown in use.
Fig. 5 is a system structure diagram of the audio reading device based on character image recognition provided by an embodiment of the present invention.
Detailed description of the embodiments
The following embodiments illustrate the present invention but are not intended to limit its scope.
Embodiment 1
As shown in Fig. 1, an embodiment of the present invention provides an audio reading method based on character image recognition, the method comprising:
S101, acquiring character image information from the text material, using automatic zoom so that each piece of character image information corresponds to the image of at least one character;
With automatic zoom, each captured character image contains only one or a few characters, so the image occupies little buffer space and its composition is simple, which reduces the resources consumed during character recognition.
S102, preprocessing the character image information, recognizing it, and matching it to text information;
Grayscale and contrast processing make the character image clear and easier to recognize. The image is then matched against the standard character images pre-stored in the character image database, which achieves character recognition. Text recognition programs such as pytesser, an OCR module for Python that uses the Tesseract engine from Google, can do this work well.
S103, matching the matched text information with the character audio information in the audio database, and playing the character audio information in real time.
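The matching in step S102, where a preprocessed character image is compared with the standard character images stored in the character image database, can be sketched as nearest-template matching. This is only an illustration under stated assumptions: the patent does not specify the matching rule, so the Hamming-distance comparison, the `max_distance` cutoff, the function names, and the tiny 3x3 `TEMPLATES` are all hypothetical stand-ins, not the method of the application or of Tesseract.

```python
def hamming(a, b):
    """Count differing pixels between two equally sized binary glyphs."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(glyph, templates, max_distance=2):
    """Return the label of the closest stored template, or None when even the
    best match differs in more than `max_distance` pixels (assumed cutoff)."""
    best_label, best_dist = None, None
    for label, template in templates.items():
        dist = hamming(glyph, template)
        if best_dist is None or dist < best_dist:
            best_label, best_dist = label, dist
    if best_dist is None or best_dist > max_distance:
        return None
    return best_label


# Hypothetical 3x3 templates standing in for the pre-stored standard images.
TEMPLATES = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
```

Returning `None` for a poor match gives the later steps a clear signal that recognition failed, which is the case in which the method issues a warning prompt.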
Here, preprocessing the character image information comprises:
converting the character image information to grayscale, and thresholding it with an adaptive threshold to increase contrast.
Here, acquiring character image information from the text material, using automatic zoom so that each piece of character image information corresponds to the image of at least one character, comprises:
when the camera is close to and directly facing the text material, zooming automatically according to the font size of the text material so that each picture contains the image of only one character. Because each picture contains only one character, character recognition consumes little power, which makes the method suitable for small mobile devices.
Optionally, when recognition of the character image information fails or matching of the character audio information fails, a warning prompt is issued.
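The audio matching of step S103 and the warning on failure can be sketched together. The dictionary-based `audio_db`, the `play` and `warn` callbacks, and the exception type are hypothetical names introduced for this illustration; the application does not describe how the audio database is organized.

```python
class AudioLookupError(LookupError):
    """Raised when recognition or audio matching fails."""


def find_audio(text, audio_db):
    """Map recognized text to its stored audio clip, or raise on failure."""
    if text is None:
        raise AudioLookupError("character image could not be recognized")
    try:
        return audio_db[text]
    except KeyError:
        raise AudioLookupError(f"no audio stored for {text!r}") from None


def read_aloud(text, audio_db, play, warn):
    """Play the matching clip; call `warn` instead when any step fails."""
    try:
        clip = find_audio(text, audio_db)
    except AudioLookupError as err:
        warn(str(err))
        return False
    play(clip)
    return True
```

Passing `play` and `warn` in as callbacks keeps the lookup logic separate from the loudspeaker and the warning hardware, so the same function could drive a beep, a display message, or a log entry.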
Optionally, the audio database and the corresponding character image database are updated from a cloud database.
Optionally, a display screen shows the recognized text information in real time, together with the explanation of that text information sent by the cloud database.
Embodiment 2
Based on the same inventive concept, as shown in Figs. 2 to 5, the present application also provides an audio reading device based on character image recognition. The device comprises: a device body 10, a loudspeaker 20 arranged at the upper end of the device body 10, an automatic zoom camera 30 arranged at the tip of the device body 10, and a main control circuit board 40 and a lithium battery 50 arranged inside the device body 10. Integrated on the main control circuit board 40 are a microprocessor and, electrically connected to the microprocessor, an image recognition chip, an audio mixing chip, an audio memory and an image memory. The automatic zoom camera 30 acquires character image information from the text material; the image recognition chip recognizes the character image information by matching it against the image memory; and the microprocessor matches the recognized text information against the audio database in the audio memory and sends the result to the loudspeaker 20 via the audio mixing chip.
The focal length of the automatic zoom camera 30 varies between 10 mm and 30 mm. Electronic automatic zoom is used, so that under control of the microprocessor the character image information contains only one or a few characters. The image recognition chip may be the Movidius MA2450.
A power management chip is also provided on the main control circuit board 40. The power management chip is connected to the microprocessor, and a charging interface 43 and an on/off key 41 are provided for the power management chip.
A display screen 60 and an auxiliary camera 70 are also provided on the device body 10, and both are connected to the microprocessor. The auxiliary camera 70 is used to capture characters that are particularly far away or particularly large.
A USB data interface 42, an audio key 45, a toggle key 46, an earphone jack 44 and a repeat key 47 are arranged on the two sides of the device body. The toggle key 46 is used to show the explanation of the recognized text on the display screen.
Optionally, a wireless communication chip is provided on the main control circuit board 40 for communicating wirelessly with a smart terminal. The wireless communication chip is a WiFi device, a Bluetooth device, a ZigBee device, a radio-frequency communication device, or the like.
As shown in Figs. 2 and 4, an arc-shaped light-transmitting tip 31 is provided at the end of the device body. In use, the light-transmitting tip slides over the text material. To suit the way the hand holds the device, the light-transmitting tip 31 is inclined 8 to 15 degrees from the vertical, and a reflecting lens 32 is provided for the light-transmitting tip 31 so that incident light enters the automatic zoom camera perpendicularly.
The audio reading method and device based on character image recognition provided by the embodiments of the present invention acquire character image information from text material, use image recognition and audio matching to recognize the text in the character image information, and play the corresponding audio through a loudspeaker. They can therefore assist the reading of ordinary text material and greatly facilitate reading for the user.
Although the present invention has been described in detail above by means of a general description and specific embodiments, modifications or improvements can be made on the basis of the present invention, as will be apparent to those skilled in the art. Such modifications or improvements, made without departing from the spirit of the present invention, therefore fall within the scope of protection claimed.

Claims (10)

1. An audio reading method based on character image recognition, characterized in that the method comprises:
acquiring character image information from the text material, using automatic zoom so that each piece of character image information corresponds to the image of at least one character;
preprocessing the character image information, recognizing it, and matching it to text information;
matching the matched text information with the character audio information in an audio database, and playing the character audio information in real time.
2. The audio reading method based on character image recognition according to claim 1, characterized in that preprocessing the character image information comprises:
converting the character image information to grayscale, and thresholding it with an adaptive threshold to increase contrast.
3. The audio reading method based on character image recognition according to claim 1, characterized in that acquiring character image information from the text material, using automatic zoom so that each piece of character image information corresponds to the image of at least one character, comprises:
when the camera is close to and directly facing the text material, zooming automatically according to the font size of the text material so that each picture contains the image of only one character.
4. The audio reading method based on character image recognition according to claim 1, characterized in that when recognition of the character image information fails or matching of the character audio information fails, a warning prompt is issued.
5. The audio reading method based on character image recognition according to claim 1, characterized in that the audio database and the corresponding character image database are updated from a cloud database.
6. The audio reading method based on character image recognition according to claim 1, characterized in that a display screen shows the recognized text information in real time, together with the explanation of that text information sent by the cloud database.
7. An audio reading device based on character image recognition, characterized in that the device comprises: a device body, a loudspeaker arranged at the upper end of the device body, an automatic zoom camera arranged at the tip of the device body, and a main control circuit board and a lithium battery arranged inside the device body; integrated on the main control circuit board are a microprocessor and, electrically connected to the microprocessor, an image recognition chip, an audio mixing chip, an audio memory and an image memory; the automatic zoom camera acquires character image information from the text material, the image recognition chip recognizes the character image information by matching it against the image memory, and the microprocessor matches the recognized text information against the audio database in the audio memory and sends the result to the loudspeaker via the audio mixing chip.
8. The audio reading device based on character image recognition according to claim 7, characterized in that a power management chip is also provided on the main control circuit board, the power management chip is connected to the microprocessor, and a charging interface is provided for the power management chip.
9. The audio reading device based on character image recognition according to claim 7, characterized in that a display screen and an auxiliary camera are also provided on the device body, and the display screen and the auxiliary camera are connected to the microprocessor.
10. The audio reading device based on character image recognition according to claim 7, characterized in that a USB data interface, an audio key, a toggle key and an earphone jack are arranged on the two sides of the device body.
CN201810747552.7A 2018-07-09 2018-07-09 Audio reading method and device based on character image recognition Pending CN109034148A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810747552.7A CN109034148A (en) 2018-07-09 2018-07-09 Audio reading method and device based on character image recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810747552.7A CN109034148A (en) 2018-07-09 2018-07-09 Audio reading method and device based on character image recognition

Publications (1)

Publication Number Publication Date
CN109034148A true CN109034148A (en) 2018-12-18

Family

ID=64641515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810747552.7A Pending CN109034148A (en) 2018-07-09 2018-07-09 Audio reading method and device based on character image recognition

Country Status (1)

Country Link
CN (1) CN109034148A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109948600A (en) * 2019-01-15 2019-06-28 深圳市同洲电子股份有限公司 A kind of intelligence Text region pen and character identification system
CN110688991A (en) * 2019-11-05 2020-01-14 广东麒麟精工科技有限公司 Intelligent reading method and intelligent learning table thereof
CN111586301A (en) * 2020-05-11 2020-08-25 广东小天才科技有限公司 Method for improving zoom speed of point-to-read scene camera and terminal equipment
CN113593542A (en) * 2020-04-30 2021-11-02 百度在线网络技术(北京)有限公司 Query method, query device, terminal equipment and storage medium
CN114338622A (en) * 2021-12-28 2022-04-12 歌尔光学科技有限公司 Audio transmission method, audio playing method, storage medium and related equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1773523A (en) * 2004-11-08 2006-05-17 乐金电子(昆山)电脑有限公司 Character identification and sound outputting apparatus and method for portable infomation terminal machine with photographic head
US20120189210A1 (en) * 2011-01-20 2012-07-26 Fuji Xerox Co., Ltd. Image processing apparatus, computer readable medium, and image processing method
CN204833766U (en) * 2015-08-03 2015-12-02 熊子箭 Pen is read to point
CN205281851U (en) * 2016-01-05 2016-06-01 深圳市柯达科电子科技有限公司 Electronic reading equipment
CN108171231A (en) * 2016-12-07 2018-06-15 中兴通讯股份有限公司 A kind of communication means and device based on image identification

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109948600A (en) * 2019-01-15 2019-06-28 深圳市同洲电子股份有限公司 A kind of intelligence Text region pen and character identification system
CN110688991A (en) * 2019-11-05 2020-01-14 广东麒麟精工科技有限公司 Intelligent reading method and intelligent learning table thereof
CN113593542A (en) * 2020-04-30 2021-11-02 百度在线网络技术(北京)有限公司 Query method, query device, terminal equipment and storage medium
CN111586301A (en) * 2020-05-11 2020-08-25 广东小天才科技有限公司 Method for improving zoom speed of point-to-read scene camera and terminal equipment
CN111586301B (en) * 2020-05-11 2021-12-21 广东小天才科技有限公司 Method for improving zoom speed of point-to-read scene camera and terminal equipment
CN114338622A (en) * 2021-12-28 2022-04-12 歌尔光学科技有限公司 Audio transmission method, audio playing method, storage medium and related equipment

Similar Documents

Publication Publication Date Title
CN109034148A (en) Audio reading method and device based on character image recognition
CN103605975A (en) Image processing method and device and terminal device
CN110909543A (en) Intention recognition method, device, equipment and medium
CN102509063A (en) Synchronous reading system and synchronous reading method
CN110022397A (en) Image processing method, device, storage medium and electronic equipment
CN109118862A (en) A kind of put reads verifying device and point reading answer verifying methods of marking
CN209514900U (en) A kind of point reading verifying device
CN111310461A (en) Event element extraction method, device, equipment and storage medium
CN111626233B (en) Key point marking method, system, machine readable medium and equipment
CN101980526A (en) Remote controller and identifying and reading method thereof
CN112084780B (en) Coreference resolution method, device, equipment and medium in natural language processing
JP6944920B2 (en) Smart interactive processing methods, equipment, equipment and computer storage media
CN115730047A (en) Intelligent question-answering method, equipment, device and storage medium
CN112749248A (en) Text element content extraction method and device, equipment and computer storage medium
CN108932448B (en) Electronic screen-based click-to-read code identification method, terminal and click-to-read pen
CN109871432A (en) Look into method and device, the terminal device, computer readable storage medium of words
CN209912178U (en) Point-reading answering device
CN211181137U (en) Multifunctional language learning terminal with visual recognition and handwriting board
CN115438212B (en) Image projection system, method and equipment
CN111126081B (en) Global universal language terminal and method
CN205680103U (en) A kind of embedded real-time face identification device
WO2021098175A1 (en) Method and apparatus for guiding speech packet recording function, device, and computer storage medium
CN208027370U (en) Merchandise news extraction element and system
CN117807993A (en) Word segmentation method, word segmentation device, computer equipment and storage medium
CN117690147A (en) Text recognition method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20191210

Address after: 221000 No. 1 Xiang Nan Road, Gulou District, Jiangsu, Xuzhou

Applicant after: Xuzhou Institute of Industry Technology

Applicant after: China University of Mining and Technology

Address before: 221116 Nanhu Campus, China University of Mining and Technology, No. 1 Tongshan University Road, Xuzhou City, Jiangsu Province

Applicant before: China University of Mining and Technology

RJ01 Rejection of invention patent application after publication

Application publication date: 20181218
