CN110287830A - Smart wearable terminal, cloud server and data processing method

Info

Publication number
CN110287830A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910508817.2A
Other languages
Chinese (zh)
Inventor
佘少华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Xiaozhuan Technology Co Ltd
Original Assignee
Guangzhou Xiaozhuan Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Xiaozhuan Technology Co Ltd
Priority to CN201910508817.2A
Publication of CN110287830A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)

Abstract

This application discloses a smart wearable terminal, a cloud server and a data processing method. The method includes: performing character recognition on an acquired text image to obtain recognized text; sending the recognized text to a cloud server, so that the cloud server performs speech conversion on the recognized text according to a cloud speech library, obtains broadcast speech, and feeds back the broadcast speech; and performing audio playback of the received broadcast speech. The application can solve the technical problem in the related art that visually impaired people cannot read effectively.

Description

Smart wearable terminal, cloud server and data processing method
Technical field
This application relates to the technical field of data processing, and in particular to a smart wearable terminal, a cloud server and a data processing method.
Background
In recent years, with the emergence of smart wearable devices, more and more wearables have been used in the multimedia consumer field to deliver more user functions and meet different user-experience needs. Smart glasses, as one type of wearable device, currently implement most of the functions of common smart devices, such as audio and video operation, content display, voice control and navigation. However, because smart-glasses applications are still immature, their use in many fields is incomplete. For example, in intelligent assistance, the wearable function that matters most to some users, many special applications have yet to be realized.
Around us there are a considerable number of visually impaired people. They cannot see clearly, cannot read, and may even find it hard to walk a single step. Meanwhile, as society develops, population aging is becoming more severe, and eyesight that declines with age causes great difficulty for the elderly. In particular, when reading books or newspapers, they find it hard to make out and understand small print because of their eyesight or the print quality. In addition, when the elderly read for a long time, they often lose concentration or have difficulty grasping the meaning of the content. At present there are few effective technical means of addressing these difficulties faced by visually impaired people.
No effective solution has yet been proposed for the problem in the related art that visually impaired people cannot read effectively.
Summary of the invention
The main purpose of the present application is to provide a smart wearable terminal, a cloud server and a data processing method, so as to solve the problem in the related art that visually impaired people cannot read effectively.
To achieve the above purpose, in a first aspect, the present application provides a data processing method applied to a smart wearable terminal, the method comprising:
Performing character recognition on an acquired text image to obtain recognized text;
Sending the recognized text to a cloud server, so that the cloud server performs speech conversion on the recognized text according to a cloud speech library, obtains broadcast speech, and feeds back the broadcast speech;
Performing audio playback of the received broadcast speech.
Optionally, performing character recognition on the text image comprises:
Performing image processing on the text image to obtain the text region of the text image;
Performing field segmentation on the text region to obtain at least one segmented region;
For each segmented region, extracting each piece of character image data in the segmented region, and performing OCR (Optical Character Recognition) on each piece of character image data to obtain the recognized text.
Optionally, performing image processing on the text image comprises:
Performing grayscale processing on the text image to obtain a grayscale image;
Performing binarization on the grayscale image to obtain a binary image;
Identifying whether font features are present in the binary image;
When font features are identified in the binary image, delimiting a text region that contains the font features.
Optionally, the method further comprises:
Acquiring the current distance value and/or photoelectric feedback value corresponding to the current target medium;
Determining whether the distance value is not greater than a text-recognizable distance threshold and/or whether the photoelectric feedback value is not less than an image-recognizable threshold;
When the distance value is not greater than the text-recognizable distance threshold and/or the photoelectric feedback value is not less than the image-recognizable threshold, acquiring the text image of the current target.
Optionally, the method further comprises:
Obtaining user speech;
Sending the user speech to the server, so that the server converts the user speech into a text data instruction according to the cloud speech library and feeds back the text data instruction;
Determining whether the text data instruction matches a specified operation instruction;
When the text data instruction matches the specified operation instruction, executing the step of performing character recognition on the acquired text image.
In a second aspect, the present application also provides another data processing method applied to a cloud server, the method comprising:
Receiving recognized text sent by a smart wearable terminal, wherein the smart wearable terminal performs character recognition on an acquired text image to obtain the recognized text;
Performing speech conversion on the recognized text according to a cloud speech library to obtain broadcast speech;
Feeding back the broadcast speech to the smart wearable terminal, so that the smart wearable terminal performs audio playback of the broadcast speech.
In a third aspect, the present application also provides a smart wearable terminal, the smart wearable terminal comprising:
A recognition module, configured to perform character recognition on an acquired text image to obtain recognized text;
A sending module, configured to send the recognized text to a cloud server, so that the cloud server performs speech conversion on the recognized text according to a cloud speech library, obtains broadcast speech, and feeds back the broadcast speech;
A playback module, configured to perform audio playback of the received broadcast speech.
Optionally, the recognition module is configured to:
Perform image processing on the text image to obtain the text region of the text image;
Perform field segmentation on the text region to obtain at least one segmented region;
For each segmented region, extract each piece of character image data in the segmented region, and perform OCR on each piece of character image data to obtain the recognized text.
In a fourth aspect, the present application also provides a cloud server, the cloud server comprising:
A receiving module, configured to receive recognized text sent by a smart wearable terminal, wherein the smart wearable terminal performs character recognition on an acquired text image to obtain the recognized text;
A conversion module, configured to perform speech conversion on the recognized text according to a cloud speech library to obtain broadcast speech;
A feedback module, configured to feed back the broadcast speech to the smart wearable terminal, so that the smart wearable terminal performs audio playback of the broadcast speech.
In a fifth aspect, the present application also provides a computer-readable storage medium storing computer code; when the computer code is executed, the above data processing method is performed.
In the data processing method provided by the present application, character recognition is performed on an acquired text image to obtain recognized text; the recognized text is sent to a cloud server, so that the cloud server performs speech conversion on the recognized text according to a cloud speech library, obtains broadcast speech, and feeds back the broadcast speech; and the received broadcast speech is played. In this way, the smart wearable terminal acquires a text image of the books or newspapers in front of the user, identifies the text in the text image using OCR, has the recognized text converted into broadcast speech by cloud speech processing, and then plays the broadcast speech, so that the user can easily read the books and newspapers in front of them by listening. This solves the technical problem in the related art that visually impaired people, such as the blind and the elderly, cannot read effectively.
Brief description of the drawings
The accompanying drawings, which form a part of this application, are provided for further understanding of the application, so that its other features, objects and advantages become more apparent. The drawings of the illustrative embodiments and their descriptions serve to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a flow diagram of a data processing method provided by an embodiment of the present application;
Fig. 2 is a flow diagram of step 100 provided by an embodiment of the present application;
Fig. 3 is a flow diagram of step 110 provided by an embodiment of the present application;
Fig. 4 is a flow diagram of another data processing method provided by an embodiment of the present application;
Fig. 5 is a flow diagram of another data processing method provided by an embodiment of the present application;
Fig. 6 is a flow diagram of another data processing method provided by an embodiment of the present application;
Fig. 7 is a structural diagram of a smart wearable terminal provided by an embodiment of the present application;
Fig. 8 is a structural diagram of a cloud server provided by an embodiment of the present application;
Fig. 9 is a schematic diagram of the appearance of a smart wearable terminal provided by an embodiment of the present application.
Detailed description
To help those skilled in the art better understand the solution of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this application without creative work shall fall within the scope of protection of this application.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings of the present application are used to distinguish similar objects, not to describe a particular order or sequence. It should be understood that data so used are interchangeable under appropriate circumstances, so that the embodiments of the present application described herein can be implemented. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments can be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
According to one aspect of the present application, an embodiment provides a data processing method applied to a smart wearable terminal. The smart wearable terminal may be a smart terminal such as smart glasses or a smart helmet. For example, when the smart wearable terminal is a pair of smart glasses, Fig. 9 is a schematic diagram of the appearance of a smart wearable terminal provided by an embodiment of the present application. As shown in Fig. 9, the smart glasses include at least an image acquisition unit 1, an audio playback unit 2 and a communication unit. The image acquisition unit 1 may include a camera mounted on the frame of the smart glasses and can acquire an image of the current target in real time; the acquired image can be sent to the cloud server through the communication unit. The audio playback unit 2 may include speakers mounted on the temples of the smart glasses and can deliver voice broadcasts to the wearer. Fig. 1 is a flow diagram of a data processing method provided by an embodiment of the present application. As shown in Fig. 1, the method includes the following steps 100 to 300:
100. Perform character recognition on the acquired text image to obtain recognized text.
Specifically, the text image may be an image of the current target medium (for example, a newspaper or a book) acquired in real time by the smart wearable terminal, or it may be read directly from local storage. The recognized text should include at least one character; when text is present in the text image, it can be identified by performing character recognition on the text image. For example, when the text image is an image of a newspaper, character recognition identifies the text in the newspaper image, which is then saved as the recognized text.
It should be noted that the recognized text may contain characters of various languages, for example Chinese characters or English words.
200. Send the recognized text to the cloud server, so that the cloud server performs speech conversion on the recognized text according to the cloud speech library, obtains broadcast speech, and feeds back the broadcast speech.
The cloud server includes at least a cloud speech library and a speech retrieval and conversion unit. The cloud speech library contains at least speech vocabulary and language data for multiple languages. The speech retrieval and conversion unit parses the input recognized text, retrieves from the cloud speech library and performs the conversion, so as to convert the recognized text into broadcast speech. Because the conversion is performed by the cloud server, the hardware limitations of the smart wearable terminal can be effectively overcome; and because the cloud speech library on the cloud server has greater storage capacity and is updated more frequently, converting the recognized text into broadcast speech is more accurate and faster.
300. Perform audio playback of the received broadcast speech.
Specifically, after the broadcast speech fed back by the cloud server is received, the audio playback unit 2 plays it. As shown in Fig. 9, the audio playback unit 2 includes one or more speakers or bone-conduction earphones. In this way, a text image of the books or newspapers in front of the user is acquired, or a local text image is read; the text in the text image is identified using OCR; the recognized text is converted into broadcast speech by cloud speech processing; and the smart wearable terminal then plays the broadcast speech. Visually impaired users such as the blind and the elderly can thus easily read text images of books and newspapers by listening.
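The flow of steps 100 to 300 on the terminal side can be sketched roughly as follows. This is a minimal illustration, not the implementation described in the application: the function name `read_aloud` and the three injected callables (`recognize`, `synthesize_in_cloud`, `play`) are hypothetical stand-ins for the OCR step, the round trip to the cloud server and the audio playback unit, and the early return on empty text is an assumption based on the statement that the text should contain at least one character.

```python
from typing import Callable


def read_aloud(
    text_image: bytes,
    recognize: Callable[[bytes], str],
    synthesize_in_cloud: Callable[[str], bytes],
    play: Callable[[bytes], None],
) -> bool:
    """Run steps 100 to 300 once; return True if speech was played.

    All three callables are hypothetical placeholders supplied by the caller.
    """
    # Step 100: character recognition on the acquired text image.
    text = recognize(text_image)
    if not text:
        # Assumption: with no recognized characters there is nothing to send.
        return False
    # Step 200: send the text to the cloud server and receive speech back.
    speech = synthesize_in_cloud(text)
    # Step 300: play the speech that the cloud server fed back.
    play(speech)
    return True
```

Passing the three stages in as callables keeps the sketch independent of any particular OCR engine, network protocol or audio API, none of which the application specifies.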
In a feasible embodiment, Fig. 2 is a flow diagram of step 100 provided by an embodiment of the present application. As shown in Fig. 2, step 100, performing character recognition on the text image, includes the following steps 110 to 130:
110. Perform image processing on the text image to obtain the text region of the text image;
120. Perform field segmentation on the text region to obtain at least one segmented region;
130. For each segmented region, extract each piece of character image data in the segmented region, and perform OCR on each piece of character image data to obtain the recognized text.
Specifically, image recognition is performed on the acquired text image. When no text is present in the text image, processing of the text image ends. When text is present, the text region containing the text can be identified. In newspapers and books the text is usually typeset in separate blocks, although in some books it is not, so it is first necessary to judge whether the text region of the text image contains at least two field text regions. Step 120, performing field segmentation on the text region to obtain at least one segmented region, specifically includes:
Determining, based on the distribution of text in the text region, whether at least two field text regions exist in the text region;
When at least two field text regions do not exist in the text region and there is only a single whole text region, taking that whole text region as one segmented region;
When at least two field text regions exist in the text region, performing field segmentation on the text region to obtain segmented regions that are each independent of the others.
Then, for each segmented region, each piece of character image data in the segmented region is extracted. The extraction may split each segmented region into the minimal regions that contain font features, which yields the region corresponding to each character; the character image data of each character, which contains at least its font features, is then extracted and identified using OCR to determine the character corresponding to each piece of character image data. Finally, the recognized text of the segmented regions is merged according to the layout order of the segmented regions (for example, top to bottom and/or left to right) to generate the recognized text corresponding to the text image.
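One possible way to realize the field segmentation of step 120 and the final merge is sketched below. The application does not say how the text distribution is analyzed, so the sketch swaps in a simple vertical projection profile: columns of pixels with no ink that form a gap of at least `min_gap` separate two fields. The names `split_fields`, `FieldText`, `merge_in_reading_order` and the `min_gap` parameter are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


def split_fields(binary: List[List[int]], min_gap: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) pixel-column ranges, end exclusive, one per field.

    `binary` is a list of rows where 1 marks an ink pixel and 0 background.
    """
    if not binary:
        return []
    width = len(binary[0])
    # A pixel column contains ink if any row has a 1 at that position.
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    fields: List[Tuple[int, int]] = []
    start: Optional[int] = None      # first inked column of the current field
    last_ink: Optional[int] = None   # most recent inked column seen
    for x, ink in enumerate(has_ink):
        if not ink:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 >= min_gap:
            # The blank gap is wide enough: close the field and open a new one.
            fields.append((start, last_ink + 1))
            start = x
        last_ink = x
    if start is not None:
        fields.append((start, last_ink + 1))
    return fields


@dataclass
class FieldText:
    top: int    # top edge of the segmented region, in pixels
    left: int   # left edge of the segmented region, in pixels
    text: str   # text produced by OCR for this region


def merge_in_reading_order(regions: List[FieldText]) -> str:
    """Join per-region text from top to bottom, then left to right."""
    ordered = sorted(regions, key=lambda r: (r.top, r.left))
    return "\n".join(r.text for r in ordered if r.text)
```

When the profile yields a single range, the whole text region is treated as one segmented region, which mirrors the single-region case described above.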
In a feasible embodiment, Fig. 3 is a flow diagram of step 110 provided by an embodiment of the present application. As shown in Fig. 3, step 110, performing image processing on the text image, includes the following steps 111 to 114:
111. Perform grayscale processing on the text image to obtain a grayscale image;
112. Perform binarization on the grayscale image to obtain a binary image;
113. Identify whether font features are present in the binary image;
114. When font features are identified in the binary image, delimit a text region that contains the font features.
Specifically, grayscale processing is first performed on the acquired text image to obtain its grayscale image. Binarization is then performed on the grayscale image to obtain the binary image of the text image. Next, image recognition is performed on the binary image to identify whether font features are present. When no font features are present in the binary image, processing of the text image ends; when font features are identified in the binary image, a text region that contains the font features is delimited.
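Steps 111 to 114 can be illustrated with a small pure-Python sketch. The application does not specify the grayscale formula, the binarization threshold or how font features are detected, so everything below is an assumption: the luma weights are the common ITU-R BT.601 ones, the fixed threshold of 128 is arbitrary, and "any dark pixel" is only a crude stand-in for a real font-feature detector. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255


def to_gray(image: List[List[Pixel]]) -> List[List[int]]:
    """Step 111: convert RGB pixels to gray levels (BT.601 luma weights)."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row] for row in image]


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Step 112: mark dark pixels as ink (1) and light pixels as background (0)."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def text_region(binary: List[List[int]]) -> Optional[Tuple[int, int, int, int]]:
    """Steps 113 and 114: bounding box (left, top, right, bottom) of all ink.

    Right and bottom are exclusive. Returns None when the image has no ink,
    which corresponds to ending the processing of the text image.
    """
    xs = [x for row in binary for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(binary) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)
```

A production system would more likely use adaptive thresholding and a trained text detector; the sketch only shows the order of operations.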
In a feasible embodiment, Fig. 4 is a flow diagram of another data processing method provided by an embodiment of the present application. As shown in Fig. 4, the method further includes steps 010 to 030:
010. Acquire the current distance value and/or photoelectric feedback value corresponding to the current target medium;
020. Determine whether the distance value is not greater than the text-recognizable distance threshold and/or whether the photoelectric feedback value is not less than the image-recognizable threshold;
030. When the distance value is not greater than the text-recognizable distance threshold and/or the photoelectric feedback value is not less than the image-recognizable threshold, acquire the text image of the current target.
As shown in Fig. 9, the image acquisition unit 1 may include at least a paper-distance detection unit and/or a photoelectric sensor, as well as a camera.
Specifically, the paper-distance detection unit can acquire the current distance value between the smart wearable terminal and the current target medium (for example, a newspaper or a book); the paper-distance detection unit may be an infrared distance sensor or an ultrasonic ranging sensor. The photoelectric sensor can detect the photoelectric feedback value of the current target medium. It is then determined whether the distance value is not greater than the text-recognizable distance threshold and/or whether the photoelectric feedback value is not less than the image-recognizable threshold. When the distance value is not greater than the text-recognizable distance threshold and/or the photoelectric feedback value is not less than the image-recognizable threshold, it can be determined that the current target medium has entered the recognizable distance range of the smart wearable terminal, and the text image of the current target medium is acquired by the camera.
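The threshold check of steps 010 to 030 might look like the sketch below. The "and/or" wording leaves the combination rule open, so the sketch assumes that every reading that is actually available must pass its threshold, and that capture is refused when neither sensor reports a value. The function name, parameter names and units are hypothetical.

```python
from typing import List, Optional


def should_capture(
    distance: Optional[float],
    photo_feedback: Optional[float],
    max_text_distance: float,
    min_photo_feedback: float,
) -> bool:
    """Decide whether the camera should acquire a text image.

    A reading of None means the corresponding sensor is absent or silent.
    """
    checks: List[bool] = []
    if distance is not None:
        # Distance must not exceed the text-recognizable distance threshold.
        checks.append(distance <= max_text_distance)
    if photo_feedback is not None:
        # Photoelectric feedback must not fall below the image-recognizable threshold.
        checks.append(photo_feedback >= min_photo_feedback)
    # Assumption: at least one reading is required, and all present readings must pass.
    return bool(checks) and all(checks)
```

For example, `should_capture(25.0, None, 30.0, 100.0)` returns `True`, since only the distance reading is present and it is within the threshold.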
In a feasible embodiment, Fig. 5 is a flow diagram of another data processing method provided by an embodiment of the present application. As shown in Fig. 5, the method further includes steps 040 to 060:
040. Obtain user speech;
050. Send the user speech to the server, so that the server converts the user speech into a text data instruction according to the cloud speech library and feeds back the text data instruction;
060. Determine whether the text data instruction matches a specified operation instruction;
When the text data instruction matches the specified operation instruction, execute step 100, performing character recognition on the acquired text image.
In this embodiment, the smart wearable terminal may also be provided with a voice acquisition unit. When the user speaks, the voice acquisition unit obtains the user speech and sends it to the server, so that the server converts the user speech into a text data instruction according to the cloud speech library and feeds back the text data instruction. It is then determined whether the text data instruction matches a specified operation instruction, that is, whether the user speech is a voice operation instruction for the smart wearable terminal. When the text data instruction matches the specified operation instruction, the text image of the current target medium can be acquired directly, and step 100 is then executed.
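The matching in step 060 could be as simple as the sketch below. The application only says that the text data instruction is compared with a specified operation instruction; the command phrases, the case and whitespace normalization, and the exact-match rule are all assumptions made for illustration.

```python
from typing import FrozenSet

# Hypothetical command phrases; the application does not list any.
SPECIFIED_COMMANDS: FrozenSet[str] = frozenset({"start reading", "read this"})


def matches_command(
    text_instruction: str,
    commands: FrozenSet[str] = SPECIFIED_COMMANDS,
) -> bool:
    """Return True if the server's transcription equals a specified command."""
    # Lower-case and collapse runs of whitespace before comparing.
    normalized = " ".join(text_instruction.lower().split())
    return normalized in commands
```

Only when this check succeeds would the terminal go on to acquire the text image and run step 100; any other utterance is ignored.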
In the data processing method provided by the present application: 100, character recognition is performed on the acquired text image to obtain recognized text; 200, the recognized text is sent to the cloud server, so that the cloud server performs speech conversion on the recognized text according to the cloud speech library, obtains broadcast speech, and feeds back the broadcast speech; 300, the received broadcast speech is played. In this way, the smart wearable terminal acquires a text image of the books or newspapers in front of the user, identifies the text in the text image using OCR, has the recognized text converted into broadcast speech by cloud speech processing, and then plays the broadcast speech, so that the user can easily read the books and newspapers in front of them by listening. This solves the technical problem in the related art that visually impaired people, such as the blind and the elderly, cannot read effectively.
Based on the same technical idea, an embodiment of the present application also provides another data processing method, applied to a cloud server. Fig. 6 is a flow diagram of another data processing method provided by an embodiment of the present application. As shown in Fig. 6, the method includes steps 400 to 600:
400. Receive the recognized text sent by the smart wearable terminal, wherein the smart wearable terminal performs character recognition on the acquired text image to obtain the recognized text;
500. Perform speech conversion on the recognized text according to the cloud speech library to obtain broadcast speech;
600. Feed back the broadcast speech to the smart wearable terminal, so that the smart wearable terminal performs audio playback of the broadcast speech.
In the data processing method provided by the present application: 400, the recognized text sent by the smart wearable terminal is received, wherein the smart wearable terminal performs character recognition on the acquired text image to obtain the recognized text; 500, speech conversion is performed on the recognized text according to the cloud speech library to obtain broadcast speech; 600, the broadcast speech is fed back to the smart wearable terminal, so that the smart wearable terminal performs audio playback of the broadcast speech. In this way, the smart wearable terminal acquires a text image of the books or newspapers in front of the user, identifies the text in the text image using OCR, has the recognized text converted into broadcast speech by the cloud speech processing of the cloud server, and then plays the broadcast speech, so that the user can easily read the books and newspapers in front of them by listening. This solves the technical problem in the related art that visually impaired people, such as the blind and the elderly, cannot read effectively.
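A toy model of steps 400 to 600 on the server side is sketched below. It assumes, purely for illustration, that the speech library is a dictionary from lower-case words to pre-recorded audio clips and that conversion is a lookup followed by concatenation; the application does not describe the library format or the synthesis technique, and the class and method names are hypothetical.

```python
from typing import Dict, List


class CloudSpeechServer:
    """Toy stand-in for the cloud server's retrieval and conversion unit."""

    def __init__(self, speech_library: Dict[str, bytes]) -> None:
        # Assumption: the library maps a lower-case word to an audio clip.
        self.speech_library = speech_library

    def convert(self, text: str) -> bytes:
        """Step 500: parse the text, retrieve clips, and join them."""
        clips: List[bytes] = []
        for word in text.lower().split():
            clip = self.speech_library.get(word)
            if clip is not None:
                clips.append(clip)
            # Assumption: words missing from the library are silently skipped.
        return b"".join(clips)

    def handle(self, text: str) -> bytes:
        """Step 400 receives `text`; the return value is fed back (step 600)."""
        return self.convert(text)
```

In a real deployment `handle` would sit behind a network endpoint and `convert` would call a text-to-speech engine; both are left out because the application does not name any.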
Based on the same technical idea, the present application also provides a smart wearable terminal. Fig. 7 is a structural diagram of a smart wearable terminal provided by an embodiment of the present application. As shown in Fig. 7, the smart wearable terminal includes:
A recognition module 10, configured to perform character recognition on an acquired text image to obtain recognized text;
A sending module 20, configured to send the recognized text to a cloud server, so that the cloud server performs speech conversion on the recognized text according to a cloud speech library, obtains broadcast speech, and feeds back the broadcast speech;
A playback module 30, configured to perform audio playback of the received broadcast speech.
Optionally, the recognition module 10 is configured to:
Perform image processing on the text image to obtain the text region of the text image;
Perform field segmentation on the text region to obtain at least one segmented region;
For each segmented region, extract each piece of character image data in the segmented region, and perform OCR on each piece of character image data to obtain the recognized text.
Optionally, the recognition module 10 is further configured to:
Perform grayscale processing on the text image to obtain a grayscale image;
Perform binarization on the grayscale image to obtain a binary image;
Identify whether font features are present in the binary image;
When font features are identified in the binary image, delimit a text region that contains the font features.
Optionally, the smart wearable terminal further includes:
A first acquisition module, configured to acquire the current distance value and/or photoelectric feedback value corresponding to the current target medium;
A first determining module, configured to determine whether the distance value is not greater than the text-recognizable distance threshold and/or whether the photoelectric feedback value is not less than the image-recognizable threshold;
A second acquisition module, configured to capture a text image of the current target when the distance value is not greater than the text-recognizable distance threshold and/or the photoelectric feedback value is not less than the image-recognizable threshold.
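The capture gate described above can be sketched as a single predicate. The "and/or" wording is interpreted here as: every measurement that is available must satisfy its threshold, and at least one measurement must be available. That interpretation, the function name `should_capture`, the units, and the default threshold values are all assumptions made for illustration.

```python
from typing import Optional


def should_capture(distance_cm: Optional[float],
                   light_level: Optional[float],
                   max_distance_cm: float = 40.0,   # hypothetical text-recognizable distance
                   min_light_level: float = 0.3     # hypothetical image-recognizable threshold
                   ) -> bool:
    """Decide whether to capture a text image of the current target."""
    checks = []
    if distance_cm is not None:
        # Distance must be not greater than the threshold.
        checks.append(distance_cm <= max_distance_cm)
    if light_level is not None:
        # Photoelectric feedback must be not less than the threshold.
        checks.append(light_level >= min_light_level)
    # With no measurement at all there is nothing to justify a capture.
    return bool(checks) and all(checks)
```

For example, `should_capture(30.0, 0.5)` allows the capture, while `should_capture(50.0, 0.5)` rejects it because the target is too far away.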
Optionally, the smart wearable terminal further includes:
A voice obtaining module, configured to obtain user speech;
A voice sending module, configured to send the user speech to the server, so that the server converts the user speech into a text data instruction according to the cloud speech library and feeds back the text data instruction;
A second determining module, configured to determine whether the text data instruction matches a specified operation instruction;
When the text data instruction matches the specified operation instruction, the identification module 10 is executed to perform character recognition on the captured text image.
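The matching step performed by the second determining module can be sketched as a normalized string comparison. This application does not define how matching is done, so exact matching after normalization is an assumption, and the command phrases in `START_COMMANDS` are hypothetical examples.

```python
import unicodedata

# Hypothetical specified operation instructions that trigger recognition.
START_COMMANDS = {"start reading", "read this"}


def normalize(instruction: str) -> str:
    """Case-fold, drop punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", instruction).casefold()
    kept = "".join(ch for ch in text if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def matches_command(instruction: str, commands=START_COMMANDS) -> bool:
    """True when the text data instruction equals a specified command."""
    return normalize(instruction) in {normalize(c) for c in commands}
```

With this sketch, the text data instruction "Start reading!" returned by the server would match and trigger the identification module, while "stop" would not.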
Based on the same technical idea, this application also provides a cloud server. Fig. 8 is a structural schematic diagram of a cloud server provided by an embodiment of this application. As shown in Fig. 8, the cloud server includes:
A receiving module 40, configured to receive the recognized text sent by the smart wearable terminal, where the smart wearable terminal obtains the recognized text by performing character recognition on a captured text image;
A conversion module 50, configured to perform speech conversion on the recognized text according to the cloud speech library to obtain broadcast speech;
A feedback module 60, configured to feed the broadcast speech back to the smart wearable terminal, so that the smart wearable terminal plays the broadcast speech as audio.
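The server-side modules can be sketched as follows. This application does not describe the internal structure of the cloud speech library, so modelling it as a dictionary from words to prerecorded clips that are concatenated is an assumption, as are the class name `CloudServer`, its methods, and the placeholder clip for unknown words.

```python
from typing import Callable, Dict, List


class CloudServer:
    """Toy server: receive text (module 40), convert (50), feed back (60)."""

    def __init__(self, speech_library: Dict[str, bytes],
                 unknown_clip: bytes = b"<?>") -> None:
        self.speech_library = speech_library   # assumed word -> audio clip map
        self.unknown_clip = unknown_clip       # used for words not in the library

    def convert(self, text: str) -> bytes:
        """Conversion module 50: look up each word and join the clips."""
        clips = [self.speech_library.get(word.casefold(), self.unknown_clip)
                 for word in text.split()]
        return b"".join(clips)

    def handle(self, text: str, feed_back: Callable[[bytes], None]) -> bytes:
        """Receive text, convert it, and feed the speech back to the terminal."""
        speech = self.convert(text)
        feed_back(speech)                      # feedback module 60
        return speech


library = {"good": b"<good>", "morning": b"<morning>"}
server = CloudServer(library)
received: List[bytes] = []
server.handle("Good morning", received.append)
```

A production system would more plausibly call a text-to-speech engine than concatenate clips; the dictionary is used only so that the receive, convert, and feed-back steps are visible in a few lines.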
Based on the same technical idea, an embodiment of this application also provides a computer-readable storage medium. The computer-readable storage medium stores computer code, and when the computer code is executed, the data processing method described above is performed.
Obviously, those skilled in the art should understand that the modules or steps of the invention described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they can each be fabricated as individual integrated circuit modules, or multiple modules or steps among them can be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The computer program involved in this application can be stored in a computer-readable storage medium. The computer-readable storage medium may include any physical or virtual device capable of carrying computer program code, a flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunication signal, other software distribution media, and the like.
The foregoing describes only preferred embodiments of this application and is not intended to limit this application. For those skilled in the art, various modifications and changes may be made to this application. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of this application shall fall within the scope of protection of this application.

Claims (10)

1. A data processing method, characterized in that the method is applied to a smart wearable terminal and comprises:
Performing character recognition on a captured text image to obtain recognized text;
Sending the recognized text to a cloud server, so that the cloud server performs speech conversion on the recognized text according to a cloud speech library, obtains broadcast speech, and feeds back the broadcast speech;
Playing the received broadcast speech as audio.
2. The data processing method according to claim 1, characterized in that performing character recognition on the text image comprises:
Performing image processing on the text image to obtain the text region of the text image;
Performing field segmentation on the text region to obtain at least one segmented region;
For each segmented region, extracting each piece of character image data in the segmented region, and performing optical character recognition (OCR) on each piece of character image data to obtain the recognized text.
3. The data processing method according to claim 2, characterized in that performing image processing on the text image comprises:
Performing grayscale processing on the text image to obtain a grayscale image;
Performing binarization on the grayscale image to obtain a binarized image;
Identifying whether font features are present in the binarized image;
When the font features are identified in the binarized image, delimiting a text region that contains the font features.
4. The data processing method according to claim 1, characterized in that the method further comprises:
Acquiring the current distance value and/or photoelectric feedback value corresponding to the current target medium;
Determining whether the distance value is not greater than the text-recognizable distance threshold and/or whether the photoelectric feedback value is not less than the image-recognizable threshold;
When the distance value is not greater than the text-recognizable distance threshold and/or the photoelectric feedback value is not less than the image-recognizable threshold, capturing a text image of the current target.
5. The data processing method according to claim 1, characterized in that the method further comprises:
Obtaining user speech;
Sending the user speech to a server, so that the server converts the user speech into a text data instruction according to the cloud speech library and feeds back the text data instruction;
Determining whether the text data instruction matches a specified operation instruction;
When the text data instruction matches the specified operation instruction, executing the step of performing character recognition on the captured text image.
6. A data processing method, characterized in that the method is applied to a cloud server and comprises:
Receiving the recognized text sent by a smart wearable terminal, where the smart wearable terminal obtains the recognized text by performing character recognition on a captured text image;
Performing speech conversion on the recognized text according to a cloud speech library to obtain broadcast speech;
Feeding the broadcast speech back to the smart wearable terminal, so that the smart wearable terminal plays the broadcast speech as audio.
7. A smart wearable terminal, characterized in that the smart wearable terminal comprises:
An identification module, configured to perform character recognition on a captured text image to obtain recognized text;
A sending module, configured to send the recognized text to a cloud server, so that the cloud server performs speech conversion on the recognized text according to a cloud speech library, obtains broadcast speech, and feeds back the broadcast speech;
A playing module, configured to play the received broadcast speech as audio.
8. The smart wearable terminal according to claim 7, characterized in that the identification module is configured to:
Perform image processing on the text image to obtain the text region of the text image;
Perform field segmentation on the text region to obtain at least one segmented region;
For each segmented region, extract each piece of character image data in the segmented region, and perform OCR on each piece of character image data to obtain the recognized text.
9. A cloud server, characterized in that the cloud server comprises:
A receiving module, configured to receive the recognized text sent by a smart wearable terminal, where the smart wearable terminal obtains the recognized text by performing character recognition on a captured text image;
A conversion module, configured to perform speech conversion on the recognized text according to a cloud speech library to obtain broadcast speech;
A feedback module, configured to feed the broadcast speech back to the smart wearable terminal, so that the smart wearable terminal plays the broadcast speech as audio.
10. A computer-readable storage medium, the computer-readable storage medium storing computer code, wherein when the computer code is executed, the data processing method according to any one of claims 1 to 6 is performed.
CN201910508817.2A 2019-06-11 2019-06-11 Intelligence wearing terminal, cloud server and data processing method Pending CN110287830A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910508817.2A CN110287830A (en) 2019-06-11 2019-06-11 Intelligence wearing terminal, cloud server and data processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910508817.2A CN110287830A (en) 2019-06-11 2019-06-11 Intelligence wearing terminal, cloud server and data processing method

Publications (1)

Publication Number Publication Date
CN110287830A true CN110287830A (en) 2019-09-27

Family

ID=68004253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910508817.2A Pending CN110287830A (en) 2019-06-11 2019-06-11 Intelligence wearing terminal, cloud server and data processing method

Country Status (1)

Country Link
CN (1) CN110287830A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111046223A (en) * 2019-11-14 2020-04-21 李秉伦 Voice assisting method, terminal, server and system for visually impaired
CN111443613A (en) * 2020-03-27 2020-07-24 珠海格力电器股份有限公司 Control method and device of electrical equipment, storage medium and electrical equipment
CN111538158A (en) * 2020-05-25 2020-08-14 上海中通吉网络技术有限公司 Intelligent glasses for express delivery network management and control method and device thereof
CN112307280A (en) * 2020-12-31 2021-02-02 飞天诚信科技股份有限公司 Method and system for converting character string into audio based on cloud server
EP3866475A1 (en) * 2020-02-11 2021-08-18 Nextvpu (Shanghai) Co., Ltd. Image text broadcasting method and device, electronic circuit, and computer program product
CN113986018A (en) * 2021-12-30 2022-01-28 江西影创信息产业有限公司 Vision impairment auxiliary reading and learning method and system based on intelligent glasses and storage medium
US11776286B2 (en) 2020-02-11 2023-10-03 NextVPU (Shanghai) Co., Ltd. Image text broadcasting

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609408A (en) * 2012-01-11 2012-07-25 清华大学 Cross-lingual interpretation method based on multi-lingual document image recognition
CN104143084A (en) * 2014-07-17 2014-11-12 武汉理工大学 Auxiliary reading glasses for visual impairment people
CN106341549A (en) * 2016-10-14 2017-01-18 努比亚技术有限公司 Mobile terminal audio reading apparatus and method
CN107346629A (en) * 2017-08-22 2017-11-14 贵州大学 A kind of intelligent blind reading method and intelligent blind reader system
CN109196520A (en) * 2018-08-28 2019-01-11 深圳市汇顶科技股份有限公司 Biometric devices, method and electronic equipment

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111046223A (en) * 2019-11-14 2020-04-21 李秉伦 Voice assisting method, terminal, server and system for visually impaired
EP3866475A1 (en) * 2020-02-11 2021-08-18 Nextvpu (Shanghai) Co., Ltd. Image text broadcasting method and device, electronic circuit, and computer program product
JP2021129299A (en) * 2020-02-11 2021-09-02 ネクストヴイピーユー(シャンハイ)カンパニー リミテッドNextvpu(Shanghai)Co., Ltd. Image text broadcast method and device, electronic circuit, and storage medium
US11776286B2 (en) 2020-02-11 2023-10-03 NextVPU (Shanghai) Co., Ltd. Image text broadcasting
CN111443613A (en) * 2020-03-27 2020-07-24 珠海格力电器股份有限公司 Control method and device of electrical equipment, storage medium and electrical equipment
CN111538158A (en) * 2020-05-25 2020-08-14 上海中通吉网络技术有限公司 Intelligent glasses for express delivery network management and control method and device thereof
CN112307280A (en) * 2020-12-31 2021-02-02 飞天诚信科技股份有限公司 Method and system for converting character string into audio based on cloud server
CN112307280B (en) * 2020-12-31 2021-03-16 飞天诚信科技股份有限公司 Method and system for converting character string into audio based on cloud server
CN113986018A (en) * 2021-12-30 2022-01-28 江西影创信息产业有限公司 Vision impairment auxiliary reading and learning method and system based on intelligent glasses and storage medium

Similar Documents

Publication Publication Date Title
CN110287830A (en) Intelligence wearing terminal, cloud server and data processing method
CN109889920B (en) Network course video editing method, system, equipment and storage medium
CN109117777A (en) The method and apparatus for generating information
CN111190939A (en) User portrait construction method and device
CN110334712A (en) Intelligence wearing terminal, cloud server and data processing method
CN107491435B (en) Method and device for automatically identifying user emotion based on computer
CN110717470B (en) Scene recognition method and device, computer equipment and storage medium
CN103052953A (en) Information processing device, method of processing information, and program
CN111797820B (en) Video data processing method and device, electronic equipment and storage medium
CN114465737B (en) Data processing method and device, computer equipment and storage medium
CN111488487B (en) Advertisement detection method and detection system for all-media data
CN108154103A (en) Detect method, apparatus, equipment and the computer storage media of promotion message conspicuousness
CN109993450B (en) Movie scoring method, device, equipment and storage medium
CN114095782A (en) Video processing method and device, computer equipment and storage medium
CN112199932A (en) PPT generation method, device, computer-readable storage medium and processor
CN111524503A (en) Audio data processing method and device, audio recognition equipment and storage medium
CN113038175B (en) Video processing method and device, electronic equipment and computer readable storage medium
CN107203638A (en) Monitor video processing method, apparatus and system
CN111680514B (en) Information processing and model training method, device, equipment and storage medium
US20200257763A1 (en) Visual storyline generation from text story
CN106779992B (en) Method and device for generating financial record and electronic account book according to short message
CN113516963B (en) Audio data generation method and device, server and intelligent sound box
CN115114469A (en) Picture identification method, device and equipment and storage medium
CN114398517A (en) Video data acquisition method and device
CN113393845A (en) Method and device for speaker recognition, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190927