CN104751846B - The method and device of speech-to-text conversion - Google Patents

The method and device of speech-to-text conversion

Info

Publication number
CN104751846B
Authority
CN
China
Prior art keywords
text
mark
label
text mark
audio file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510126575.2A
Other languages
Chinese (zh)
Other versions
CN104751846A (en
Inventor
王彦文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nubia Technology Co Ltd
Original Assignee
Nubia Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nubia Technology Co Ltd filed Critical Nubia Technology Co Ltd
Priority to CN201510126575.2A priority Critical patent/CN104751846B/en
Publication of CN104751846A publication Critical patent/CN104751846A/en
Application granted granted Critical
Publication of CN104751846B publication Critical patent/CN104751846B/en

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The invention discloses a method of speech-to-text conversion, the method comprising: obtaining an audio file; converting the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information; converting the recording markers in the audio file (the marks a user places while recording) into text marks; and inserting the text marks at the corresponding positions in the first text information to generate second text information. The invention also discloses a device for speech-to-text conversion. With the technical solution of the invention, the converted text is marked, which makes it convenient for people to view, edit, and otherwise work with the text.

Description

The method and device of speech-to-text conversion
Technical field
The present invention relates to the field of communication technology, and in particular to a method and device for speech-to-text conversion.
Background
With the rapid development of the information age, the information input and output functions of electronic devices have become increasingly important. People can record with a mobile phone or a voice recorder (or any other device with a recording function), which makes it easy to capture information. A marking function can also be used while recording. For example, when attending a lecture, a user can record while listening and place a marker at important content, and a recording file is generated at the end. When the user later plays the recording file back to review the lecture, playback can start directly from a marker, without listening to the entire recording. Similarly, during a meeting a user can record while the discussion proceeds and mark important content as it is recorded; when later reviewing the meeting through the recording file, playback can start directly from a marker. Speech recognition technology is increasingly widely used in the prior art, and converting a voice file into a text file for display has already been achieved. However, when converting a marked voice file into a text file, the prior art does not recognize the markers and simply converts the voice file into text. This makes the text file inconvenient to read and edit: a user who wants to see the content that was marked in the voice file (that is, the key points of the recording) cannot find it quickly and has to read slowly from the beginning of the text.
The above content is provided only to aid understanding of the technical solution of the present invention and does not constitute an admission that it is prior art.
Summary of the invention
The main purpose of the present invention is to provide a method and device for speech-to-text conversion that mark the converted text, making it convenient for people to view, edit, and otherwise work with the text.
To achieve the above object, the present invention provides a method of speech-to-text conversion, the method comprising:
Obtaining an audio file;
Converting the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information;
Converting the recording markers in the audio file into text marks;
Inserting the text marks at the corresponding positions in the first text information to generate second text information.
Preferably, the step of converting the recording markers in the audio file into text marks comprises:
Obtaining a recording marker in the audio file;
Looking up, in a preset mapping table of recording markers and text marks, the text mark corresponding to the obtained recording marker.
Preferably, after the step of inserting the text marks into the first text information to generate the second text information, the method further comprises:
Highlighting the text content between two identical and adjacent text marks in the second text information to generate third text information.
Preferably, the step of highlighting the text content between two identical and adjacent text marks in the second text information to generate the third text information comprises:
Reading the second text information sequentially;
If a text mark is currently read, determining whether the currently read text mark is identical to the previously read text mark;
If the currently read text mark is identical to the previously read text mark, highlighting the text content between the currently read text mark and the previously read text mark to generate the third text information.
Preferably, the step of highlighting the text content between the currently read text mark and the previously read text mark to generate the third text information comprises:
Looking up, in a preset mapping table of text marks and highlight modes, the highlight mode corresponding to the currently read text mark;
Highlighting the text content between the currently read text mark and the previously read text mark according to the highlight mode found, to generate the third text information.
In addition, to achieve the above object, the present invention also provides a device for speech-to-text conversion, comprising:
An acquisition module, for obtaining an audio file;
A first generation module, for converting the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information;
A first conversion module, for converting the recording markers in the audio file into text marks;
A second generation module, for inserting the text marks at the corresponding positions in the first text information to generate second text information.
Preferably, the first conversion module comprises:
A first acquisition unit, for obtaining a recording marker in the audio file;
A first search unit, for looking up, in a preset mapping table of recording markers and text marks, the text mark corresponding to the obtained recording marker.
Preferably, the device further comprises:
A third generation module, for highlighting the text content between two identical and adjacent text marks in the second text information to generate third text information.
Preferably, the third generation module comprises:
A reading unit, for reading the second text information sequentially;
A judging unit, for determining, when the reading unit currently reads a text mark, whether the currently read text mark is identical to the previously read text mark;
A highlight unit, for highlighting, when the currently read text mark is identical to the previously read text mark, the text content between the currently read text mark and the previously read text mark to generate the third text information.
Preferably, the highlight unit comprises:
A second search unit, for looking up, when the currently read text mark is identical to the previously read text mark, the highlight mode corresponding to the currently read text mark in a preset mapping table of text marks and highlight modes;
A highlight subunit, for highlighting the text content between the currently read text mark and the previously read text mark according to the highlight mode found by the second search unit, to generate the third text information.
The present invention obtains an audio file; converts the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information; converts the recording markers in the audio file into text marks; and inserts the text marks at the corresponding positions in the first text information to generate second text information. Because the recording markers in the audio file are converted into text marks when the audio file is converted into a text file, and the text marks are inserted at the corresponding positions in the first text information to generate the second text information, people can conveniently view, edit, and otherwise work with the converted text.
Brief description of the drawings
Fig. 1 is a flow diagram of a first embodiment of the speech-to-text conversion method of the present invention;
Fig. 2 is a detailed flow diagram of step S30 in Fig. 1;
Fig. 3 is a flow diagram of a second embodiment of the speech-to-text conversion method of the present invention;
Fig. 4 is a detailed flow diagram of step S50 in Fig. 3;
Fig. 5 is a detailed flow diagram of step S53 in Fig. 4;
Fig. 6 is a functional block diagram of a first embodiment of the speech-to-text conversion device of the present invention;
Fig. 7 is a functional block diagram of a second embodiment of the speech-to-text conversion device of the present invention;
Fig. 8 is a detailed structural diagram of the third generation module in Fig. 7.
The realization of the object, the functional features, and the advantages of the present invention will be further described with reference to the accompanying drawings and in conjunction with the embodiments.
Specific embodiments
It should be appreciated that the specific embodiments described herein are merely illustrative of the present invention, it is not intended to limit the present invention.
Referring to Fig. 1, Fig. 1 is a flow diagram of the first embodiment of the speech-to-text conversion method of the present invention.
The present invention provides a method of speech-to-text conversion, comprising:
S10, obtaining an audio file.
In step S10, the audio file can be obtained in a wired or wireless manner; for example, it can be downloaded from the Internet, such as a lecture audio file. The audio file includes recording markers.
S20, converting the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information.
In step S20, speech is converted into text by a speech-to-text (Speech To Text, STT) function or algorithm: the speech is extracted successively in the order of the audio file's time axis, each extracted piece of speech is converted into text, and the pieces of text produced by the conversion are combined into the first text information.
S30, converting the recording markers in the audio file into text marks.
In step S30, the recording markers in the audio file are converted into text marks. Text marks can take many styles, such as marks of various colors or icons of various shapes.
S40, inserting the text marks at the corresponding positions in the first text information to generate second text information.
In step S40, each text mark is inserted into the first text information at the position that corresponds to the position, in the audio file, of the recording marker from which it was converted, thereby generating the second text information. The second text information therefore includes both the text converted from speech and the text marks converted from recording markers.
In this embodiment of the present invention, during speech-to-text conversion, the speech contained in the audio file is converted into text to generate the first text information, the recording markers in the audio file are converted into text marks, and the converted text marks are then inserted at the corresponding positions in the first text information to generate the second text information. The generated second text information includes both the text converted from speech and the text marks converted from recording markers. The user can conveniently view and edit the second text information; for example, by looking for the text marks, the user can see at a glance where in the second text information the recording markers were placed, without reading through the second text information from the beginning.
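Step S40 can be illustrated with a minimal Python sketch. It assumes that both the recognized text and the text marks carry a time offset on the audio file's time axis; the `Segment` class, the `insert_text_marks` function, the sample sentences, and the "★" symbol are hypothetical names and data chosen for illustration and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    start: float  # seconds from the beginning of the audio file (assumed unit)
    text: str     # text recognized for this stretch of speech


def insert_text_marks(segments: List[Segment],
                      marks: List[Tuple[float, str]]) -> List[str]:
    """Merge recognized text and text marks into one time-ordered sequence.

    `marks` holds (time_in_seconds, text_mark) pairs. A mark is placed
    before the first segment that starts at or after the mark's time.
    """
    # Tag each item so that, at equal times, a mark (0) sorts before text (1).
    items = [(s.start, 1, s.text) for s in segments]
    items += [(t, 0, mark) for t, mark in marks]
    items.sort(key=lambda item: (item[0], item[1]))
    return [payload for _, _, payload in items]


# Hypothetical first text information and converted text marks.
segments = [
    Segment(0.0, "Welcome to the lecture."),
    Segment(4.0, "The key formula is as follows."),
    Segment(9.0, "That concludes this part."),
]
marks = [(4.0, "★"), (9.0, "★")]

second_text = insert_text_marks(segments, marks)
print(" ".join(second_text))
```

Sorting on the time offset is only one way to find the "corresponding position"; an implementation could equally map markers to character offsets in the first text information.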
Further, as shown in Fig. 2, step S30 includes:
S31, obtaining a recording marker in the audio file.
S32, looking up, in a preset mapping table of recording markers and text marks, the text mark corresponding to the obtained recording marker.
The mapping table of recording markers and text marks can be preset according to actual needs, as shown in Table 1.
Table 1:
Recording marker    Text mark
Marker A            Five-pointed star
Marker B            Red circle
Marker C            Green triangle
......              ......
If the recording marker obtained in step S31 is marker A, then in step S32 the lookup in the preset mapping table of recording markers and text marks finds that the text mark corresponding to marker A is a five-pointed star.
The mapping table of recording markers and text marks can also be updated at any time according to actual needs, so that it better matches the user's habits.
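The lookup in steps S31 and S32 can be sketched as a dictionary, assuming the recording markers are identified by short strings. The table contents follow Table 1; the identifiers "A", "B", "C", the function name `lookup_text_mark`, and the extra entry added at the end are hypothetical.

```python
from typing import Dict, Optional

# Preset mapping table of recording markers and text marks (values from Table 1).
# The keys "A", "B", "C" are assumed identifiers for the recording markers.
MARKER_TO_TEXT_MARK: Dict[str, str] = {
    "A": "five-pointed star",
    "B": "red circle",
    "C": "green triangle",
}


def lookup_text_mark(marker: str,
                     table: Dict[str, str] = MARKER_TO_TEXT_MARK) -> Optional[str]:
    """Return the text mark for a recording marker, or None if it is not mapped."""
    return table.get(marker)


print(lookup_text_mark("A"))

# The table can be updated at any time; this entry is purely illustrative.
MARKER_TO_TEXT_MARK["D"] = "blue square"
print(lookup_text_mark("D"))
```

Returning `None` for an unmapped marker is a design choice of this sketch; the patent does not say how an unknown marker should be handled.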
Referring to Fig. 3, Fig. 3 is a flow diagram of the second embodiment of the speech-to-text conversion method of the present invention.
Based on the first embodiment of the speech-to-text conversion method described above, after step S40 the method further includes:
S50, highlighting the text content between two identical and adjacent text marks in the second text information to generate third text information.
In step S50, the text content between two identical and adjacent text marks is highlighted. In other words, the second text information is edited automatically: the text corresponding to the speech between two identical and adjacent recording markers in the audio file is highlighted automatically. The highlight mode can be, for example, bold font or red font. When the user views or edits the third text information, the highlighted text content can be seen at a glance, which improves efficiency.
Further, as shown in Fig. 4, step S50 includes:
S51, reading the second text information sequentially.
S52, if a text mark is currently read, determining whether the currently read text mark is identical to the previously read text mark, and if so, executing step S53.
In step S52, if a text mark is currently read, the currently read text mark can be added to a list of read text marks, and the previously read text mark can be found in that list. It is then determined whether the currently read text mark is identical to the previously read text mark. If they are identical, step S53 is executed; if not, reading of the second text information continues from the position of the currently read text mark.
S53, highlighting the text content between the currently read text mark and the previously read text mark to generate third text information.
In step S53, the text content between the currently read text mark and the previously read text mark is highlighted. The second text information is thus edited automatically: the text corresponding to the speech between two identical and adjacent recording markers in the audio file is highlighted automatically. The highlight mode can be, for example, bold or red. When the user views or edits the third text information, the highlighted text content can be seen at a glance, which improves efficiency.
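Steps S51 to S53 can be sketched as a single pass over the second text information. The sketch assumes the second text information is available as a list of tokens in which text marks are distinguishable symbols; `TEXT_MARKS`, `find_highlight_spans`, and the sample tokens are hypothetical.

```python
from typing import List, Tuple

# Assumed symbols standing in for the text marks of Table 1.
TEXT_MARKS = {"★", "●", "▲"}


def find_highlight_spans(tokens: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, mark) ranges of content to highlight.

    `start` and `end` are token indexes (end exclusive) covering the content
    between two identical, adjacent text marks.
    """
    spans: List[Tuple[int, int, str]] = []
    read_marks: List[Tuple[int, str]] = []  # list of text marks read so far
    for i, token in enumerate(tokens):      # S51: read sequentially
        if token not in TEXT_MARKS:
            continue
        # S52: find the previously read mark, then record the current one.
        previous = read_marks[-1] if read_marks else None
        read_marks.append((i, token))
        if previous is not None and previous[1] == token:
            # S53: the content between the two identical marks is highlighted.
            spans.append((previous[0] + 1, i, token))
    return spans


tokens = ["intro", "★", "key point", "★", "aside", "●", "detail", "▲", "end"]
print(find_highlight_spans(tokens))
```

Following the description literally, every mark is compared with the mark read immediately before it, so three identical marks in a row would produce two highlighted ranges. Whether a matched pair should instead be consumed is not specified in the text.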
Further, as shown in Fig. 5, step S53 includes:
S531, looking up, in a preset mapping table of text marks and highlight modes, the highlight mode corresponding to the currently read text mark.
The mapping table of text marks and highlight modes can be preset according to actual needs, as shown in Table 2.
Table 2:
Text mark            Highlight mode
Five-pointed star    Bold font
Red circle           Red font
Green triangle       Green font
......               ......
If the currently read text mark is a red circle, the lookup in the preset mapping table of text marks and highlight modes finds that the highlight mode corresponding to the red circle is red font.
The mapping table of text marks and highlight modes can also be updated at any time according to actual needs, so that it better matches the user's habits.
S532, highlighting the text content between the currently read text mark and the previously read text mark according to the highlight mode found, to generate third text information.
In step S532, the text content between the currently read text mark and the previously read text mark is highlighted according to the highlight mode found in step S531. The second text information is thus edited automatically to generate the third text information. When the user views or edits the third text information, the highlighted text content can be seen at a glance, which improves efficiency.
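Steps S531 and S532 can be sketched as a second lookup followed by a rendering step. The mapping follows Table 2; rendering the highlight modes as HTML tags is an assumption of this sketch, since the patent does not name an output format, and `HIGHLIGHT_MODES` and `highlight` are hypothetical names.

```python
from typing import Callable, Dict

# Preset mapping table of text marks and highlight modes (from Table 2).
# Each mode is expressed as a function producing HTML, which is an assumption.
HIGHLIGHT_MODES: Dict[str, Callable[[str], str]] = {
    "five-pointed star": lambda s: f"<b>{s}</b>",
    "red circle": lambda s: f'<span style="color:red">{s}</span>',
    "green triangle": lambda s: f'<span style="color:green">{s}</span>',
}


def highlight(content: str, text_mark: str) -> str:
    """Apply the highlight mode mapped to `text_mark` (S531, S532).

    Content is returned unchanged when the text mark has no mapped mode.
    """
    render = HIGHLIGHT_MODES.get(text_mark)
    return render(content) if render else content


print(highlight("key point", "red circle"))
print(highlight("key point", "five-pointed star"))
```

Keeping the table separate from the rendering code mirrors the statement that the mapping table can be updated at any time to suit the user's habits.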
Referring to Fig. 6, Fig. 6 is a functional block diagram of the first embodiment of the speech-to-text conversion device of the present invention. The device includes:
An acquisition module 10, for obtaining an audio file;
A first generation module 20, for converting the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information;
A first conversion module 30, for converting the recording markers in the audio file into text marks;
A second generation module 40, for inserting the text marks at the corresponding positions in the first text information to generate second text information.
The acquisition module 10 can obtain the audio file in a wired or wireless manner; for example, the audio file can be downloaded from the Internet, such as a lecture audio file. The audio file includes recording markers.
The first generation module 20 converts speech into text by a speech-to-text (Speech To Text, STT) function or algorithm: it extracts the speech successively in the order of the audio file's time axis, converts each extracted piece of speech into text, and combines the pieces of text produced by the conversion into the first text information.
The first conversion module 30 converts the recording markers in the audio file into text marks. Text marks can take many styles, such as marks of various colors or icons of various shapes.
The second generation module 40 inserts each text mark into the first text information at the position that corresponds to the position, in the audio file, of the recording marker from which it was converted, thereby generating the second text information. The second text information therefore includes both the text converted from speech and the text marks converted from recording markers.
In this embodiment of the present invention, during speech-to-text conversion, the first generation module 20 converts the speech contained in the audio file into text to generate the first text information, the first conversion module 30 converts the recording markers in the audio file into text marks, and the second generation module 40 then inserts the converted text marks at the corresponding positions in the first text information to generate the second text information. The generated second text information includes both the text converted from speech and the text marks converted from recording markers. The user can conveniently view and edit the second text information; for example, by looking for the text marks, the user can see at a glance where in the second text information the recording markers were placed, without reading through the second text information from the beginning.
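The way modules 10 to 40 hand data to one another can be sketched as a small class whose fields are pluggable callables. This is only one possible decomposition; `SpeechToTextDevice`, its field names, the `merge` helper, and the stub lambdas that stand in for real audio access and speech recognition are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Audio = bytes                        # placeholder type for audio data
TimedText = List[Tuple[float, str]]  # (time offset in seconds, text or text mark)


@dataclass
class SpeechToTextDevice:
    obtain: Callable[[str], Audio]                                # acquisition module 10
    generate_first: Callable[[Audio], TimedText]                  # first generation module 20
    convert_marks: Callable[[Audio], TimedText]                   # first conversion module 30
    generate_second: Callable[[TimedText, TimedText], List[str]]  # second generation module 40

    def run(self, source: str) -> List[str]:
        audio = self.obtain(source)
        first_text = self.generate_first(audio)
        text_marks = self.convert_marks(audio)
        return self.generate_second(first_text, text_marks)


def merge(text: TimedText, marks: TimedText) -> List[str]:
    # At equal times a text mark (0) is placed before the text (1).
    tagged = [(t, 1, s) for t, s in text] + [(t, 0, m) for t, m in marks]
    return [payload for _, _, payload in sorted(tagged)]


# Stub modules with fixed data; a real device would decode audio and run STT.
device = SpeechToTextDevice(
    obtain=lambda source: b"",
    generate_first=lambda audio: [(0.0, "Hello."), (3.0, "Important part.")],
    convert_marks=lambda audio: [(3.0, "★")],
    generate_second=merge,
)
result = device.run("lecture.wav")
print(result)
```

Passing the modules in as callables keeps the sketch independent of any particular recognizer or audio format, neither of which the patent specifies.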
Further, the first conversion module 30 includes: a first acquisition unit 31, for obtaining a recording marker in the audio file; and a first search unit 32, for looking up, in a preset mapping table of recording markers and text marks, the text mark corresponding to the obtained recording marker.
The mapping table of recording markers and text marks can be preset according to actual needs, as shown in Table 1 above.
If the recording marker obtained by the first acquisition unit 31 is marker A, the first search unit 32 looks it up in the preset mapping table of recording markers and text marks and finds that the text mark corresponding to marker A is a five-pointed star.
The mapping table of recording markers and text marks can also be updated at any time according to actual needs, so that it better matches the user's habits.
Referring to Fig. 7, Fig. 7 is a functional block diagram of the second embodiment of the speech-to-text conversion device of the present invention.
Based on the first embodiment of the speech-to-text conversion device of the present invention described above, the device further includes:
A third generation module 50, for highlighting the text content between two identical and adjacent text marks in the second text information to generate third text information.
The third generation module 50 highlights the text content between two identical and adjacent text marks. In other words, it edits the second text information automatically: the text corresponding to the speech between two identical and adjacent recording markers in the audio file is highlighted automatically. The highlight mode can be, for example, bold font or red font. When the user views or edits the third text information, the highlighted text content can be seen at a glance, which improves efficiency.
Further, as shown in Fig. 8, the third generation module 50 includes:
A reading unit 51, for reading the second text information sequentially;
A judging unit 52, for determining, when the reading unit 51 currently reads a text mark, whether the currently read text mark is identical to the previously read text mark;
A highlight unit 53, for highlighting, when the currently read text mark is identical to the previously read text mark, the text content between the currently read text mark and the previously read text mark to generate third text information.
If the reading unit 51 currently reads a text mark, the judging unit 52 can add the currently read text mark to a list of read text marks, find the previously read text mark in that list, and then determine whether the currently read text mark is identical to the previously read text mark. If they are not identical, the reading unit 51 continues reading the second text information from the position of the currently read text mark.
When the currently read text mark is identical to the previously read text mark, the highlight unit 53 highlights the text content between the currently read text mark and the previously read text mark. The second text information is thus edited automatically: the text corresponding to the speech between two identical and adjacent recording markers in the audio file is highlighted automatically. The highlight mode can be, for example, bold or red. When the user views or edits the third text information, the highlighted text content can be seen at a glance, which improves efficiency.
Further, the highlight unit 53 includes:
A second search unit, for looking up, when the currently read text mark is identical to the previously read text mark, the highlight mode corresponding to the currently read text mark in a preset mapping table of text marks and highlight modes;
A highlight subunit, for highlighting the text content between the currently read text mark and the previously read text mark according to the highlight mode found by the second search unit, to generate third text information.
The mapping table of text marks and highlight modes can be preset according to actual needs, as shown in Table 2 above.
If the currently read text mark is a red circle, the second search unit looks it up in the preset mapping table of text marks and highlight modes and finds that the highlight mode corresponding to the red circle is red font.
The mapping table of text marks and highlight modes can also be updated at any time according to actual needs, so that it better matches the user's habits.
The highlight subunit highlights the text content between the currently read text mark and the previously read text mark according to the highlight mode found by the second search unit. The second text information is thus edited automatically to generate the third text information. When the user views or edits the third text information, the highlighted text content can be seen at a glance, which improves efficiency.
The above are only preferred embodiments of the present invention and do not limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A method of speech-to-text conversion, characterized in that the method comprises:
Obtaining an audio file;
Converting the speech contained in the audio file into text in the order of the audio file's time axis to generate first text information;
Converting the recording markers in the audio file into text marks;
Inserting each text mark at the corresponding position in the first text information according to the position, in the audio file, of the recording marker corresponding to the text mark, to generate second text information, wherein the second text information includes both the text converted from speech and the text marks converted from recording markers.
2. The method of speech-to-text conversion as claimed in claim 1, characterized in that the step of converting the recording markers in the audio file into text marks comprises:
Obtaining a recording marker in the audio file;
Looking up, in a preset mapping table of recording markers and text marks, the text mark corresponding to the obtained recording marker.
3. The method of speech-to-text conversion as claimed in claim 2, characterized in that, after the step of inserting the text marks into the first text information to generate the second text information, the method further comprises:
Highlighting the text content between two identical and adjacent text marks in the second text information to generate third text information.
4. The method of speech-to-text conversion as claimed in claim 3, characterized in that the step of highlighting the text content between two identical and adjacent text marks in the second text information to generate the third text information comprises:
Reading the second text information sequentially;
If a text mark is currently read, determining whether the currently read text mark is identical to the previously read text mark;
If the currently read text mark is identical to the previously read text mark, highlighting the text content between the currently read text mark and the previously read text mark to generate the third text information.
5. The method of speech-to-text conversion as claimed in claim 4, characterized in that the step of highlighting the text content between the currently read text mark and the previously read text mark to generate the third text information comprises:
Looking up, in a preset mapping table of text marks and highlight modes, the highlight mode corresponding to the currently read text mark;
Highlighting the text content between the currently read text mark and the previously read text mark according to the highlight mode found, to generate the third text information.
6. a kind of device of speech-to-text conversion characterized by comprising
Module is obtained, for obtaining audio file;
First generation module, for being turned the voice contained in the audio file according to the time shaft sequence of the audio file Text is changed to generate the first text information;
First conversion module is converted to text mark for getting the recording in the audio file ready label;
Second generation module, for getting label ready in the position of audio file, by institute according to the corresponding recording of the text mark The corresponding position that text mark is inserted into first text information is stated, to generate the second text information, wherein described second Text information had not only included the text being converted by voice, but also included the text mark getting label ready by recording and being converted into.
7. the device of speech-to-text as claimed in claim 6 conversion, which is characterized in that first conversion module includes:
First acquisition unit gets label ready for obtaining the recording in the audio file;
First searching unit searches the record of the acquisition for getting label and text mark mapping table ready according to preset recording Sound gets the corresponding text mark of label ready.
8. The speech-to-text conversion device of claim 7, wherein the device further comprises:
a third generation module, configured to highlight the text content between two identical, adjacent text marks in the second text information, to generate third text information.
9. The speech-to-text conversion device of claim 8, wherein the third generation module comprises:
a reading unit, configured to read the second text information sequentially;
a judging unit, configured to determine, when the reading unit reads a text mark, whether the currently read text mark is identical to the previously read text mark; and
a highlighting unit, configured to highlight, when the currently read text mark is identical to the previously read text mark, the text content between the currently read text mark and the previously read text mark, to generate the third text information.
10. The speech-to-text conversion device of claim 9, wherein the highlighting unit comprises:
a second lookup unit, configured to look up, when the currently read text mark is identical to the previously read text mark, the highlighting mode corresponding to the currently read text mark in a preset mapping table of text marks to highlighting modes; and
a highlighting subunit, configured to highlight the text content between the currently read text mark and the previously read text mark according to the highlighting mode found by the second lookup unit, to generate the third text information.
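The modules recited in claims 6 to 10 can be illustrated with a minimal Python sketch. It is not the patented implementation: the two mapping tables, the `Word` type, the mark strings, and the highlighting templates are all hypothetical, and speech recognition is assumed to have already produced time-stamped words. The sketch follows the literal reading of claim 9, in which each text mark is compared only with the text mark read immediately before it.

```python
from dataclasses import dataclass

# Hypothetical "recording dotting mark -> text mark" mapping table (claim 7).
DOT_TO_TEXT_MARK = {1: "[KEY]", 2: "[TODO]"}
# Hypothetical "text mark -> highlighting mode" mapping table (claim 10).
MARK_TO_HIGHLIGHT = {"[KEY]": "**{}**", "[TODO]": "__{}__"}


@dataclass
class Word:
    time: float  # position on the audio file's time axis, in seconds
    text: str    # recognized word (first text information)


def insert_text_marks(words, dot_marks):
    """Second generation module: merge words and text marks by audio position.

    dot_marks is a list of (time, dot_id) pairs; each dot_id is converted
    to its text mark through DOT_TO_TEXT_MARK.
    """
    events = [(w.time, 1, w.text) for w in words]
    events += [(t, 0, DOT_TO_TEXT_MARK[d]) for t, d in dot_marks]
    # Sort by time; at equal times a mark (kind 0) precedes a word (kind 1).
    events.sort(key=lambda e: (e[0], e[1]))
    return [text for _, _, text in events]


def highlight_between_marks(tokens):
    """Third generation module: highlight text between identical adjacent marks."""
    out = []
    last_mark = None        # text mark read most recently
    last_mark_index = None  # its index in `out`
    for tok in tokens:
        if tok in MARK_TO_HIGHLIGHT:
            if tok == last_mark:
                start = last_mark_index + 1
                span = out[start:]
                if span:
                    # Look up the highlighting mode and apply it to the span.
                    out[start:] = [MARK_TO_HIGHLIGHT[tok].format(" ".join(span))]
            out.append(tok)
            last_mark, last_mark_index = tok, len(out) - 1
        else:
            out.append(tok)
    return out


words = [Word(0.0, "hello"), Word(1.0, "important"),
         Word(2.0, "point"), Word(3.0, "bye")]
second_text = insert_text_marks(words, [(0.5, 1), (2.5, 1)])
third_text = highlight_between_marks(second_text)
print(" ".join(third_text))
```

In this sketch a mark that closes one highlighted span can also open the next one, because the comparison is always against the last mark read; the claims do not say whether that is intended, so it should be treated as an assumption of the example.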
CN201510126575.2A 2015-03-20 2015-03-20 The method and device of speech-to-text conversion Active CN104751846B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510126575.2A CN104751846B (en) 2015-03-20 2015-03-20 The method and device of speech-to-text conversion

Publications (2)

Publication Number Publication Date
CN104751846A CN104751846A (en) 2015-07-01
CN104751846B true CN104751846B (en) 2019-03-01

Family

ID=53591408

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510126575.2A Active CN104751846B (en) 2015-03-20 2015-03-20 The method and device of speech-to-text conversion

Country Status (1)

Country Link
CN (1) CN104751846B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105653729B (en) * 2016-01-28 2019-10-08 努比亚技术有限公司 A kind of device and method of recording file index
CN106067302B (en) * 2016-05-27 2019-06-25 努比亚技术有限公司 Denoising device and method
CN106341204B (en) * 2016-09-29 2019-02-22 北京小米移动软件有限公司 Audio-frequency processing method and device
CN106571137A (en) * 2016-10-28 2017-04-19 努比亚技术有限公司 Terminal voice dotting control device and method
CN107181849A (en) * 2017-04-19 2017-09-19 北京小米移动软件有限公司 The way of recording and device
CN106911832B (en) * 2017-04-28 2020-06-02 四川音创伟业科技有限公司 Voice recording method and device
CN109243469B (en) * 2017-12-13 2021-12-10 中国航空工业集团公司北京航空精密机械研究所 Digital detection information acquisition system
CN108647190B (en) * 2018-04-25 2022-04-29 北京华夏电通科技股份有限公司 Method, device and system for inserting voice recognition text into script document
CN109545187A (en) * 2018-11-21 2019-03-29 维沃移动通信有限公司 A kind of display control method and terminal
CN115237316A (en) * 2022-06-06 2022-10-25 华为技术有限公司 Audio track marking method and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1469370A (en) * 2002-06-26 2004-01-21 日本胜利株式会社 Text data recording method and apparatus
CN1652205A (en) * 2004-01-14 2005-08-10 索尼株式会社 Audio signal processing apparatus and audio signal processing method
CN1822189A (en) * 2006-03-02 2006-08-23 无敌科技(西安)有限公司 Content identifying method for digital recorded file
CN101253549A (en) * 2005-08-26 2008-08-27 皇家飞利浦电子股份有限公司 System and method for synchronizing sound and manually transcribed text
CN103247289A (en) * 2012-02-01 2013-08-14 鸿富锦精密工业(深圳)有限公司 Recording system, recording method, sound inputting device, voice recording device and voice recording method
CN103400592A (en) * 2013-07-30 2013-11-20 北京小米科技有限责任公司 Recording method, playing method, device, terminal and system
CN103399865A (en) * 2013-07-05 2013-11-20 华为技术有限公司 Method and device for multi-media file generation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6353809B2 (en) * 1997-06-06 2002-03-05 Olympus Optical, Ltd. Speech recognition with text generation from portions of voice data preselected by manual-input commands
EP2816549B1 (en) * 2013-06-17 2016-08-03 Yamaha Corporation User bookmarks by touching the display of a music score while recording ambient audio

Also Published As

Publication number Publication date
CN104751846A (en) 2015-07-01

Similar Documents

Publication Publication Date Title
CN104751846B (en) The method and device of speech-to-text conversion
US20200294487A1 (en) Hands-free annotations of audio text
US11133025B2 (en) Method and system for speech emotion recognition
CN104240703B (en) Voice information processing method and device
CN110751943A (en) Voice emotion recognition method and device and related equipment
CN109254669B (en) Expression picture input method and device, electronic equipment and system
CN109545184B (en) Recitation detection method based on voice calibration and electronic equipment
US20120196260A1 (en) Electronic Comic (E-Comic) Metadata Processing
CN109410664A (en) A kind of pronunciation correction method and electronic equipment
CN104867494B (en) The name sorting technique and system of a kind of recording file
KR102076793B1 (en) Method for providing electric document using voice, apparatus and method for writing electric document using voice
CN109710949A (en) A kind of interpretation method and translator
CN111292751A (en) Semantic analysis method and device, voice interaction method and device, and electronic equipment
CN104252872B (en) Lyric generating method and intelligent terminal
CN107240394A (en) A kind of dynamic self-adapting speech analysis techniques for man-machine SET method and system
CN110111778A (en) A kind of method of speech processing, device, storage medium and electronic equipment
CN106484134A (en) The method and device of the phonetic entry punctuation mark based on Android system
CN106297841A (en) A kind of audio frequency is with reading bootstrap technique and device
CN105956014A (en) Music playing method based on deep learning
CN112053692A (en) Speech recognition processing method, device and storage medium
CN110767233A (en) Voice conversion system and method
KR20190143116A (en) Talk auto-recording apparatus method
CN110047473B (en) Man-machine cooperative interaction method and system
CN107331396A (en) Export the method and device of numeral
CN106911832A (en) A kind of method and device of voice record

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant