CN114863907A - Marking method and device for text-to-speech processing - Google Patents

Marking method and device for text-to-speech processing

Info

Publication number
CN114863907A
Authority
CN
China
Prior art keywords
mark
text
target
marking
target text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210791141.4A
Other languages
Chinese (zh)
Other versions
CN114863907B (en)
Inventor
刘丹
王荔
汤跃忠
陈龙
杨静波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Third Research Institute Of China Electronics Technology Group Corp
Beijing Zhongdian Huisheng Technology Co ltd
Original Assignee
Third Research Institute Of China Electronics Technology Group Corp
Beijing Zhongdian Huisheng Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Third Research Institute Of China Electronics Technology Group Corp and Beijing Zhongdian Huisheng Technology Co ltd
Priority to CN202210791141.4A
Publication of CN114863907A
Application granted
Publication of CN114863907B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F3/0482 Interaction with lists of selectable items, e.g. menus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04842 Selection of displayed objects or displayed text elements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/117 Tagging; Marking up; Designating a block; Setting of attributes
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/033 Voice editing, e.g. manipulating the voice of the synthesiser

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses a marking method and a marking device for text-to-speech processing. The marking method comprises the following steps: providing a plurality of marking menu items, each marking menu item providing a marking tool for one type of function; selecting a first target text, and adding a mark of the corresponding function to the selected first target text based on a marking menu item; providing a temporary mark area; acquiring a copy instruction issued after a target mark has been added to the first target text, so as to temporarily store the target mark in the temporary mark area; and, when a second target text is selected, presenting the target mark in association with the second target text based on the temporary mark area, so that the target mark is assigned to the second target text after a confirmation instruction of the user is acquired. In the embodiments of the application, the mark that the user wishes to copy is temporarily stored in the temporary mark area, which enables mark copying, greatly reduces the number of interactions during marking, and improves marking efficiency in text-to-speech processing.

Description

Marking method and device for text-to-speech processing
Technical Field
The invention relates to the technical field of voice transcription, and in particular to a marking method and a marking device for text-to-speech processing.
Background
In text-to-speech software, adding pronunciation and prosody marks to the text can improve the accuracy and naturalness of the synthesized speech.
In prior-art marking methods, a mark icon in the text can be deleted, or clicked to open a pop-up window or pull-down menu, but the mark itself cannot be selected and copied. When the user needs to add the same mark at different positions, the user must click the function icon again and then select a value in the pull-down menu or enter it in the pop-up window. The marking process in the prior art therefore involves too many operation steps, and marking efficiency is low.
Disclosure of Invention
Embodiments of the invention provide a marking method and a marking device for text-to-speech processing that offer a mark copy function and greatly improve marking efficiency.
An embodiment of the invention provides a marking method for text-to-speech processing, comprising the following steps:
providing a plurality of marking menu items, each marking menu item providing a marking tool for one type of function;
selecting a first target text, and adding a mark of the corresponding function to the selected first target text based on a marking menu item;
providing a temporary mark area;
acquiring a copy instruction issued after a target mark has been added to the first target text, so as to temporarily store the target mark in the temporary mark area;
and, when a second target text is selected, presenting the target mark in association with the second target text based on the temporary mark area, so that the target mark is assigned to the second target text after a confirmation instruction of the user is acquired.
Optionally, the target mark is presented in association with the second target text by means of a mark pop-up window;
and when the mark pop-up window is closed, or the mark assigned to the second target text differs from the target mark in the temporary mark area, or the operation performed on the second target text is inconsistent with the precondition operation corresponding to the target mark in the temporary mark area, the mark pop-up window is no longer displayed when text following the second target text is selected.
Optionally, the method further includes: after a third target text is selected, acquiring a selection instruction for the temporary mark area, so as to assign the target mark in the temporary mark area to the third target text.
Optionally, the plurality of marking menu items includes at least: pause marks, continuous-reading marks, polyphone marks, local volume marks, reread marks, and alias marks.
Optionally, at least some of the plurality of marking menu items provide corresponding custom functions.
Optionally, after adding the mark of the corresponding function to the selected first target text, the method further includes:
acquiring a click operation on the mark of the first target text, so as to modify the mark of the first target text.
Optionally, the method further includes:
providing a delete key in association with the mark of the first target text, so that the corresponding mark can be deleted by means of the delete key.
The present application further provides a marking device for text-to-speech processing, which includes a processor and a memory. The memory stores a computer program that, when executed by the processor, implements the steps of the aforementioned marking method for text-to-speech processing.
The present application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the aforementioned marking method for text-to-speech processing.
Embodiments of the invention provide a plurality of marking menu items and a temporary mark area, so that the mark the user wishes to copy is temporarily stored in the temporary mark area. This enables mark copying, greatly reduces the number of interactions during marking, and improves marking efficiency in text-to-speech processing.
The foregoing is only an overview of the technical solutions of the present invention. Specific embodiments are described below so that the technical means of the present invention can be understood more clearly, and so that the above and other objects, features, and advantages of the present invention become more readily apparent.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a basic flow diagram of a marking method according to an embodiment of the present application;
FIG. 2 is an example of mark copying according to an embodiment of the present application;
FIG. 3 is an example of mark pasting based on the temporary mark area according to an embodiment of the present application;
FIG. 4 is an example of the state after the mark pop-up window is closed according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Speech synthesis software is mainly used to synthesize text into audio. However, the accuracy and naturalness of existing machine speech synthesis are still insufficient, so various pronunciation marks and prosody adjustment marks need to be added to the text to improve the synthesis result, such as "alias", "polyphone", "pause", "continuous reading", "rereading", "local speed change", and "local volume" marks.
An embodiment of the invention provides a marking method for text-to-speech processing, comprising the following steps:
In step S101, a plurality of marking menu items are provided, each providing a marking tool for one type of function. Text marks are used to correct pronunciation and prosody problems in machine-synthesized speech, such as incorrect pronunciation or unnatural pauses. Referring to FIG. 2, in some examples the plurality of marking menu items includes at least: pause marks, continuous-reading marks, polyphone marks, local volume marks, reread marks, and alias marks.
In step S102, a first target text is selected, and a mark of the corresponding function is added to the selected first target text based on a marking menu item. In actual use, the user clicks a marking menu item to apply the mark of the corresponding function. For example, clicking the polyphone mark allows a pronunciation to be set for the selected text. The other marking tools are used similarly and are not described again here.
In step S103, a temporary mark area is provided. See the "temporary marks" region in FIG. 2; in FIG. 2 the temporary mark area does not yet contain any mark.
In step S104, a copy instruction issued after a target mark has been added to the first target text is acquired, so as to temporarily store the target mark in the temporary mark area. Referring to FIG. 2 and FIG. 3, for example, after the mark "0.9x" is copied in FIG. 2, the mark "0.9x" is temporarily stored in the temporary mark area, so that it can subsequently be pasted from the temporary mark area. The copy action itself can be performed with the right mouse button or the shortcut key Ctrl+C.
In step S105, when a second target text is selected, the target mark is presented in association with the second target text based on the temporary mark area, so that the target mark is assigned to the second target text after a confirmation instruction of the user is acquired. The second target text in this example may be text that follows the first target text in the passage. FIG. 3 shows an example of mark pasting based on the temporary mark area: after the second target text is selected, the mark in the temporary mark area, for example "0.9x" in FIG. 3, is presented in association with the selected text. In this way, when the same mark needs to be used repeatedly, the user can copy it via the temporary mark area, which greatly improves the efficiency of adding marks in text-to-speech processing.
Embodiments of the invention provide a plurality of marking menu items and a temporary mark area, so that the mark the user wishes to copy is temporarily stored in the temporary mark area. This enables mark copying, greatly reduces the number of interactions during marking, and improves marking efficiency in text-to-speech processing.
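Steps S101 to S105 can be illustrated with a minimal Python sketch. All names here (`Mark`, `TemporaryMarkArea`, `MarkedDocument`, and the mark kind `local_speed`) are hypothetical and do not come from the patent, and representing the selected text as character offsets is an assumption made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Span = Tuple[int, int]  # (start, end) character offsets; an assumed representation


@dataclass(frozen=True)
class Mark:
    kind: str   # hypothetical identifier, e.g. "local_speed" or "pause"
    value: str  # e.g. "0.9x"


@dataclass
class TemporaryMarkArea:
    """S103: holds at most one copied mark."""
    stored: Optional[Mark] = None


@dataclass
class MarkedDocument:
    text: str
    marks: Dict[Span, Mark] = field(default_factory=dict)
    temp_area: TemporaryMarkArea = field(default_factory=TemporaryMarkArea)

    def add_mark(self, span: Span, mark: Mark) -> None:
        """S102: add a mark to the selected first target text."""
        self.marks[span] = mark

    def copy_mark(self, span: Span) -> None:
        """S104: a copy instruction stores the span's mark in the temporary area."""
        self.temp_area.stored = self.marks[span]

    def offer_mark(self, span: Span) -> Optional[Mark]:
        """S105: the mark to present next to a newly selected span, if any."""
        return self.temp_area.stored

    def confirm_paste(self, span: Span) -> bool:
        """S105: after user confirmation, assign the stored mark to the span."""
        if self.temp_area.stored is None:
            return False
        self.marks[span] = self.temp_area.stored
        return True


doc = MarkedDocument("not innocent from our deep")
doc.add_mark((13, 17), Mark("local_speed", "0.9x"))  # mark the word "from"
doc.copy_mark((13, 17))                              # copy instruction on that mark
if doc.offer_mark((22, 26)) is not None:             # user selects "deep"
    doc.confirm_paste((22, 26))                      # user confirms the offered mark
```

In this sketch the copied mark stays in the temporary area after pasting, so the same mark can be offered again for later selections without reopening the marking menu.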
Optionally, the target mark is presented in association with the second target text by means of a mark pop-up window. For example, in FIG. 3, the copied mark icon "0.9x" automatically pops up near the cursor. Clicking this floating icon pastes the mark, after which the pop-up window disappears.
When the mark pop-up window is closed, or the mark assigned to the second target text differs from the target mark in the temporary mark area, or the operation performed on the second target text (selecting a text region or positioning the cursor) is inconsistent with the precondition operation corresponding to the target mark in the temporary mark area, the mark pop-up window is no longer displayed when text following the second target text is selected.
Specifically, in the example of FIG. 4, in the text segment "not innocent from our deep," the mark "0.9x" was pasted onto the preceding target text "from" via the mark pop-up window. The user then closed the mark pop-up window by clicking the "X" in it, so no pop-up window appears when the subsequent target text "deep" is selected. In other examples, if the user assigns "deep" a mark other than "0.9x", the pop-up window is no longer displayed. Likewise, if the user does not select text but only clicks within the text, the pop-up window is not displayed.
In other examples, when the operation on the second target text is inconsistent with the precondition operation corresponding to the target mark in the temporary mark area, the mark pop-up window is not displayed when text following the second target text is selected. A precondition operation is the operation that must precede a given mark. For example, the polyphone mark is applied by selecting a single character and then clicking the "polyphone" mark, so selecting a single character is the precondition operation of the polyphone mark. The pause mark is applied by clicking a position in the text to position the cursor and then clicking the "pause" mark, so cursor positioning is the precondition operation of the pause mark. In this way, the method better fits the user's usage scenario: the mark pop-up window appears only at an appropriate time, which improves the user's marking efficiency.
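The three conditions under which the mark pop-up window stops appearing can be sketched as a small decision routine. This is a hedged illustration only: the class `PopupPolicy`, the `PRECONDITION` table, and the operation names `"select"` and `"cursor"` are hypothetical. Two behaviours are assumptions not stated in the source: a new copy re-enables the pop-up window, and a precondition mismatch hides the pop-up window for that operation only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical table: the user operation that must precede each mark kind.
PRECONDITION = {
    "pause": "cursor",        # cursor positioned in the text
    "polyphone": "select",    # text selected
    "local_speed": "select",  # text selected
}


@dataclass
class PopupPolicy:
    stored: Optional[Tuple[str, str]] = None  # (kind, value) in the temporary mark area
    suppressed: bool = False

    def on_copy(self, kind: str, value: str) -> None:
        self.stored = (kind, value)
        self.suppressed = False  # assumption: a fresh copy re-enables the pop-up

    def on_popup_closed(self) -> None:
        self.suppressed = True  # the user dismissed the pop-up

    def on_mark_applied(self, kind: str, value: str) -> None:
        if (kind, value) != self.stored:
            self.suppressed = True  # the user chose a different mark

    def should_show(self, operation: str) -> bool:
        """Decide whether to show the pop-up after `operation` ("select" or "cursor")."""
        if self.stored is None or self.suppressed:
            return False
        return PRECONDITION.get(self.stored[0]) == operation


policy = PopupPolicy()
policy.on_copy("local_speed", "0.9x")
show_after_selection = policy.should_show("select")
show_after_click = policy.should_show("cursor")
```

Keeping the suppression flag separate from the stored mark means the pop-up window can be silenced without discarding the copied mark.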
In some embodiments, the method further includes: after a third target text is selected, acquiring a selection instruction for the temporary mark area, so as to assign the target mark in the temporary mark area to the third target text. Specifically, as shown in FIG. 4, after the mark pop-up window is closed, the temporary mark area still holds the mark "0.9x". During subsequent marking, the user can click the temporary mark area to retrieve the mark stored there and assign it to the selected third target text.
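A minimal sketch of this fallback path follows, assuming a hypothetical `EditorState` object in which closing the pop-up window only hides the prompt and leaves the stored mark untouched. The field and method names are illustrative and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class EditorState:
    stored_mark: Optional[str] = None  # content of the temporary mark area
    popup_open: bool = False
    marks: Dict[Tuple[int, int], str] = field(default_factory=dict)

    def close_popup(self) -> None:
        # Closing the pop-up hides the prompt only; the stored mark is kept.
        self.popup_open = False

    def click_temp_area(self, selection: Tuple[int, int]) -> bool:
        """Assign the stored mark to the current selection when the area is clicked."""
        if self.stored_mark is None:
            return False
        self.marks[selection] = self.stored_mark
        return True


state = EditorState(stored_mark="0.9x", popup_open=True)
state.close_popup()                       # user clicks "X" in the pop-up
applied = state.click_temp_area((22, 26)) # user selects text, then clicks the area
```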
In some embodiments, at least some of the plurality of marking menu items provide corresponding custom functions.
After selecting text by dragging or positioning the cursor, the user clicks a toolbar icon. Depending on the mark type, the mark is either generated directly, or a value must first be selected in a drop-down menu, entered in an input box, or adjusted to the desired value with a slider.
Pause marks: to insert a pause mark, position the cursor in the text, click the "pause" marking menu item, and either select an option from the pull-down menu that pops up (custom pause duration, no pause, 0.05 s, 0.1 s, 0.15 s, or 0.2 s) or enter a custom pause duration in the input box.
Continuous-reading marks: after selecting part of the text (more than two characters) by dragging, click the "continuous reading" marking menu item; the continuous-reading mark appears in the text immediately.
Polyphone marks: select a character by dragging, click the "polyphone" marking menu item, and either select a pinyin provided by the system in the pull-down menu or enter a custom pinyin in the input box.
Local volume marks: select the text whose volume needs adjusting, click the "local volume" marking menu item, and either adjust the volume to the desired value with the slider or enter a custom value in the input box and click the "confirm" button.
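The preset and custom options described above can be sketched as small constructor functions. This is an illustration under stated assumptions: the function names, the dictionary layout of a mark, and the tone-number pinyin notation (`"hang2"`) are hypothetical, and the "no pause" menu option and the volume slider are not modelled. Only the four preset pause durations come from the description.

```python
from typing import Optional

PAUSE_PRESETS_S = (0.05, 0.1, 0.15, 0.2)  # preset durations listed in the description


def pause_mark(preset: Optional[float] = None, custom: Optional[float] = None) -> dict:
    """Build a pause mark from a preset duration or a custom duration, in seconds."""
    if (preset is None) == (custom is None):
        raise ValueError("give exactly one of preset or custom")
    if preset is not None and preset not in PAUSE_PRESETS_S:
        raise ValueError(f"not a preset duration: {preset}")
    seconds = preset if preset is not None else custom
    if seconds < 0:
        raise ValueError("pause duration must not be negative")
    return {"kind": "pause", "seconds": seconds}


def polyphone_mark(char: str, pinyin: str) -> dict:
    """Build a polyphone mark for one selected character."""
    if len(char) != 1:
        raise ValueError("select exactly one character")
    if not pinyin:
        raise ValueError("pinyin must not be empty")
    return {"kind": "polyphone", "char": char, "pinyin": pinyin}


preset_pause = pause_mark(preset=0.1)
custom_pause = pause_mark(custom=0.35)
reading = polyphone_mark("行", "hang2")  # 行 can be read xing2 or hang2
```

Validating the precondition (one character for a polyphone mark) inside the constructor keeps invalid marks from ever reaching the text.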
In some embodiments, after adding the mark of the corresponding function to the selected first target text, the method further includes: acquiring a click operation on the mark of the first target text, so as to modify the mark of the first target text. In some embodiments, the method further includes: providing a delete key in association with the mark of the first target text, so that the corresponding mark can be deleted by means of the delete key. Specifically, after a mark is generated, it can be modified by clicking the mark in the text, and it can be deleted by clicking the delete button on the mark.
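Modifying and deleting an existing mark can be sketched as two operations on a mapping from text spans to marks. The mapping layout and the function names `modify_mark` and `delete_mark` are assumptions made for illustration, not part of the patent.

```python
from typing import Dict, Tuple

Span = Tuple[int, int]       # (start, end) character offsets; an assumed representation
MarkEntry = Tuple[str, str]  # (kind, value)


def modify_mark(marks: Dict[Span, MarkEntry], span: Span, new_value: str) -> None:
    """Change the value of the mark on `span`, keeping its kind."""
    kind, _ = marks[span]  # raises KeyError if the span carries no mark
    marks[span] = (kind, new_value)


def delete_mark(marks: Dict[Span, MarkEntry], span: Span) -> bool:
    """Remove the mark on `span`; return True if a mark was removed."""
    return marks.pop(span, None) is not None


marks = {(13, 17): ("local_speed", "0.9x")}
modify_mark(marks, (13, 17), "1.1x")    # user clicks the mark and edits it
removed = delete_mark(marks, (13, 17))  # user clicks the delete button on the mark
```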
The method of this embodiment can copy marks quickly, intelligently recognize when the user is likely to paste a mark, and prompt the user to paste it. The most recently copied mark is stored in the temporary mark area, where the user can click it directly to use it. This reduces the time spent applying the same mark repeatedly, improves the efficiency of text marking, and thereby improves the efficiency of text-to-speech synthesis.
The present application further provides a marking device for text-to-speech processing, which includes a processor and a memory. The memory stores a computer program that, when executed by the processor, implements the steps of the aforementioned marking method for text-to-speech processing.
The present application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the aforementioned marking method for text-to-speech processing.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (9)

1. A marking method for text-to-speech processing, characterized by comprising the following steps:
providing a plurality of marking menu items, each marking menu item providing a marking tool for one type of function;
selecting a first target text, and adding a mark of the corresponding function to the selected first target text based on a marking menu item;
providing a temporary mark area;
acquiring a copy instruction issued after a target mark has been added to the first target text, so as to temporarily store the target mark in the temporary mark area;
and, when a second target text is selected, presenting the target mark in association with the second target text based on the temporary mark area, and assigning the target mark to the second target text after a confirmation instruction of the user is acquired.
2. The marking method for text-to-speech processing of claim 1, wherein the target mark is presented in association with the second target text by means of a mark pop-up window;
and when the mark pop-up window is closed, or the mark assigned to the second target text differs from the target mark in the temporary mark area, or the operation performed on the second target text is inconsistent with the precondition operation corresponding to the target mark in the temporary mark area, the mark pop-up window is no longer displayed when text following the second target text is selected.
3. The marking method for text-to-speech processing of claim 2, further comprising: after a third target text is selected, acquiring a selection instruction for the temporary mark area, so as to assign the target mark in the temporary mark area to the third target text.
4. The marking method for text-to-speech processing of claim 1, wherein the plurality of marking menu items includes at least: pause marks, continuous-reading marks, polyphone marks, local volume marks, reread marks, and alias marks.
5. The marking method for text-to-speech processing of claim 4, wherein at least some of the plurality of marking menu items provide corresponding custom functions.
6. The marking method for text-to-speech processing of claim 1, wherein, after adding the mark of the corresponding function to the selected first target text, the method further comprises:
acquiring a click operation on the mark of the first target text, so as to modify the mark of the first target text.
7. The marking method for text-to-speech processing of claim 6, further comprising:
providing a delete key in association with the mark of the first target text, so that the corresponding mark can be deleted by means of the delete key.
8. A marking device for text-to-speech processing, comprising a processor and a memory, the memory storing a computer program that, when executed by the processor, implements the steps of the marking method for text-to-speech processing of any one of claims 1 to 7.
9. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the marking method for text-to-speech processing of any one of claims 1 to 7.
CN202210791141.4A 2022-07-07 2022-07-07 Marking method and device for text-to-speech processing Active CN114863907B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210791141.4A CN114863907B (en) 2022-07-07 2022-07-07 Marking method and device for text-to-speech processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210791141.4A CN114863907B (en) 2022-07-07 2022-07-07 Marking method and device for text-to-speech processing

Publications (2)

Publication Number Publication Date
CN114863907A (en) 2022-08-05
CN114863907B (en) 2022-10-28

Family

ID=82627090

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210791141.4A Active CN114863907B (en) 2022-07-07 2022-07-07 Marking method and device for text-to-speech processing

Country Status (1)

Country Link
CN (1) CN114863907B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6457031B1 (en) * 1998-09-02 2002-09-24 International Business Machines Corp. Method of marking previously dictated text for deferred correction in a speech recognition proofreader
US20170004821A1 (en) * 2014-10-30 2017-01-05 Kabushiki Kaisha Toshiba Voice synthesizer, voice synthesis method, and computer program product
WO2017008646A1 (en) * 2015-07-13 2017-01-19 阿里巴巴集团控股有限公司 Method of selecting a plurality targets on touch control terminal and equipment utilizing same
CN108345674A (en) * 2018-02-11 2018-07-31 维沃移动通信有限公司 A kind of file management method and mobile terminal
CN110018762A (en) * 2019-03-15 2019-07-16 维沃移动通信有限公司 A kind of text clone method and mobile terminal
CN110109608A (en) * 2019-05-17 2019-08-09 北京达佳互联信息技术有限公司 Text display method, device, terminal and storage medium
CN110520859A (en) * 2017-04-05 2019-11-29 微软技术许可有限责任公司 More intelligent copy/paste
CN111142667A (en) * 2019-12-27 2020-05-12 苏州思必驰信息科技有限公司 System and method for generating voice based on text mark
CN112947826A (en) * 2021-01-28 2021-06-11 维沃移动通信有限公司 Information acquisition method and device and electronic equipment
CN112951396A (en) * 2021-03-29 2021-06-11 深圳市科曼医疗设备有限公司 Configuration method of surgical procedure and workstation for configuring surgical procedure
CN114023302A (en) * 2022-01-10 2022-02-08 北京中电慧声科技有限公司 Text speech processing device and text pronunciation processing method
CN114063841A (en) * 2021-11-10 2022-02-18 维沃移动通信有限公司 Text selection method, text selection device and electronic equipment
CN114385946A (en) * 2020-10-22 2022-04-22 腾讯科技(深圳)有限公司 Data structure editing method and device, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
岳东剑等: "面向文语转换的标记语言标准的研究", 《计算机辅助工程》 *

Also Published As

Publication number Publication date
CN114863907B (en) 2022-10-28

Similar Documents

Publication Publication Date Title
US8438032B2 (en) System for tuning synthesized speech
US20140222424A1 (en) Method and apparatus for contextual text to speech conversion
US7487092B2 (en) Interactive debugging and tuning method for CTTS voice building
US7895531B2 (en) Floating command object
EP1490861B1 (en) Method, apparatus and computer program for voice synthesis
US8160881B2 (en) Human-assisted pronunciation generation
US7062437B2 (en) Audio renderings for expressing non-audio nuances
US20190362022A1 (en) Audio file labeling process for building datasets at scale
CN111142667A (en) System and method for generating voice based on text mark
CN114023302B (en) Text speech processing device and text pronunciation processing method
CN105810197A (en) Voice processing method, voice processing device and electronic device
CN114863906B (en) Method and device for marking alias of text-to-speech processing
CN114863907B (en) Marking method and device for text-to-speech processing
CN111199724A (en) Information processing method and device and computer readable storage medium
CN111145722B (en) Text processing method and device, computer storage medium and electronic equipment
JPH08263260A (en) Text readout method
CN111370011A (en) Method, device, system and storage medium for replacing audio
US11875764B2 (en) Data-driven autosuggestion within media content creation
CN106383847A (en) Page content processing method and device
JPH08272388A (en) Device and method for synthesizing voice
CN116153289A (en) Processing method and related device for speech synthesis marked text
US20240202239A1 (en) Transcript aggregaton for non-linear editors
JP2002268664A (en) Voice converter and program
US20210097975A1 (en) Information processing method, information processing device, and program
Manzara et al. Degas: a system for rule-based diphone speech synthesis.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant