CN111105778A - Speech synthesis method, speech synthesis device, computing equipment and storage medium - Google Patents
Speech synthesis method, speech synthesis device, computing equipment and storage medium
- Publication number
- Publication number: CN111105778A; Application number: CN201811270533.6A (CN201811270533A)
- Authority
- CN
- China
- Prior art keywords
- client
- audio data
- server
- network connection
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/002—Dynamic bit allocation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0823—Errors, e.g. transmission errors
- H04L43/0829—Packet loss
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/25—Flow control; Congestion control with rate being modified by the source upon detecting a change of network conditions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/28—Flow control; Congestion control in relation to timing considerations
- H04L47/283—Flow control; Congestion control in relation to timing considerations in response to processing delays, e.g. caused by jitter or round trip time [RTT]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/50—Queue scheduling
- H04L47/54—Loss aware scheduling
Abstract
The invention discloses a speech synthesis method, apparatus, computing device and storage medium. The speech synthesis method executed at a server comprises the following steps: performing text-to-speech conversion on text data in a client data packet from a client to obtain audio data corresponding to the text data; determining a code rate for compressing the audio data based on the network connection state between the client and the server; compressing the audio data at that code rate to obtain a compressed audio data packet; and returning the audio data packet to the client. By adjusting the compression code rate to suit a complex and changeable network environment, the whole speech synthesis process is kept smooth, so that the user obtains fluent playback of the voice data.
Description
Technical Field
The present invention relates to the field of speech processing technologies, and in particular, to a speech synthesis method, apparatus, computing device, and storage medium.
Background
Speech synthesis is a technique for generating artificial speech by mechanical or electronic means. TTS (text-to-speech) technology is a branch of speech synthesis; it converts text information, whether generated by a computer or input from outside, into intelligible and fluent audio output.
In many current speech synthesis services, the audio data synthesized by the server is simply transmitted directly to the client, which buffers it briefly and then plays it. This approach ignores the influence of the complex internet environment and of factors such as playback buffering and decoding buffering: the speech synthesis system as a whole cannot sense changes in network conditions and adaptively provide audio data of different quality.
Accordingly, there is still a need for an improved speech synthesis technique to provide a smooth audio data playback effect to the user.
Disclosure of Invention
The invention aims to provide a speech synthesis method and apparatus that adapt to different network environments and network conditions and give the user smooth playback of voice data.
According to an aspect of the present invention, there is provided a speech synthesis method performed at a server, including: performing text-to-speech conversion processing on text data in a client data packet from a client to obtain audio data corresponding to the text data; determining a code rate for compressing the audio data based on a network connection state between the client and the server; compressing the audio data based on the code rate to obtain an audio data packet subjected to compression processing; and returning the audio data packet to the client.
Optionally, the speech synthesis method may further include determining the network connection status based on a packet loss rate parameter from the client.
Optionally, the step of determining a code rate for compressing the audio data may include: and under the condition that the network connection state is good, determining a code rate for compressing the audio data based on the packet loss rate parameter.
Optionally, in the case that the network connection state is poor, the server does not perform text-to-speech conversion processing on the text data or does not perform compression processing on the audio data, and sends an instruction for generating the audio data offline to the client.
Optionally, the packet loss rate parameter is calculated by the client according to statistical information of previously received audio data packets.
According to another aspect of the present invention, there is also provided a server for speech synthesis, including: the first text-to-speech conversion unit is used for performing text-to-speech conversion processing on text data in a client data packet from a client to obtain audio data corresponding to the text data; a code rate determining unit, configured to determine a code rate for compressing the audio data based on a network connection state between the client and the server; the compression unit is used for compressing the audio data based on the code rate to obtain an audio data packet after compression; and the first transmission unit is used for returning the audio data packet to the client.
Optionally, the server may further include a first network status determining unit, configured to determine the network connection status based on a packet loss rate parameter from the client.
Optionally, the code rate determining unit determines the code rate for compressing the audio data based on the packet loss rate parameter when the network connection state is good.
Optionally, the server may further include an instruction control unit, configured to generate an instruction for generating the audio data offline in a case that the network connection status is poor, where in a case that the network connection status is poor, the first text-to-speech conversion unit does not perform text-to-speech conversion on the text data, or the compression unit does not perform compression on the audio data, and the first transmission unit sends the instruction for generating the audio data offline to the client.
According to another aspect of the present invention, there is also provided a speech synthesis method performed at a client, including: sending to the server the network connection state between the client and the server, or a parameter that can be used to determine that state; sending text data input by a user to the server; and receiving an audio data packet from the server, wherein the audio data packet comprises audio data corresponding to the text data, and the compression code rate of the audio data packet is related to the network connection state.
Optionally, the speech synthesis method may further include determining the network connection status based on a packet loss rate parameter.
Optionally, the text data is sent to the server when the network connection state is good.
Optionally, the method may further include: under the condition that the network connection state is poor, text-to-speech conversion processing is performed on the text data locally at the client; or responding to an instruction of generating audio data offline from a server, and locally performing text-to-speech conversion processing on the text data at the client.
Optionally, the parameter that can be used to determine the network connection status is a packet loss rate parameter.
Optionally, the method may further include: and calculating and updating the packet loss rate parameter based on the statistical information of the received audio data packet.
Optionally, the method may further include: and dynamically adjusting the size of a local jitter buffer area according to the packet loss rate parameter.
There is also provided, according to another aspect of the present invention, a client for speech synthesis, including: a second transmission unit, configured to send to the server the network connection state between the client and the server, or a parameter usable for determining that state; a third transmission unit, configured to send text data input by a user to the server; and a fourth transmission unit, configured to receive an audio data packet from the server, where the audio data packet includes audio data corresponding to the text data, and the compression code rate of the audio data packet is related to the network connection state.
Optionally, the client may further include: and the second network state determining unit is used for determining the network connection state based on the packet loss rate parameter.
Optionally, the third transmission unit sends the text data to the server when the network connection state is good.
Optionally, the client may further include: the second text-to-speech conversion unit is used for performing text-to-speech conversion processing on the text data locally at the client under the condition that the network connection state is poor; or in response to receiving an instruction for generating audio data offline from a server, the second text-to-speech conversion unit performs text-to-speech conversion processing on the text data locally at the client.
Optionally, the parameter that can be used to determine the network connection status is a packet loss rate parameter.
Optionally, the client may further include a calculating unit, configured to calculate and update the packet loss rate parameter based on the statistical information of the received audio data packets.
Optionally, the client may further include a jitter buffer adjusting unit, configured to dynamically adjust a size of a local jitter buffer according to the packet loss rate parameter.
According to another aspect of the present invention, there is also provided a computing device comprising: a processor; and a memory having executable code stored thereon, which when executed by the processor, causes the processor to perform the method as described above.
According to another aspect of the present invention, there is also provided a non-transitory machine-readable storage medium having stored thereon executable code, which when executed by a processor of an electronic device, causes the processor to perform the method as described above.
Thus, for a complex and changeable network environment, the invention introduces a network condition monitoring mechanism, a network congestion control mechanism, a dynamic local-buffer adjustment mechanism and the like to keep the whole speech synthesis process smooth, so that the user obtains fluent playback of voice data.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in greater detail exemplary embodiments thereof with reference to the attached drawings, in which like reference numerals generally represent like parts throughout.
FIG. 1 shows a schematic diagram of a speech synthesis system for implementing an embodiment of the invention.
FIG. 2 shows a schematic diagram of a speech synthesis system according to one embodiment of the invention.
Fig. 3 shows a flow diagram of a speech synthesis method according to an embodiment of the invention.
Fig. 4 shows a flow diagram of a speech synthesis method according to another embodiment of the invention.
Fig. 5 shows a schematic structural diagram of a server according to an embodiment of the present invention.
Fig. 6 shows a schematic structural diagram of a client according to an embodiment of the present invention.
FIG. 7 shows a schematic block diagram of a computing device in accordance with one embodiment of the present invention.
Detailed Description
Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As mentioned above, current speech synthesis services do not consider the influence of the complicated internet environment or of factors such as jitter buffering and decoding buffering, so the speech synthesis system as a whole cannot sense changes in network conditions and adaptively provide audio data of different quality.
At present, online speech synthesis services transmit the audio data synthesized by the server directly to the client, which buffers it briefly and then plays it. Such a scheme has the following disadvantages:
1) the internet environment is complex and network jitter varies, yet the playback buffer of the local client cannot change dynamically with network conditions. If the playback buffer is too small, audio playback stutters; if it is too large, memory is wasted and playback delay grows;
2) decoding buffering is lacking. If the decoding capability of the client is insufficient, decoded audio data may be unavailable at playback time, and playback stalls;
3) the network environment is complex and there is no QoS (Quality of Service) mechanism, so the transmission system as a whole cannot sense changes in network conditions and vary the quality of the transmitted audio data (i.e., audio data at different code rates) accordingly.
In view of this, the present invention provides a speech synthesis method and apparatus, which, aiming at a complex and variable network environment, ensure the smoothness of the whole speech synthesis process by introducing a series of mechanisms such as a network condition monitoring mechanism, a network congestion control mechanism, a local buffer dynamic adjustment mechanism, etc., so that a user can obtain smooth speech data playing.
The speech synthesis scheme of the present invention will be described in detail below with reference to the accompanying drawings and embodiments.
FIG. 1 shows a schematic diagram of a speech synthesis system for implementing an embodiment of the invention.
As shown in fig. 1, the speech synthesis system of the present invention may include at least one server 20 and a plurality of terminal devices 10. A terminal device 10 can exchange information with the server 20 via the network 40, and the server 20 can obtain the content required by the terminal device 10 by accessing the database 30. The mobile terminals (e.g., 10_1, 10_2, …, 10_N) may also communicate with each other via the network 40.
Network 40 may be a network for information transfer in a broad sense and may include one or more communication networks such as a wireless communication network, the internet, a private network, a local area network, a metropolitan area network, a wide area network, or a cellular data network, among others. In one embodiment, the network 40 may also include a satellite network, whereby the GPS signals of the terminal device 10 are transmitted to the server 20.
The server 20 is any server capable of providing, through the network, the information an interactive service requires. The server can receive a client data packet containing text data sent by the client, perform text-to-speech conversion on that text data to obtain the corresponding audio data, and send an audio data packet containing the audio data back to the client for playback.
In the following description, one mobile terminal (for example, terminal device 10_1) is selected for explanation, but it should be understood by those skilled in the art that the mobile terminals 10_1 … 10_N stand for the large number of mobile terminals present in a real network, and that the single server 20 and database 30 shown stand for the server- and database-side operation of the technical solution. The specific numbers of mobile terminals, servers and databases are for convenience of description only and imply no limitation on their type or location.
It should be noted that the underlying concepts of the exemplary embodiments of the present invention are not altered if additional modules are added or removed from the illustrated environments. In addition, although a bidirectional arrow from the database 30 to the server 20 is shown in the figure for convenience of explanation, it will be understood by those skilled in the art that the above-described data transmission and reception may be realized through the network 40.
FIG. 2 shows a schematic diagram of a speech synthesis system according to one embodiment of the invention.
As shown in fig. 2, the speech synthesis system of the present invention may include at least a client 10 and a server 20.
The client 10 may be a terminal device as shown in fig. 1, or may be an application client installed on the terminal device side.
The client 10 may include a component for receiving text input, which provides an external input interface through which a user may directly enter the text requiring speech synthesis.
The client 10 may include a network module, which is a component for network communication and can be used for network transmission, wherein the protocol format of the client and server communication may be in the form of data packets. In one embodiment, the network module can send a client data packet including text data input by a user to the server and can also receive an audio data packet returned by the server.
In a preferred embodiment, the client 10 may further include a Quality of Service (QoS) module, which is capable of calculating a network parameter indicating a network condition, such as a packet loss rate, according to the received packet statistics information, and adjusting a jitter buffer of the local client according to the current network condition to assist in adaptively playing the smooth audio data. The QoS module may preferably perform the above calculation and control by using a congestion control algorithm (GCC algorithm) (see the following description for a specific control scheme). Preferably, in order to calculate the packet loss rate, the data packet transmitted by the network may further include a sequence number of the data packet.
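The loss estimate described above can be sketched as follows. This is a minimal illustration of computing a packet loss rate from the sequence numbers carried in each audio data packet; the class and method names are illustrative, not taken from the patent.

```python
class LossRateEstimator:
    """Estimates packet loss from the sequence numbers of received
    audio data packets, as the QoS module above suggests."""

    def __init__(self):
        self.highest_seq = None  # highest sequence number seen so far
        self.received = 0        # packets actually received

    def on_packet(self, seq: int) -> None:
        """Record one received packet carrying sequence number `seq`."""
        self.received += 1
        if self.highest_seq is None or seq > self.highest_seq:
            self.highest_seq = seq

    def loss_rate(self) -> float:
        """Fraction of expected packets (seq 0..highest) that never arrived."""
        if self.highest_seq is None:
            return 0.0
        expected = self.highest_seq + 1
        return max(0.0, 1.0 - self.received / expected)
```

For example, receiving packets 0, 1, 3 and 4 (packet 2 lost) yields a loss rate of 0.2. A real QoS module would compute this over a sliding window rather than over the whole session.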
Further, the client 10 may further include an audio decoder, which may include a decoding buffer, and may be capable of decoding and placing the audio data returned by the server into the decoding buffer, so that the audio player may play the audio data obtained from the decoding buffer, thereby enabling the user to obtain smooth audio data playing.
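The decoding buffer just described can be modeled as a bounded queue between the decoder and the player. This is a minimal sketch under assumed names and sizes; it is not an API from the patent.

```python
from collections import deque

class DecodeBuffer:
    """Bounded queue of decoded audio frames: the decoder pushes,
    the player pops, so playback never waits on decoding."""

    def __init__(self, capacity: int = 16):
        self.frames = deque()
        self.capacity = capacity

    def push(self, frame: bytes) -> bool:
        """Store a decoded frame; refuse when full so the decoder can
        back off instead of overwriting unplayed audio."""
        if len(self.frames) >= self.capacity:
            return False
        self.frames.append(frame)
        return True

    def pop(self):
        """Player side: fetch the next frame, or None on underrun."""
        return self.frames.popleft() if self.frames else None
```

When `pop()` returns None, the player has underrun, which corresponds to the playback stall the Background section attributes to a missing decoding buffer.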
In a preferred embodiment, the client 10 may further include a speech synthesis module (not shown in the figure) capable of synthesizing, locally and offline at the client, the audio data corresponding to the text input by the user. This module can synthesize the audio data locally on its own initiative when the network condition is poor, or synthesize it locally and offline after receiving a control instruction from the server directing it to do so.
The server 20 may also include corresponding components corresponding to the client.
For example, the server 20 may also include a network module for network transmission. The network module can receive text data from the client and can also transmit audio data synthesized at the server to the client 10.
The server 20 may have a QoS module corresponding to the client, which can determine the current network condition (such as network connection status) according to the obtained network parameters, and then perform control related to network congestion based on the current network condition, so as to obtain audio data with different quality at the server.
In a preferred embodiment, the above calculation and control can be preferably performed by using a congestion control algorithm (GCC algorithm), and the QoS module on the client side performs control of network congestion in cooperation with the QoS module on the server side to control the quality of the voice-synthesized audio data according to the network conditions.
The network parameters (e.g., the packet loss rate) reflect the degree of network congestion; based on the GCC algorithm, the sending end (e.g., the server end) can control its compression code rate according to these parameters.
In a preferred embodiment, when the packet loss rate is small or zero, the network condition is good, and the compression code rate of the sending end can be increased, provided it does not exceed a preset maximum code rate; conversely, an increasing packet loss rate indicates that the network condition is deteriorating, and the code rate of the sending end (e.g., the server end) should be decreased. In all other cases, the compression code rate of the sending end is kept unchanged.
The network parameters (e.g., packet loss rate) used by the GCC algorithm may be calculated according to the statistical information of the data packets received by the receiving end (e.g., client), and then the receiving end returns the network parameters to the transmitting end.
After receiving the packet loss rate, the sending end may calculate its compression code rate according to the following formula (1-1): when the packet loss rate exceeds 0.1, the network is congested and the compression code rate of the sending end is reduced; when it is below 0.02, the network condition is good and the compression code rate is increased; in all other cases the compression code rate remains unchanged:

A_s(t_k) = A_s(t_{k-1}) · (1 − 0.5 · f_l(t_k)),  if f_l(t_k) > 0.1
A_s(t_k) = 1.05 · A_s(t_{k-1}),                  if f_l(t_k) < 0.02
A_s(t_k) = A_s(t_{k-1}),                         otherwise        (1-1)

where A_s(t_k) denotes the compression code rate of the sending end, f_l(t_k) is the packet loss rate parameter, k and k−1 are the sequence numbers of transmitted data packets, and t_k, t_{k-1} are the times at which the k-th and (k−1)-th data packets are transmitted, respectively.
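The piecewise rate update of formula (1-1) can be sketched directly. The 0.1 and 0.02 thresholds come from the text; the 0.5 and 1.05 factors follow the standard GCC loss-based rule, which the patent names but whose exact constants are an assumption here.

```python
def next_send_rate(prev_rate: float, loss: float, max_rate: float) -> float:
    """Loss-based send-rate update in the style of formula (1-1):
    cut on heavy loss, probe upward on negligible loss, else hold."""
    if loss > 0.1:           # congestion: reduce in proportion to loss
        rate = prev_rate * (1.0 - 0.5 * loss)
    elif loss < 0.02:        # good network: increase by 5%
        rate = prev_rate * 1.05
    else:                    # in between: keep the rate unchanged
        rate = prev_rate
    return min(rate, max_rate)  # never exceed the preset maximum code rate
```

With a previous rate of 100 kbps, a 20% loss rate yields 90 kbps, zero loss yields 105 kbps (capped at the preset maximum), and a 5% loss rate leaves the rate at 100 kbps.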
The server 20 may further include a speech synthesis module capable of synthesizing the received text data into audio data corresponding thereto.
Further, the server 20 may further include an audio encoder, which is capable of compressing the audio data obtained by the synthesis according to the compression code rate of the sending end calculated by the QoS module, so as to facilitate network transmission, that is, obtain audio data with different qualities.
In addition, although the text-to-speech processing capability of the server is stronger than that of the client, when the network state is poor the bottleneck of speech synthesis shifts from text-to-speech processing to network transmission, which degrades the user experience.
Therefore, in a preferred embodiment, the server 20 may also determine the current network condition according to the obtained network parameters, perform online speech synthesis at the server side when the network condition is good, and send a control instruction for performing speech synthesis to the client side when the network condition is poor, so as to perform offline speech synthesis at the client side, thereby avoiding delay, pause and the like caused by poor network, ensuring that the user obtains smooth audio data playback, and ensuring user experience.
Therefore, whether to fall back to conversion locally at the client when the network connection state is poor can be decided as a trade-off between server-side text-to-speech processing and network transmission.
In other words, when the experience delivered over a poor network (fluency, transmission speed, and so on) falls below the effect of local conversion, the network connection state is considered poor, and the text-to-speech conversion is performed locally by the client.
It should be understood by those skilled in the art that the text data of the present invention may include, but is not limited to, text data input by a user, and the text data of the present invention is suitable for any application scenario.
Taking a novel as an example: in an application scenario where a novel is synthesized and played online, the novel-reading module of the application may segment the text data of the novel, call the client input interface of the speech synthesis system, and pass the text data in; after receiving the text data, the server performs online speech synthesis and returns audio data to the client for playback. Throughout this process, the server can generate audio data of different quality in combination with the GCC algorithm, the QoS module of the client can dynamically adjust the jitter buffer as the network environment changes, and decoding buffering is introduced to keep the audio playing smoothly.
Furthermore, the speech synthesis scheme of the present invention can also be implemented as a speech synthesis method that can be executed separately on the server side and on the client side.
Fig. 3 shows a flow diagram of a speech synthesis method according to an embodiment of the invention. The speech synthesis method may be performed on the server side as shown in fig. 1.
As shown in fig. 3, in step S310, text-to-speech conversion processing is performed on text data in a client data packet from a client, so as to obtain audio data corresponding to the text data.
The client data packet may include, but is not limited to, text data input by a user and/or a network parameter (e.g., a packet loss rate) calculated by the client; the network parameter (e.g., the packet loss rate parameter) may be computed by the client from statistics of previously received audio data packets. For the text-to-speech conversion itself, any existing TTS procedure may be used, so it is not described further here.
In step S320, a bitrate for compressing the audio data is determined based on a network connection state between the client and the server.
The network connection state is a description of the network environment or network conditions. Here, the network connection state may be determined based on a packet loss rate parameter from a client. Further, a (transmitting end) code rate for compressing the synthesized audio data is determined based on the current network connection state. Wherein the code rate can be calculated by the above-described calculation formula (1-1).
Wherein the determining of the code rate for compressing the audio data may comprise: and under the condition that the network connection state is good, determining a code rate for compressing the audio data based on the packet loss rate parameter.
It should be understood that the present invention does not limit the order of the audio synthesis step and the code rate determination step: the two steps may be performed simultaneously, or the network connection state and the code rate may be determined first, with the text-to-speech conversion (i.e., online speech synthesis) performed at the server only when the current network state is good.
Then, in step S330, the audio data is compressed based on the code rate, so as to obtain an audio data packet after compression.
Finally, in step S340, the audio data packet is returned to the client.
In the opposite case, for example when the network condition is poor, the server may follow a flow that differs from the above steps.
For example, the server may first determine the network connection state between the client and the server based on the packet loss rate parameter from the client. If the network connection state is poor, the server neither performs text-to-speech conversion on the text data nor compresses audio data (i.e., steps S310 and S330 are skipped), and instead sends the client an instruction to generate the audio data offline in step S340.
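The server-side branching across steps S310-S340 can be sketched as follows; `synthesize`, `compress`, the loss threshold, and the instruction string are hypothetical placeholders rather than components named in the patent:

```python
LOSS_THRESHOLD = 0.2  # assumed cutoff between a "good" and a "poor" connection

def synthesize(text: str) -> bytes:
    # Stand-in for a real text-to-speech engine (step S310).
    return b"pcm:" + text.encode("utf-8")

def compress(audio: bytes, bitrate: int) -> dict:
    # Stand-in for an audio codec (step S330).
    return {"bitrate": bitrate, "data": audio}

def handle_request(text: str, packet_loss_rate: float) -> dict:
    if packet_loss_rate >= LOSS_THRESHOLD:
        # Poor connection: skip S310/S330, instruct offline generation (S340).
        return {"type": "instruction", "action": "generate_audio_offline"}
    audio = synthesize(text)                                  # S310
    bitrate = 64_000 if packet_loss_rate < 0.05 else 32_000   # S320 (assumed rates)
    return {"type": "audio", "payload": compress(audio, bitrate)}  # S330/S340
```

The key design point is that the quality decision (bitrate) and the delivery decision (online vs. offline) both hang off the same packet loss measurement.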
Fig. 4 shows a flow diagram of a speech synthesis method according to an embodiment of the invention. The speech synthesis method may be performed on the client side as shown in fig. 1.
As shown in fig. 4, in step S410, a network connection state between the client and the server or a parameter that can be used to determine the network connection state is sent to the server.
The parameter that can be used to determine the network connection status may be a packet loss rate parameter. The client can calculate and update the packet loss rate parameter stored locally at the client based on the statistical information of the previously received audio data packets.
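As one hedged example of how such statistics might be turned into a loss rate, a client could compare the sequence numbers it actually received against the range it should have received; the sequence-number scheme here is an assumption, not something the patent specifies:

```python
# Illustrative loss-rate estimate from received packet sequence numbers
# (assumes packets carry monotonically increasing sequence numbers).

def packet_loss_rate(received_seqs: list) -> float:
    """Fraction of packets missing from the observed sequence-number range."""
    if not received_seqs:
        return 0.0
    expected = max(received_seqs) - min(received_seqs) + 1
    return 1.0 - len(set(received_seqs)) / expected

print(packet_loss_rate([1, 2, 3, 5, 6, 8, 9, 10]))  # seq 4 and 7 missing
```

The client would recompute this over a sliding window of recent packets and include the updated value in its next client data packet.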
Before step S410, the client may determine the network connection status based on the packet loss rate parameter, and then send the network connection status to the server in step S410. Alternatively, in step S410, the client may also send the packet loss rate parameter to the server, so that the server may determine the network connection state between the client and the server based on the packet loss rate parameter.
In step S420, the text data input by the user is transmitted to the server.
The client may send the text data together with the network connection state, or together with the related parameter that can be used to determine the network connection state. In a preferred embodiment, the client packages the text data input by the user and the network connection state (or the related parameter) into a client data packet and sends the packet to the server, so that speech synthesis can be performed on the server side.
In step S430, the client receives the audio data packet from the server and performs voice playing based on the audio data packet. The audio data can be played directly at the client, or stored, or sent to another device for playing.
The audio data packet may include audio data corresponding to the text data, and a compression rate of the audio data packet is related to the network connection status or the related parameter that can be used to determine the network connection status. For example, the compression code rate may be determined according to the network connection status or determined according to the related parameter.
In addition, to ensure smooth playback of the audio data for the user, the client can adaptively and dynamically adjust the size of its local jitter buffer according to the packet loss rate parameter.
Because the packet loss rate, the delay, and the jitter buffer are interrelated, the size of the client's local jitter buffer can be dynamically adjusted by combining the packet loss rate with a preset delay threshold. This variable-size jitter buffer at the client ensures both the continuity of audio playback and low playback latency.
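A minimal sketch of this adaptive adjustment, assuming a frame-based buffer, a preset delay ceiling, and illustrative loss thresholds (none of these constants come from the patent):

```python
# Hypothetical jitter-buffer sizing: deepen the buffer under loss, shrink it
# on a clean network, and never exceed the preset delay threshold.

MAX_DELAY_MS = 400   # preset delay threshold (assumed)
FRAME_MS = 20        # playout duration of one buffered packet (assumed)

def adjust_jitter_buffer(current_frames: int, loss: float) -> int:
    """Return the new jitter-buffer depth, in frames."""
    max_frames = MAX_DELAY_MS // FRAME_MS   # delay ceiling in frames
    if loss > 0.1:
        target = current_frames + 2   # more loss -> buffer deeper
    elif loss < 0.02:
        target = current_frames - 1   # clean network -> shrink for low latency
    else:
        target = current_frames       # stable conditions -> hold
    return max(1, min(target, max_frames))
```

The clamp against `max_frames` is what keeps the loss/latency trade-off bounded: no matter how lossy the network, buffered delay never exceeds the preset threshold.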
In addition, to ensure the continuity of audio playback, the client also introduces a decoding buffer: decoded audio data is placed into the decoding buffer, and audio data is then fetched from it for playing. The decoding buffer prevents playback stalls on low-performance machines where decoding is too slow, and avoids decoding jitter (inconsistent delay of decoded data) when decoding efficiency drops because other processes occupy the CPU.
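The decoding buffer can be sketched as a bounded queue between the decoder and the playout path; the queue depth and method names are illustrative assumptions:

```python
# Minimal sketch of a decoding buffer: the decoder deposits PCM frames, the
# playout path drains them, so momentary decoding slowdowns do not stall
# playback. Depth of 8 frames is an assumed value.

from collections import deque

class DecodeBuffer:
    def __init__(self, depth: int = 8):
        # When full, deque(maxlen=...) drops the oldest frame on append;
        # a real player might instead back-pressure the decoder.
        self.frames = deque(maxlen=depth)

    def push_decoded(self, frame: bytes) -> None:
        self.frames.append(frame)     # called from the decoder thread

    def pop_for_playback(self):
        # Returns None on underrun; the player would then insert silence.
        return self.frames.popleft() if self.frames else None
```

As long as the decoder stays a few frames ahead on average, short CPU-contention spikes are absorbed by the queued frames rather than heard as gaps.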
In addition, the client can also implement an offline speech synthesis function locally, and offline synthesis is preferred when the network condition is poor. This avoids the network delay, network jitter, and packet loss caused by a poor network connection, so that the audio data corresponding to the user's text input can still be played smoothly at the client.
Thus, after determining the network connection state between the client and the server, the client may execute steps S420 and S430 only when the network connection state is good. When the network connection state is poor, the client can perform text-to-speech conversion on the text data locally, without sending the text data to the server.
In addition, the client can also receive control instructions from the server: in response to an instruction from the server to generate the audio data offline, the client performs text-to-speech conversion on the text data locally.
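Combining the two offline triggers described above, a client-side decision could look like the following sketch; `local_tts`, the loss threshold, and the instruction string are hypothetical:

```python
# Illustrative client fallback: synthesize locally when the connection is poor
# or when the server explicitly instructs offline generation.

LOSS_THRESHOLD = 0.2  # assumed "poor connection" cutoff

def local_tts(text: str) -> str:
    # Stand-in for an embedded offline TTS engine.
    return f"offline_audio({text})"

def client_synthesize(text: str, loss: float, server_instruction=None) -> str:
    if loss >= LOSS_THRESHOLD or server_instruction == "generate_audio_offline":
        return local_tts(text)          # offline path, no round trip
    return "request_sent_to_server"     # online path (steps S420/S430)
```

Either trigger, local measurement or server instruction, routes to the same offline path, so the client degrades gracefully regardless of which side detects the poor connection first.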
The speech synthesis methods executed at the server and at the client have now been described in detail with reference to fig. 3 and fig. 4, respectively. These methods adapt to different network conditions and ensure that the user obtains smooth playback of the speech data.
Fig. 5 shows a schematic structural diagram of a server according to an embodiment of the present invention. The server may be used to implement the speech synthesis method as shown in fig. 3.
As shown in fig. 5, the server 500 for speech synthesis of the present invention may include a first text-to-speech conversion unit 510, a code rate determination unit 520, a compression unit 530, and a first transmission unit 540.
The first text-to-speech conversion unit 510 may perform text-to-speech conversion processing on text data in a client data packet from the client, so as to obtain audio data corresponding to the text data.
The code rate determining unit 520 may determine a code rate for compressing the audio data based on a network connection state between the client and the server.
The compressing unit 530 may perform compression processing on the audio data based on the code rate to obtain a compressed audio data packet.
The first transmission unit 540 may return the audio data packet to the client.
In a preferred embodiment, the server 500 may further include a first network status determining unit (not shown).
The first network state determination unit may determine the network connection state based on a packet loss rate parameter from the client. The code rate determining unit 520 may determine the code rate for compressing the audio data based on the packet loss rate parameter when the network connection status is good.
In a preferred embodiment, the server 500 may further include an instruction control unit (not shown). The instruction control unit is capable of generating an instruction to generate audio data offline in a case where the network connection state is poor. In the case of a poor network connection state, the first text-to-speech conversion unit 510 does not perform text-to-speech conversion processing on the text data, or the compression unit 530 does not perform compression processing on the audio data, and the first transmission unit 540 sends an instruction for generating the audio data offline to the client.
Fig. 6 shows a schematic structural diagram of a client according to an embodiment of the present invention. The client is used to implement the speech synthesis method as shown in fig. 4.
As shown in fig. 6, the client 600 for speech synthesis of the present invention may include a second transmission unit 610, a third transmission unit 620, and a fourth transmission unit 630.
The second transmission unit 610 can transmit the network connection state between the client and the server or parameters that can be used to determine the network connection state to the server. In one embodiment, the relevant parameter that can be used to determine the network connection status may be a packet loss rate parameter.
The third transmission unit 620 can transmit the text data input by the user to the server.
The fourth transmission unit 630 is capable of receiving an audio data packet from the server, where the audio data packet includes audio data corresponding to the text data, and a compression code rate of the audio data packet is related to the network connection status.
In a preferred embodiment, the second transmission unit 610, the third transmission unit 620, and the fourth transmission unit 630 may be implemented as one multiplexed module, for example the client-side network module shown in fig. 2.
In a preferred embodiment, the client 600 may further include a computing unit (not shown). The calculating unit may calculate and update the packet loss rate parameter according to the statistical information of the received audio data packet.
In a preferred embodiment, the client 600 may further include a second network status determining unit (not shown in the figure). The second network state determination unit may be capable of determining the network connection state based on a packet loss rate parameter. When the network connection state is good, the third transmission unit 620 sends the text data to the server.
In a preferred embodiment, the client 600 may further include a second text-to-speech conversion unit (not shown). The second text-to-speech conversion unit can perform text-to-speech conversion processing on the text data locally at the client under the condition of poor network connection state. Or, in response to receiving an instruction for generating audio data offline from the server, the second text-to-speech conversion unit can perform text-to-speech conversion processing on the text data locally at the client.
In a preferred embodiment, the client 600 may further include a jitter buffer adjustment unit (not shown). The jitter buffer adjustment unit can dynamically adjust the size of the local jitter buffer according to the packet loss rate parameter.
The server and the client for performing speech synthesis according to the present invention have now been described in detail with reference to fig. 5 and fig. 6. For complex and changeable network environments, the invention introduces a series of mechanisms, such as network condition monitoring, network congestion control, and dynamic adjustment of local buffers, to keep the whole speech synthesis process smooth. This avoids delay and stalling caused by network changes, ensures the timeliness and continuity of audio playback at the client, and improves the user experience.
Fig. 7 is a schematic structural diagram of a computing device that can be used to implement the above-described speech synthesis method according to an embodiment of the present invention.
Referring to fig. 7, computing device 700 includes memory 710 and processor 720.
Processor 720 may be a multi-core processor or may include multiple processors. In some embodiments, processor 720 may include a general-purpose host processor and one or more special purpose coprocessors such as a Graphics Processor (GPU), Digital Signal Processor (DSP), or the like. In some embodiments, processor 720 may be implemented using custom circuits, such as an Application Specific Integrated Circuit (ASIC) or a Field Programmable Gate Array (FPGA).
The memory 710 may include various types of storage units, such as system memory, read-only memory (ROM), and permanent storage. The ROM may store static data or instructions required by the processor 720 or other modules of the computer. The permanent storage device may be a readable and writable, non-volatile storage device that does not lose stored instructions and data even after the computer is powered off. In some embodiments, a mass storage device (e.g., a magnetic or optical disk, or flash memory) is employed as the permanent storage. In other embodiments, the permanent storage may be a removable storage device (e.g., a floppy disk or an optical drive). The system memory may be a readable and writable volatile memory device, such as dynamic random access memory, and may store instructions and data that some or all of the processors require at runtime. In addition, the memory 710 may include any combination of computer-readable storage media, including various types of semiconductor memory chips (DRAM, SRAM, SDRAM, flash memory, programmable read-only memory), and magnetic and/or optical disks. In some embodiments, the memory 710 may include a readable and/or writable removable storage device, such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual-layer DVD-ROM), a read-only Blu-ray disc, an ultra-density disc, a flash memory card (e.g., an SD card, a mini SD card, a Micro-SD card, etc.), or a magnetic floppy disk. Computer-readable storage media do not contain carrier waves or transitory electronic signals transmitted wirelessly or by wire.
The memory 710 stores executable code that, when executed by the processor 720, causes the processor 720 to perform the speech synthesis methods described above.
The speech synthesis scheme according to the invention has been described in detail above with reference to the accompanying drawings.
Furthermore, the method according to the invention may also be implemented as a computer program or computer program product comprising computer program code instructions for carrying out the above-mentioned steps defined in the above-mentioned method of the invention.
Alternatively, the invention may also be embodied as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) having stored thereon executable code (or a computer program, or computer instruction code) which, when executed by a processor of an electronic device (or computing device, server, etc.), causes the processor to perform the steps of the above-described method according to the invention.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (25)
1. A speech synthesis method performed at a server, comprising:
performing text-to-speech conversion processing on text data in a client data packet from a client to obtain audio data corresponding to the text data;
determining a code rate for compressing the audio data based on a network connection state between the client and the server;
compressing the audio data based on the code rate to obtain an audio data packet subjected to compression processing;
and returning the audio data packet to the client.
2. The method of claim 1, further comprising:
and determining the network connection state based on the packet loss rate parameter from the client.
3. The method of claim 2, wherein the determining a code rate for compressing the audio data comprises:
and under the condition that the network connection state is good, determining a code rate for compressing the audio data based on the packet loss rate parameter.
4. The method according to claim 2, wherein, in the case of a poor network connection state, the server performs neither text-to-speech conversion processing on the text data nor compression processing on the audio data, and sends an instruction to generate audio data offline to the client.
5. The method of claim 2, wherein,
the packet loss rate parameter is calculated by the client according to statistical information of the previously received audio data packet.
6. A server for speech synthesis, comprising:
the first text-to-speech conversion unit is used for performing text-to-speech conversion processing on text data in a client data packet from a client to obtain audio data corresponding to the text data;
a code rate determining unit, configured to determine a code rate for compressing the audio data based on a network connection state between the client and the server;
the compression unit is used for compressing the audio data based on the code rate to obtain an audio data packet after compression; and
and the first transmission unit is used for returning the audio data packet to the client.
7. The server of claim 6, further comprising:
a first network state determining unit, configured to determine the network connection state based on a packet loss rate parameter from the client.
8. The server according to claim 7, wherein,
and the code rate determining unit determines the code rate for compressing the audio data based on the packet loss rate parameter under the condition that the network connection state is good.
9. The server of claim 7, further comprising:
an instruction control unit for generating an instruction to generate audio data offline in a case where the network connection state is poor,
under the condition that the network connection state is poor, the first text-to-speech conversion unit does not perform text-to-speech conversion processing on text data, or the compression unit does not perform compression processing on the audio data, and the first transmission unit sends an instruction for generating the audio data offline to the client.
10. A speech synthesis method performed at a client, comprising:
sending, to a server, a network connection state between a client and the server or a parameter that can be used to determine the network connection state;
sending text data input by a user to the server; and
and receiving an audio data packet from the server, wherein the audio data packet comprises audio data corresponding to the text data, and the compression code rate of the audio data packet is related to the network connection state.
11. The method of claim 10, further comprising:
and determining the network connection state based on the packet loss rate parameter.
12. The method of claim 11, wherein,
and sending the text data to the server under the condition that the network connection state is good.
13. The method of claim 12, further comprising:
under the condition that the network connection state is poor, text-to-speech conversion processing is performed on the text data locally at the client; or
responding to an instruction of generating audio data offline from the server, and locally performing text-to-speech conversion processing on the text data at the client.
14. The method of claim 11, wherein,
the parameter that can be used to determine the network connection status is a packet loss rate parameter.
15. The method according to any of claims 11-14, further comprising:
and calculating and updating the packet loss rate parameter based on the statistical information of the received audio data packet.
16. The method of claim 15, further comprising:
and dynamically adjusting the size of a local jitter buffer area according to the packet loss rate parameter.
17. A client for speech synthesis, comprising:
the second transmission unit is used for sending, to the server, the network connection state between the client and the server or a parameter that can be used to determine the network connection state;
the third transmission unit is used for transmitting text data input by a user to the server; and
the fourth transmission unit is used for receiving an audio data packet from the server, wherein the audio data packet comprises audio data corresponding to the text data, and the compression code rate of the audio data packet is related to the network connection state.
18. The client of claim 17, further comprising:
and the second network state determining unit is used for determining the network connection state based on the packet loss rate parameter.
19. The client according to claim 18, wherein,
and under the condition that the network connection state is good, the third transmission unit sends the text data to the server.
20. The client of claim 18, further comprising:
the second text-to-speech conversion unit is used for performing text-to-speech conversion processing on the text data locally at the client under the condition that the network connection state is poor; or
in response to receiving an instruction of generating audio data offline from the server, the second text-to-speech conversion unit performs text-to-speech conversion processing on the text data locally at the client.
21. The client according to claim 18, wherein,
the parameter that can be used to determine the network connection status is a packet loss rate parameter.
22. The client according to any of claims 18-21, further comprising:
and the calculating unit is used for calculating and updating the packet loss rate parameter based on the statistical information of the received audio data packet.
23. The client of claim 22, further comprising:
and the jitter buffer adjusting unit is used for dynamically adjusting the size of the local jitter buffer according to the packet loss rate parameter.
24. A computing device, comprising:
a processor; and
a memory having executable code stored thereon, which when executed by the processor, causes the processor to perform the method of any of claims 1-5, 10-16.
25. A non-transitory machine-readable storage medium having stored thereon executable code, which when executed by a processor of an electronic device, causes the processor to perform the method of any of claims 1-5, 10-16.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811270533.6A CN111105778A (en) | 2018-10-29 | 2018-10-29 | Speech synthesis method, speech synthesis device, computing equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111105778A true CN111105778A (en) | 2020-05-05 |
Family
ID=70419671
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811270533.6A Pending CN111105778A (en) | 2018-10-29 | 2018-10-29 | Speech synthesis method, speech synthesis device, computing equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111105778A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112202803A (en) * | 2020-10-10 | 2021-01-08 | 北京字节跳动网络技术有限公司 | Audio processing method, device, terminal and storage medium |
CN112562638A (en) * | 2020-11-26 | 2021-03-26 | 北京达佳互联信息技术有限公司 | Voice preview method and device and electronic equipment |
CN113035205A (en) * | 2020-12-28 | 2021-06-25 | 阿里巴巴(中国)有限公司 | Audio packet loss compensation processing method and device and electronic equipment |
CN114785772A (en) * | 2022-04-27 | 2022-07-22 | 广州宸祺出行科技有限公司 | Method and device for downloading network car booking audio with corresponding code rate based on download rate |
EP4071752A4 (en) * | 2019-12-30 | 2023-01-18 | Huawei Technologies Co., Ltd. | Text-to-voice processing method, terminal and server |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1599352A (en) * | 2003-09-17 | 2005-03-23 | 上海贝尔阿尔卡特股份有限公司 | Regulating method of adaptive scillation buffer zone of packet switching network |
CN101022369A (en) * | 2007-03-23 | 2007-08-22 | 中山大学 | End-to-end quening time delay measuring method |
CN101119338A (en) * | 2007-09-20 | 2008-02-06 | 腾讯科技(深圳)有限公司 | Network voice communication method, system, device and instant communication terminal |
CN104702579A (en) * | 2013-12-09 | 2015-06-10 | 华为技术有限公司 | Method and device used for determining cache state of user equipment |
CN105530449A (en) * | 2014-09-30 | 2016-04-27 | 阿里巴巴集团控股有限公司 | Coding parameter adjusting method and device |
CN105610635A (en) * | 2016-02-29 | 2016-05-25 | 腾讯科技(深圳)有限公司 | Voice code transmitting method and apparatus |
CN106210925A (en) * | 2015-05-05 | 2016-12-07 | 阿里巴巴集团控股有限公司 | The decoding method of a kind of real-time media stream and device |
CN106412032A (en) * | 2016-09-14 | 2017-02-15 | 安徽声讯信息技术有限公司 | Remote audio character transmission method and system |
CN106452663A (en) * | 2015-08-11 | 2017-02-22 | 阿里巴巴集团控股有限公司 | Network communication data transmission method based on RTP protocol, and communication equipment |
CN107274884A (en) * | 2017-02-15 | 2017-10-20 | 赵思聪 | A kind of information acquisition method based on text resolution and phonetic synthesis |
CN107979482A (en) * | 2016-10-25 | 2018-05-01 | 腾讯科技(深圳)有限公司 | A kind of information processing method, device, transmitting terminal, debounce moved end, receiving terminal |
Non-Patent Citations (2)
Title |
---|
李君斌;金心宇;张昱;: "基于NS-2低丢包率自适应多速率VoIP系统的QoS研究", no. 04 * |
贺昕,李斌编著: "《异构无线网络切换技术》", 北京:北京邮电大学出版社, pages: 249 - 251 * |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200505 |