CN109587042B

CN109587042B - Voice conversion communication terminal

Info

Publication number: CN109587042B
Application number: CN201811620039.8A
Authority: CN
Inventors: 王碧芳; 李雪; 张帆
Original assignee: Wuhan Polytechnic University
Current assignee: Shanghai Zhize Communication Services Co., Ltd.
Priority date: 2018-12-28
Filing date: 2018-12-28
Publication date: 2022-01-21
Anticipated expiration: 2038-12-28
Also published as: CN109587042A

Abstract

The invention discloses a voice conversion communication terminal comprising: a voice encoder that generates voice data and message data from an original voice, combines the voice data and the message data into combined data, and transmits the combined data to a data transceiving channel; a voice decoder that, after receiving the combined data from the data transceiving channel, separates the voice data and the message data from the combined data and reconstructs the separated voice data and message data to obtain redesigned combined data; and a data transceiving channel that receives the combined data from the voice encoder and passes it to the voice decoder. The invention converts the voice type when a user sends voice through the voice conversion communication terminal, and meets the user's customization requirement for the type of voice information to be sent.

Description

Voice conversion communication terminal

Technical Field

The invention relates to the field of communication, in particular to a voice conversion communication terminal.

Background

Speech recognition technology is developing rapidly and has been applied in many technical fields, for example to identification on personal computers and mobile phone terminals. To date it has mostly been applied on single electronic terminals; as internet technology matures, applying speech recognition to Web pages to make internet access more convenient for users has very broad prospects. While voice recognition is being adopted as a protection means, users' demand for customizing the voice information they send keeps growing: a user may want a voice file to be rendered at the receiving end in one of several different voice types. A new scheme is therefore needed to convert the voice type when a user sends voice.

Disclosure of Invention

The invention aims to provide a voice conversion communication terminal that meets a user's customization requirement for the type of voice information to be sent.

To solve the above problem, the present invention provides a voice conversion communication terminal comprising: a voice encoder that generates voice data and message data from an original voice, combines the voice data and the message data to generate combined data, and transmits the combined data to a data transceiving channel; a voice decoder that, after receiving the combined data from the data transceiving channel, separates the voice data and the message data from the combined data and reconstructs the separated voice data and message data to obtain redesigned combined data; and a data transceiving channel that receives the combined data from the voice encoder and passes it to the voice decoder.

The voice encoder comprises: a message data generation unit that extracts and generates message data from the original voice; a voice data generation unit that extracts and generates voice data from the original voice; and a combined data generation unit that integrates the message data and the voice data to generate combined data.

The voice decoder comprises: a combined data separation unit that separates voice data and message data from the combined data; and a data reconstruction unit that reconstructs the separated voice data and message data to generate redesigned combined data.

The voice data is the data corresponding to the original voice; the message data includes the voice type; and the voice data and the message data have a one-to-one mapping relation.
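The relationship between voice data, message data, and combined data can be illustrated with a minimal sketch. The patent does not specify any wire format, so everything concrete here is an assumption made for illustration: the length-prefixed JSON header, the field names `voice_id` and `voice_type`, and the helper functions `combine` and `separate`.

```python
import json
import struct
from typing import Tuple


def combine(message: dict, voice: bytes) -> bytes:
    """Pack message data and voice data into one combined-data blob.

    Assumed layout (not from the patent): a 4-byte big-endian header
    length, the JSON-encoded message data, then the raw voice bytes.
    """
    header = json.dumps(message, sort_keys=True).encode("utf-8")
    return struct.pack(">I", len(header)) + header + voice


def separate(combined: bytes) -> Tuple[dict, bytes]:
    """Split combined data back into (message data, voice data)."""
    (header_len,) = struct.unpack(">I", combined[:4])
    header = combined[4:4 + header_len]
    voice = combined[4 + header_len:]
    return json.loads(header.decode("utf-8")), voice


# Hypothetical example values: one message record per voice clip.
message = {"voice_id": "msg-001", "voice_type": "A"}
voice = b"\x00\x01\x02\x03"
combined = combine(message, voice)
recovered_message, recovered_voice = separate(combined)
```

Because each blob carries exactly one header and one voice payload, the one-to-one mapping between message data and voice data is preserved through the round trip.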

The voice font of the original voice is different from the voice font of the redesigned combined data.

The voice conversion communication terminal selects a voice type through the voice server.

The separated voice data and message data are sent separately: the message data is sent to a multimedia message service center, and the voice data is sent to a voice server, which records the voice type.

The beneficial effects of the invention are as follows: unlike the prior art, the invention provides a voice conversion communication terminal that converts the voice type when a user sends voice and meets the user's customization requirement for the type of voice information to be sent.

Drawings

Fig. 1 is a schematic structural diagram of an embodiment of a voice conversion communication terminal according to the present invention;

Fig. 2 is a system diagram of an embodiment of a voice conversion communication terminal according to the present invention;

Fig. 3 is a system flow diagram of an embodiment of a voice conversion communication terminal according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the present invention.

Referring to Fig. 1, which is a schematic structural diagram of an embodiment of the voice conversion communication terminal of the present invention: 110 is a data transceiving channel, 120 is a voice decoder, 121 is a data reconstruction unit, 122 is a combined data separation unit, 130 is a voice encoder, 131 is a message data generation unit, 132 is a voice data generation unit, and 133 is a combined data generation unit. The voice conversion communication terminal includes the data transceiving channel 110, the voice decoder 120 and the voice encoder 130; the voice decoder 120 includes the data reconstruction unit 121 and the combined data separation unit 122, and the voice encoder 130 includes the message data generation unit 131, the voice data generation unit 132 and the combined data generation unit 133.

Specifically, the data transceiving channel 110 provides the data delivery service by inputting and outputting data: it receives the combined data from the voice encoder 130 and delivers it to the voice decoder 120, and it can display the contents of the input and output messages. In the voice encoder 130, the voice data generation unit 132 converts the original voice into voice data, the message data generation unit 131 extracts and generates message data, and the combined data generation unit 133 combines the voice data and the message data into combined data. In the voice decoder 120, the combined data separation unit 122 separates the combined data into voice data and message data, and the data reconstruction unit 121 reconstructs the separated voice data and message data to obtain redesigned combined data. The voice data corresponds to the original voice file, the message data includes the voice type, and each item of voice data has exactly one corresponding item of message data.
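As a rough illustration of the role of the data transceiving channel 110, the sketch below models it as an in-memory first-in, first-out queue that also keeps a readable record of traffic, standing in for the ability to display input and output messages. The class name `DataTransceivingChannel`, its methods, and the FIFO behaviour are assumptions; the patent does not describe the transport.

```python
from collections import deque


class DataTransceivingChannel:
    """In-memory stand-in for channel 110 (assumption: FIFO delivery)."""

    def __init__(self) -> None:
        self._queue = deque()
        # Human-readable record of traffic, standing in for the display.
        self.log = []

    def send(self, combined: bytes) -> None:
        """Accept combined data from the voice encoder."""
        self._queue.append(combined)
        self.log.append(f"in: {len(combined)} bytes")

    def receive(self) -> bytes:
        """Deliver the oldest pending combined data to the voice decoder."""
        combined = self._queue.popleft()
        self.log.append(f"out: {len(combined)} bytes")
        return combined


channel = DataTransceivingChannel()
channel.send(b"combined-data-1")
channel.send(b"combined-data-2")
first = channel.receive()  # the encoder's first blob reaches the decoder first
```

A real terminal would replace the queue with a network or radio link; the point of the sketch is only that the channel carries combined data unchanged from encoder to decoder.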

The operation of the voice conversion communication terminal is further described with reference to Fig. 2 and Fig. 3. Fig. 2 is a system schematic diagram of an embodiment of the voice conversion communication terminal of the present invention, in which 201 is a first communication terminal, 202 is a second communication terminal, 203 is a voice server, and 204 is a multimedia message service center; Fig. 3 is a system flow diagram of the embodiment. The first communication terminal 201 and the second communication terminal 202 are both voice conversion communication terminals as described above, with the same structure and functions. In this embodiment, the workflow of the voice conversion communication terminal is as follows:

S101: The first communication terminal generates message data and voice data from the original voice and combines them to generate combined data. In this step, in the voice encoder 130 of the first communication terminal 201, the message data generation unit 131 generates message data from the original voice, the voice data generation unit 132 generates voice data, and the combined data generation unit 133 generates combined data from the message data and the voice data. The voice data is the data corresponding to the original voice file, the message data includes the voice type, and the voice data and the message data have a one-to-one mapping relation.

S102: After the voice type is selected, the combined data is separated, the separated voice data is sent to the voice server, and the voice server records the voice type. In this step, after the type of voice to be transmitted is selected from the voice server 203, the combined data separation unit 122 in the first communication terminal 201 separates the combined data of the voice to be transmitted into voice data and message data; the voice data of that type is transmitted to the voice server 203, and the voice server 203 records the type of the voice to be transmitted. The voice server provides different types of voice, which the user of the first communication terminal 201 can select or download; the voice types can be distinguished by parameters such as audio frequency. In one specific embodiment, the user achieves a voice-changing effect by selecting or downloading different types of voice in the voice server 203.
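The voice server's part in this step can be sketched as a small in-memory store. This is only an illustration under stated assumptions: the class `VoiceServer`, its method names, the `voice_id` key, and the catalog that maps each voice type to a pitch ratio are all hypothetical; the patent says only that voice types can be distinguished by parameters such as audio frequency.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceServer:
    """Toy voice server 203: offers voice types and stores uploaded voice data."""

    # Hypothetical catalog: voice type -> pitch ratio relative to the original.
    catalog: dict = field(default_factory=lambda: {"A": 1.0, "B": 1.5, "C": 0.75})
    _store: dict = field(default_factory=dict)

    def available_types(self) -> list:
        """List the voice types a sender may select or download."""
        return sorted(self.catalog)

    def upload(self, voice_id: str, voice_data: bytes, voice_type: str) -> None:
        """Store voice data and record the voice type selected by the sender."""
        if voice_type not in self.catalog:
            raise ValueError(f"unknown voice type: {voice_type}")
        self._store[voice_id] = (voice_data, voice_type)

    def download(self, voice_id: str) -> tuple:
        """Return (voice data, recorded voice type) for a given identifier."""
        return self._store[voice_id]


server = VoiceServer()
server.upload("msg-001", b"\x00\x01", voice_type="B")  # sender side
voice_data, voice_type = server.download("msg-001")    # receiver side
```

Keying the store by an identifier that also travels in the message data is one way a receiving terminal could locate the matching voice data later.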

S103: The first communication terminal sends the separated message data to the second communication terminal through the multimedia message service center. In this step, the first communication terminal 201 sends the message data of the voice type to be sent to the second communication terminal 202 through the multimedia message service center 204; the second communication terminal 202 uses the received message data to download the corresponding voice data from the voice server 203.

S104: The second communication terminal downloads from the voice server the voice data corresponding to the received message data. In this step, the second communication terminal 202 downloads from the voice server 203 the voice data that corresponds to the received message data.

S105: The second communication terminal reconstructs the received message data and voice data to obtain the redesigned combined data. In this step, the data reconstruction unit 121 of the voice decoder 120 in the second communication terminal 202 reconstructs the received message data and voice data in the voice type selected by the first communication terminal 201 at the time of transmission, and provides the resulting redesigned combined data to the user; the voice font of the original voice is different from the voice font of the redesigned combined data. The redesigned combined data is thus generated from the original voice after it has been disassembled and converted in type. In a specific embodiment, the original voice of the user of the first communication terminal 201 is type A, and voice type B is selected during transmission; after steps S101 to S105, the user of the second communication terminal 202 receives combined data of voice type B, so that the voice type is converted when the user sends voice.
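The reconstruction in S105 can be pictured with a toy example. The patent does not state how the data reconstruction unit changes the voice type, so the nearest-neighbour resampling below is a stand-in technique chosen only because it alters audio frequency. The functions `resample` and `reconstruct`, the `PITCH_RATIO` table, and the sample values are all hypothetical.

```python
def resample(samples, ratio):
    """Nearest-neighbour resampling.

    A ratio above 1 raises the pitch on playback and shortens the clip;
    a ratio of 1 returns the samples unchanged.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    n_out = int(len(samples) / ratio)
    last = len(samples) - 1
    return [samples[min(int(i * ratio), last)] for i in range(n_out)]


# Hypothetical mapping from voice type to pitch ratio.
PITCH_RATIO = {"A": 1.0, "B": 2.0}


def reconstruct(message, samples):
    """Rebuild combined data in the voice type named by the message data."""
    ratio = PITCH_RATIO[message["voice_type"]]
    return {"message": message, "voice": resample(samples, ratio)}


original = [0, 1, 2, 3, 4, 5, 6, 7]                    # voice data recorded as type A
message = {"voice_id": "msg-001", "voice_type": "B"}   # sender selected type B
redesigned = reconstruct(message, original)
```

With these assumed values the receiver ends up with every second sample of the original, which is the type A to type B conversion of the embodiment in miniature. Naive resampling also changes the clip duration, which a practical voice changer would normally avoid.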

Unlike the prior art, the invention provides a voice conversion communication terminal that converts the voice type when a user sends voice and meets the user's customization requirement for the type of voice information to be sent.

It should be noted that the above embodiments belong to the same inventive concept; the description of each embodiment has its own emphasis, and where an embodiment is not described in detail, reference may be made to the descriptions of the other embodiments.

The embodiments above merely illustrate implementations of the present invention; although they are described specifically and in detail, they are not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of protection of the present invention. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims

1. A voice conversion communication terminal, comprising:

a voice encoder for generating voice data and message data from an original voice, combining the voice data and the message data to generate combined data, and transmitting the combined data to a data transceiving channel, wherein the voice data is the data corresponding to the original voice, the message data comprises the voice type, and the voice data and the message data have a one-to-one mapping relation;

a voice decoder that, after receiving the combined data from the data transceiving channel, separates voice data and message data from the combined data, and reconstructs the received voice data and message data in the voice type selected by another voice conversion terminal when it sent the voice data and the message data;

a data transceiving channel that receives the combined data from the voice encoder and transfers the combined data to the voice decoder;

wherein the voice encoder includes:

a message data generation unit which extracts and generates the message data from the original voice;

a voice data generation unit that extracts and generates the voice data from the original voice;

a combined data generating unit which integrates the message data and the voice data to generate the combined data;

the working process of the voice conversion communication terminal is as follows:

S101, a voice encoder of the voice conversion communication terminal generates message data and voice data from an original voice and combines them to generate combined data;

S102, the voice conversion communication terminal selects a voice type through a voice server, separates the combined data after the voice type is selected, and sends the separated voice data to the voice server, which records the voice type;

S103, the voice conversion communication terminal sends the separated message data to a second communication terminal through a multimedia message service center;

S104, the second communication terminal downloads the voice data corresponding to the received message data from the voice server;

S105, the second communication terminal reconstructs the received message data and voice data in the voice type selected by the voice conversion terminal when sending the message data and the voice data, obtains the redesigned combined data from the reconstruction, and provides the redesigned combined data to the user.

2. The voice conversion communication terminal according to claim 1, wherein the voice decoder comprises:

a combined data separating unit that separates voice data and message data from the combined data;

a data reconstruction unit that reconstructs the received voice data and message data in the voice type selected by the other voice conversion terminal when it sent the voice data and the message data.

3. The voice conversion communication terminal according to claim 1, wherein a voice genre of the original voice is different from a voice genre of the redesigned combined data.