CN116847023A - Auxiliary call method and device based on man-machine interaction - Google Patents
- Publication number
- CN116847023A (application number CN202311113593.8A)
- Authority
- CN
- China
- Prior art keywords
- terminal
- call
- voice
- text information
- opposite
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72475—User interfaces specially adapted for cordless or mobile telephones specially adapted for disabled users
- H04M1/72478—User interfaces specially adapted for cordless or mobile telephones specially adapted for disabled users for hearing-impaired users
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
- H04M1/7243—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
- H04M1/72433—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for voice messaging, e.g. dictaphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
- H04M1/7243—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
- H04M1/72436—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for text messaging, e.g. short messaging services [SMS] or e-mails
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Multimedia (AREA)
- Telephone Function (AREA)
Abstract
The application discloses an auxiliary call method and device based on man-machine interaction, belonging to the technical field of man-machine interaction. The method comprises the following steps: after a terminal establishes a call connection with an opposite call end based on a telephone network, acquiring first text information input on the terminal, converting the first text information into an audio signal, and transmitting the audio signal to the opposite call end; or acquiring a voice signal transmitted by the opposite call end, converting the voice signal into second text information, and displaying the second text information on the terminal. The application solves the technical problem that the related art cannot meet the voice-call needs of deaf-mute users.
Description
Technical Field
The application relates to the technical field of man-machine interaction, and in particular to an auxiliary call method and device based on man-machine interaction.
Background
In the related art, after a call is dialed on a communication product such as a mobile phone, the calling end collects voice through a microphone and transmits it to the called end, where the called user hears it through a loudspeaker; likewise, the voice of the called end is transmitted back to the calling end and played through the calling end's loudspeaker, so that the calling user can hear the called end.
However, deaf-mute users have speech and hearing impairments and can neither speak nor hear when making or receiving a call, so the related art cannot meet their voice-call needs.
In view of the above problems in the related art, no effective solution has been found yet.
Disclosure of Invention
The application provides an auxiliary call method and device based on man-machine interaction, to solve the technical problem that the related art cannot meet the voice-call needs of deaf-mute users.
According to an aspect of the embodiment of the application, there is provided an auxiliary call method based on man-machine interaction, including: after a terminal establishes call connection with a call opposite terminal based on a telephone network, acquiring first text information input by the terminal; converting the first text information into an audio signal, and transmitting the audio signal to the opposite call end; or, acquiring the voice signal transmitted by the opposite terminal, converting the voice signal into second text information, and displaying the second text information on the terminal.
Further, transmitting the audio signal to the call counterpart includes: adding a second connection path in a background sound scene on a first connection path of the terminal, wherein the second connection path is a connection path between a first front end FE and a rear end BE, and the first connection path is a connection path between the second front end FE and the rear end BE established when a call is connected; and sending the audio signal from the first FE to the back end BE, and sending the audio signal to the opposite call end through the back end BE.
Further, transmitting the audio signal to the call counterpart includes: invoking a first connection path between a second FE of the terminal and a rear end BE, and acquiring environmental audio data acquired by a microphone of the terminal by adopting the second FE; performing superposition and mixing processing on the environmental audio data and the audio signal to obtain superposition and mixing mixed audio data; and transmitting the mixed sound data to the opposite call end.
Further, performing superposition mixing on the environmental audio data and the audio signal to obtain the mixed audio data includes: acquiring synchronized frame data of the first connection path and the second connection path, wherein the first connection path carries the environmental audio data and the second connection path carries the audio signal; and performing superposition mixing on the synchronized frame data to obtain the mixed audio data.
Further, the step of obtaining the voice signal transmitted by the opposite terminal includes: the digital-to-analog converter of the terminal is adopted to convert the initial voice digital signal transmitted by the opposite terminal into the voice analog signal; a microphone of the terminal is adopted to acquire a voice analog signal output by the digital-to-analog converter; and carrying out analog-to-digital conversion on the voice analog signal by adopting an analog-to-digital converter of the microphone to obtain a target voice digital signal, wherein the input end of the analog-to-digital converter of the microphone is connected with the output end of the digital-to-analog converter.
Further, the step of obtaining the voice signal transmitted by the opposite terminal includes: creating a loop interface; and calling a sound card reading function corresponding to the loop interface to acquire the voice signal transmitted by the opposite call end.
Further, obtaining the first text information input by the terminal includes: acquiring a commonly used term instruction selected by the terminal; and taking the target commonly used term corresponding to the commonly used term instruction as the first text information.
According to another aspect of the embodiments of the present application, there is also provided an auxiliary call device based on man-machine interaction, comprising: an acquisition module, configured to acquire first text information input on the terminal after the terminal establishes a call connection with an opposite call end based on a telephone network; a transmission module, configured to convert the first text information into an audio signal and transmit the audio signal to the opposite call end; and a display module, configured to acquire the voice signal transmitted by the opposite call end, convert the voice signal into second text information, and display the second text information on the terminal.
Further, the transmission module includes an establishing unit, configured to add a second connection path in a background sound scenario on a first connection path of the terminal, where the second connection path is a connection path between a first front end FE and a back end BE, and the first connection path is a connection path between the second front end FE and the back end BE established when a call is connected; and sending the audio signal from the first FE to the back end BE, and sending the audio signal to the opposite call end through the back end BE.
Further, the transmission module further comprises a sound mixing unit, and the sound mixing unit is used for calling a first connection path between a second FE of the terminal and a rear end BE, and acquiring environmental audio data acquired by a microphone of the terminal by adopting the second FE; performing superposition and mixing processing on the environmental audio data and the audio signal to obtain superposition and mixing mixed audio data; and transmitting the mixed sound data to the opposite call end.
Further, the audio mixing unit is further configured to obtain synchronization frame data of a first connection path and a second connection path, where the first connection path includes the environmental audio data, and the second connection path includes the audio signal; and performing superposition and mixing processing on the synchronous frame data to obtain superposition and mixing mixed data.
Further, the display module comprises a first acquisition unit, configured to convert an initial voice digital signal transmitted by the opposite call end into a voice analog signal using the digital-to-analog converter of the terminal; to acquire the voice analog signal output by the digital-to-analog converter using the microphone of the terminal; and to perform analog-to-digital conversion on the voice analog signal using the analog-to-digital converter of the microphone to obtain a target voice digital signal, wherein the input of the microphone's analog-to-digital converter is connected to the output of the digital-to-analog converter.
Further, the display module comprises a second acquisition unit for creating a loop interface; and calling a sound card reading function corresponding to the loop interface to acquire the voice signal transmitted by the opposite call end.
Further, the acquisition module comprises a third acquisition unit for acquiring the commonly used term instruction selected by the terminal; and taking the target commonly used term corresponding to the commonly used term instruction as the first text information.
According to another aspect of the embodiments of the present application, there is also provided a storage medium including a stored program that performs the above steps when running.
According to another aspect of the embodiment of the present application, there is also provided an electronic device including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus; wherein: a memory for storing a computer program; and a processor for executing the steps of the method by running a program stored on the memory.
The electronic device provided by the embodiment of the application can be a module capable of realizing a communication function or a terminal device comprising the module, and the terminal device can be a mobile terminal or an intelligent terminal. The mobile terminal can be at least one of a mobile phone, a tablet computer, a notebook computer and the like; the intelligent terminal can be a terminal containing a wireless communication module, such as an intelligent automobile, an intelligent watch, a sharing bicycle, an intelligent cabinet and the like; the module may specifically be any one of a wireless communication module, such as a 2G communication module, a 3G communication module, a 4G communication module, a 5G communication module, and an NB-IOT communication module.
Embodiments of the present application also provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of the above method.
According to the application, after a call connection is established between the terminal and the opposite call end, the first text information input by the terminal user is acquired, converted into an audio signal and transmitted to the opposite call end; or the voice signal transmitted by the opposite call end is acquired, converted into second text information and displayed on the terminal. Through text-to-speech and speech-to-text conversion, the terminal user can hold a normal voice call with the opposite end simply by editing and reading text, without having to speak or listen.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a block diagram of the hardware structure of a mobile phone according to an embodiment of the present application;
FIG. 2 is a flow chart of an auxiliary call method based on man-machine interaction according to an embodiment of the application;
FIG. 3 is a schematic diagram of a call setup path initialization procedure according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a voice transmission and reception process according to an embodiment of the present application;
fig. 5 is a schematic diagram of a hardware tapping structure for capturing, in real time, the sound played by the system according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a scenario in which an auxiliary telephony device communicates with 119 according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a voice call path according to an embodiment of the present application;
fig. 8 is a block diagram of an auxiliary communication device based on man-machine interaction according to an embodiment of the present application.
Detailed Description
In order that those skilled in the art will better understand the present application, a technical solution in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
The method according to the first embodiment of the present application may be implemented in a mobile phone, a computer, a tablet or a similar computing device. Taking the operation on a mobile phone as an example, fig. 1 is a hardware structure block diagram of the mobile phone according to an embodiment of the present application. As shown in fig. 1, the handset may include one or more (only one is shown in fig. 1) processors 102 (the processor 102 may include, but is not limited to, a microprocessor MCU or a processing device such as a programmable logic device FPGA) and a memory 104 for storing data, and optionally a transmission device 106 for communication functions and an input-output device 108. It will be appreciated by those skilled in the art that the structure shown in fig. 1 is merely illustrative, and is not intended to limit the structure of the mobile phone. For example, the handset may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1.
The memory 104 may be used to store a computer program, for example, a software program of application software and a module, such as a computer program corresponding to an auxiliary call method in an embodiment of the present application, and the processor 102 executes the computer program stored in the memory 104 to perform various functional applications and data processing, that is, implement the method described above. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory remotely located with respect to the processor 102, which may be connected to the handset via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of a cell phone. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is configured to communicate with the internet wirelessly.
In this embodiment, an auxiliary call method based on man-machine interaction is provided, fig. 2 is a flowchart of an auxiliary call method based on man-machine interaction according to an embodiment of the present application, as shown in fig. 2, where the flowchart includes the following steps:
step S10, after a terminal establishes call connection with a call opposite terminal based on a telephone network, acquiring first text information input by the terminal; converting the first text information into an audio signal, and transmitting the audio signal to the opposite call end;
in this embodiment, the terminal includes a modem communication module, a text-to-speech module, a speech-to-text module, and the like. After the terminal and the opposite terminal are connected, the user can input text, call text-to-speech algorithm to convert text information input by the terminal into audio signal (pulse code modulation pcm code stream), and transmit the audio signal to the opposite terminal through the modem communication module, so that the opposite terminal can hear the sound to be expressed by the user. The terminal user may be a deaf-mute or a user who is inconvenient to speak in a special scene, and in this embodiment, the deaf-mute is taken as an example for illustration. The telephone network is a mobile communication network, the terminal and the call opposite terminal are in voice call based on the mobile communication network, and voice data in the call process are transmitted through the base station of the operator.
Step S20, or, acquiring a voice signal transmitted by the opposite terminal, converting the voice signal into second text information, and displaying the second text information on the terminal.
In this embodiment, the terminal may also acquire the voice signal transmitted by the opposite call end, which may be the audio of the remote user speaking into the opposite end. A speech-to-text algorithm converts the voice signal into text information, which is displayed on the terminal's display screen, so that the deaf-mute user can clearly follow what the remote user is saying; this enables fast voice communication between a deaf-mute user and a hearing user. In this embodiment, the voice signal transmitted by the opposite call end may also itself be an audio signal produced by text-to-speech conversion at the opposite end, enabling fast voice communication between two deaf-mute users.
Through the above steps, after the terminal establishes a call connection with the opposite call end based on the telephone network, the first text information input on the terminal is acquired, converted into an audio signal and transmitted to the opposite call end; or the voice signal transmitted by the opposite call end is acquired, converted into second text information and displayed on the terminal. By means of text-to-speech and speech-to-text conversion, the terminal user can meet the need for a normal voice call with the opposite end simply by editing and reading text, without having to speak or listen.
In one implementation of this embodiment, transmitting the audio signal to the call partner includes:
a1, adding a second connection path in a background sound scene on a first connection path of the terminal, wherein the second connection path is a connection path between a first front end FE and a rear end BE, and the first connection path is a connection path between the second front end FE and the rear end BE established when a call is connected;
a2, the audio signal is sent to the back end BE from the first FE, and the audio signal is sent to the opposite call end through the BE.
When the terminal establishes a call connection with the opposite call end, a first connection path for the voice-call scenario is set up; this first connection path is a connection path between a second front end (FE) and the back end (BE). The first connection path is invoked, the second FE acquires the environmental audio data collected by the terminal's microphone, and the back end BE processes this environmental audio as in a normal voice-call scenario and outputs it to the opposite call end through the modem communication module. Because the terminal user cannot speak, the environmental audio data does not contain the user's voice. Therefore, in this embodiment, after the call is connected a new call session can be created by adding a second connection path between a first FE and the back end BE in the background-sound (incall_music) scenario: the first FE acquires the audio signal obtained by converting the first text information to speech, sends it to the back end BE, and the back end BE sends it to the opposite call end.
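The sketch below illustrates, from user space, what pushing the converted audio signal into such a second connection path could look like. It is only a sketch under assumptions: the device name "hw:0,2" standing in for the incall_music front end is hypothetical and platform-specific, the 16 kHz mono format matches the earlier assumption, and the routing of this FE to the voice back end (normally configured through mixer controls) is assumed to be in place already.

```c
#include <alsa/asoundlib.h>
#include <stdint.h>

/* Write a text-to-speech PCM stream to the front end that the
 * platform routes to the voice back end (incall_music scenario).
 * "hw:0,2" is a placeholder; the actual FE PCM node differs per
 * SoC and must be looked up in the platform audio configuration. */
static int send_tts_to_uplink(const int16_t *pcm, size_t frames)
{
    snd_pcm_t *fe = NULL;
    int err = snd_pcm_open(&fe, "hw:0,2", SND_PCM_STREAM_PLAYBACK, 0);
    if (err < 0)
        return err;

    /* 16-bit mono 16 kHz interleaved PCM, 500 ms internal latency */
    err = snd_pcm_set_params(fe, SND_PCM_FORMAT_S16_LE,
                             SND_PCM_ACCESS_RW_INTERLEAVED,
                             1, 16000, 1, 500000);
    if (err < 0) {
        snd_pcm_close(fe);
        return err;
    }

    while (frames > 0) {
        snd_pcm_sframes_t n = snd_pcm_writei(fe, pcm, frames);
        if (n == -EPIPE)                 /* underrun: recover and retry */
            n = snd_pcm_recover(fe, (int)n, 1) ? -1 : 0;
        if (n < 0)
            break;
        pcm    += n;                     /* mono: one sample per frame */
        frames -= (size_t)n;
    }

    snd_pcm_drain(fe);
    snd_pcm_close(fe);
    return 0;
}
```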
In another implementation of this embodiment, transmitting the audio signal to the call partner includes:
b1, calling a first connection path between a second FE of the terminal and a rear end BE, and acquiring environmental audio data acquired by a microphone of the terminal by adopting the second FE;
b2, performing superposition and mixing processing on the environmental audio data and the audio signal to obtain superposition and mixing mixed audio data;
and b3, transmitting the mixed data to the opposite call end.
The first connection path between the second front end FE and the back end BE, established when the terminal set up the call connection with the opposite call end, is invoked, and the second FE acquires the environmental audio data collected by the terminal's microphone; at the same time, a second connection path between the first FE and the back end BE is added in the background-sound scenario, and the first FE acquires the audio signal obtained by converting the first text information to speech. The environmental audio data collected by the microphone and the text-converted audio signal are superposed and mixed to obtain mixed audio data, which is passed to the modem communication module and thereby sent to the opposite call end; the opposite call end thus hears both the terminal's text-converted audio and the terminal's environmental audio.
Specifically, the audio mixing processing of the environmental audio data and the audio signal includes:
b21, acquiring synchronous frame data of a first connection path and a second connection path, wherein the first connection path comprises the environmental audio data, and the second connection path comprises the audio signal;
and b22, performing superposition and mixing processing on the synchronous frame data to obtain superposition and mixing data.
Synchronized frame data of the first connection path and the second connection path is acquired, and the environmental audio data and the text-converted audio signal, synchronized across the two paths, are superposed and mixed to obtain the mixed audio data. By mixing the environmental audio with the text-converted audio and transmitting the result to the far end, this embodiment lets the far-end user obtain additional information from the ambient sound.
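A minimal sketch of the superposition mixing step follows: two frame-synchronized 16-bit PCM buffers (the microphone ambience and the text-converted speech) are added sample by sample with saturation so the sum cannot wrap around. Equal frame counts and an identical sample format and rate on both paths are assumed; the synchronization and any resampling described above are outside this sketch.

```c
#include <stdint.h>
#include <stddef.h>

/* Superpose two synchronized 16-bit PCM buffers into 'out'.
 * env : environmental audio from the first connection path
 * tts : text-converted audio from the second connection path
 * Both buffers are assumed frame-aligned, same rate and layout. */
static void mix_frames(const int16_t *env, const int16_t *tts,
                       int16_t *out, size_t samples)
{
    for (size_t i = 0; i < samples; i++) {
        int32_t sum = (int32_t)env[i] + (int32_t)tts[i];
        if (sum >  32767) sum =  32767;   /* clamp to int16 range */
        if (sum < -32768) sum = -32768;
        out[i] = (int16_t)sum;
    }
}
```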
In an implementation manner of this embodiment, obtaining the voice signal transmitted by the opposite call end includes: creating a loop interface; and calling a sound card reading function corresponding to the loop interface to acquire the voice signal transmitted by the opposite call end.
In this embodiment, referring to fig. 7, the voice signal from the opposite call end is delivered over the modem protocol to the output of the audio digital signal processing module (ADSP); because it does not pass through the Android system, it cannot be captured at the upper layer. Therefore, this embodiment creates a loopback interface and calls the sound-card read function corresponding to that interface to capture, from the ADSP, the voice signal transmitted by the opposite call end. The captured voice signal is then converted into the second text information and displayed on the terminal, so that the deaf-mute user can read what the opposite call end is expressing.
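The sketch below shows what the "create a loopback interface and read from the corresponding sound card" step could look like from user space: a capture PCM device representing the loopback FE is opened and far-end voice frames are read from it, ready to be handed to the speech-to-text module. The device name "hw:0,3", the 20 ms buffer size and the 16 kHz mono format are assumptions; the actual loopback FE node is platform-specific and must already be routed to the voice back end.

```c
#include <alsa/asoundlib.h>
#include <stdint.h>

#define CAP_FRAMES 320   /* 20 ms at 16 kHz */

/* Read far-end voice from the capture FE tied to the loopback path.
 * Each captured buffer is handed to the speech-to-text stage via
 * 'on_pcm'; the conversion and display themselves are not shown. */
static int capture_far_end_voice(void (*on_pcm)(const int16_t *, size_t))
{
    snd_pcm_t *cap = NULL;
    int16_t buf[CAP_FRAMES];

    /* "hw:0,3" is a placeholder for the loopback FE sound-card node */
    int err = snd_pcm_open(&cap, "hw:0,3", SND_PCM_STREAM_CAPTURE, 0);
    if (err < 0)
        return err;

    err = snd_pcm_set_params(cap, SND_PCM_FORMAT_S16_LE,
                             SND_PCM_ACCESS_RW_INTERLEAVED,
                             1, 16000, 1, 500000);
    if (err < 0) {
        snd_pcm_close(cap);
        return err;
    }

    for (;;) {                            /* loop until the call ends */
        snd_pcm_sframes_t n = snd_pcm_readi(cap, buf, CAP_FRAMES);
        if (n == -EPIPE) {                /* overrun: recover and retry */
            if (snd_pcm_recover(cap, (int)n, 1) < 0)
                break;
            continue;
        }
        if (n <= 0)
            break;
        on_pcm(buf, (size_t)n);           /* hand frames to ASR stage */
    }

    snd_pcm_close(cap);
    return 0;
}
```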
In another implementation manner of this embodiment, obtaining the voice signal transmitted by the opposite call end includes: the digital-to-analog converter of the terminal is adopted to convert the initial voice digital signal transmitted by the opposite terminal into the voice analog signal; a microphone of the terminal is adopted to acquire a voice analog signal output by the digital-to-analog converter; and carrying out analog-to-digital conversion on the voice analog signal by adopting an analog-to-digital converter of the microphone to obtain a target voice digital signal, wherein the input end of the analog-to-digital converter of the microphone is connected with the output end of the digital-to-analog converter.
Referring to fig. 7, the voice signal from the opposite call end reaches the terminal over the modem protocol; the call path goes through the modem module and then the audio digital signal processing module ADSP, and because the initial voice digital signal transmitted by the opposite call end does not pass through the Android system, it cannot be captured at the upper layer. In this embodiment, the sound played by the system is therefore captured in real time by means of a hardware (HW) tap. Referring to fig. 5, in a call scenario the initial voice digital signal transmitted by the opposite call end is converted by the terminal's first digital-to-analog converter DAC1 and second digital-to-analog converter DAC2 into left-channel and right-channel voice analog signals, which are output to the speaker for playback. Because a microphone normally only collects ambient analog audio, in this embodiment the inputs of the microphone's first analog-to-digital converter ADC1 and second analog-to-digital converter ADC2 are connected to the outputs of DAC1 and DAC2; the microphone path can thus obtain the voice analog signals output by the digital-to-analog converters, and the microphone's analog-to-digital converters then convert these analog signals into a target voice digital signal, which is converted into the second text information and displayed on the terminal. By placing a hardware circuit at the analog signal output and routing the analog signals into ADC1 and ADC2, the microphone can collect the environmental audio while the voice analog signal transmitted by the opposite call end is obtained at the same time.
According to the embodiment, the analog-to-digital converters ADC1 and ADC2 capture the audio analog signals transmitted by the opposite end of the call in real time, and convert the analog signals into digital signals, so that the function of capturing the sound played by the system in real time in a call scene can be realized. The CPU converts the audio digital signal into text information through a voice-to-text module and displays the text information.
In an optional implementation manner of this embodiment, acquiring the first text information input by the terminal may include: acquiring a commonly used term instruction selected by the terminal; and taking the target commonly used term corresponding to the commonly used term instruction as the first text information.
In this embodiment, commonly used phrases (for example, a phrase containing the user's home address) or a positioning-function menu can be preset. When the terminal user triggers the positioning-function instruction, the terminal obtains its current geographic location (for example via IP or GPS) and uses the corresponding geographic location information as the first text information. During a call, the terminal user can trigger a common-phrase instruction by tapping the corresponding phrase, so that pre-entered first text information is generated quickly. This raises the text-input rate and makes a normal voice call achievable in many scenarios, such as dialing an emergency number when the terminal user runs into a safety problem, or answering a call from a stranger.
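A trivial sketch of this common-phrase shortcut is given below: a preset table maps the selected menu entry to the text that becomes the first text information, which then feeds the same text-to-speech path as manually typed text. The phrases and the menu_index parameter are illustrative placeholders, not values defined by the application.

```c
#include <stddef.h>

/* Preset common phrases selectable during a call; the entries
 * here are only examples -- real ones are configured by the user. */
static const char *const common_phrases[] = {
    "Hello, I am a hearing-impaired user; I will reply by text-to-speech.",
    "My home address is ...",        /* filled in by the user beforehand */
    "Please call the police for me.",
};

/* Map the selected common-phrase instruction to the first text info. */
static const char *phrase_from_menu(size_t menu_index)
{
    size_t count = sizeof(common_phrases) / sizeof(common_phrases[0]);
    return (menu_index < count) ? common_phrases[menu_index] : NULL;
}
```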
Referring to fig. 3, after the call is established in this embodiment, the AFE loopback FE is connected to the BE to obtain the voice signal transmitted by the opposite call end, and a customized voice incall_music FE is connected to the BE. Referring to fig. 4, the first text information is entered in the text editing module and converted into an audio signal by the text-to-speech module; the PCM code stream is written to the designated FE sound-card node, passed over the voice path to the audio digital signal processing module ADSP, and sent to the far end by the modem call module. Conversely, the modem receives the far-end call voice and passes it to the ADSP, which plays the sound through a virtual output device; the sound is read from the FE sound card corresponding to the AFE loopback, converted into second text information by the speech-to-text module, and shown by the text display module. As shown in fig. 6, after the terminal user dials 119, the user sends text such as "Hello, my home is on fire"; the voice signal transmitted by the opposite end is converted into text such as "May I ask where you are?" and displayed; the user then taps the home-address shortcut menu, and the speech corresponding to the home-address information is transmitted to the opposite call end.
This embodiment enables a user to talk with others normally without speaking or hearing. In daily conversations, and especially when a deaf-mute user runs into a safety problem, what the user wants to say can be conveyed to the other party by telephone in time; although the deaf-mute user communicates with the opposite end in text, the opposite end hears normal speech expressing what the user wants to say.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present application.
Example 2
The embodiment also provides an auxiliary communication device for implementing the foregoing embodiments and preferred embodiments, which are not described in detail. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 8 is a block diagram of an auxiliary communication device according to an embodiment of the present application, as shown in fig. 8, the device includes:
an obtaining module 80, configured to obtain first text information input by a terminal after the terminal establishes a call connection with a call opposite terminal based on a telephone network;
the transmission module 82 is configured to convert the first text information into an audio signal, and transmit the audio signal to the opposite call end;
the display module 84 is configured to obtain a voice signal transmitted by the opposite terminal, convert the voice signal into second text information, and display the second text information on the terminal.
It should be noted that each of the above modules may be implemented by software or hardware, and for the latter, it may be implemented by, but not limited to: the modules are all located in the same processor; alternatively, the above modules may be located in different processors in any combination.
Example 3
An embodiment of the application also provides a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
Alternatively, in the present embodiment, the above-described storage medium may be configured to store a computer program for performing the steps of:
s1, after a terminal establishes call connection with a call opposite terminal based on a telephone network, acquiring first text information input by the terminal; converting the first text information into an audio signal, and transmitting the audio signal to the opposite call end;
s2, or, acquiring a voice signal transmitted by the opposite terminal, converting the voice signal into second text information, and displaying the second text information on the terminal.
Alternatively, in the present embodiment, the storage medium may include, but is not limited to: a usb disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a removable hard disk, a magnetic disk, or an optical disk, or other various media capable of storing a computer program.
An embodiment of the application also provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
Optionally, the electronic device may further include a transmission device and an input/output device, where the transmission device is connected to the processor, and the input/output device is connected to the processor.
Alternatively, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:
s1, after a terminal establishes call connection with a call opposite terminal based on a telephone network, acquiring first text information input by the terminal; converting the first text information into an audio signal, and transmitting the audio signal to the opposite call end;
s2, or, acquiring a voice signal transmitted by the opposite terminal, converting the voice signal into second text information, and displaying the second text information on the terminal.
Alternatively, specific examples in this embodiment may refer to examples described in the foregoing embodiments and optional implementations, and this embodiment is not described herein.
The foregoing embodiment numbers of the present application are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present application, the descriptions of the embodiments are emphasized, and for a portion of this disclosure that is not described in detail in this embodiment, reference is made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technology may be implemented in other manners. The above-described apparatus embodiments are merely exemplary; the division into units is merely a logical functional division, and another division may be used in actual implementation: for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling or communication connection shown or discussed between the parts may be through some interfaces, units or modules, and may be electrical or in other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in part or all of the technical solution or in part in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a removable hard disk, a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The foregoing is merely a preferred embodiment of the present application. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present application, and such modifications and adaptations shall also fall within the scope of the present application.
Claims (10)
1. An auxiliary call method based on man-machine interaction is characterized by comprising the following steps:
after a terminal establishes call connection with a call opposite terminal based on a telephone network, acquiring first text information input by the terminal; converting the first text information into an audio signal, and transmitting the audio signal to the opposite call end;
or, acquiring the voice signal transmitted by the opposite terminal, converting the voice signal into second text information, and displaying the second text information on the terminal.
2. The method of claim 1, wherein transmitting the audio signal to the call partner comprises:
adding a second connection path in a background sound scene on a first connection path of the terminal, wherein the second connection path is a connection path between a first front end FE and a rear end BE, and the first connection path is a connection path between the second front end FE and the rear end BE established when a call is connected;
and sending the audio signal from the first FE to the back end BE, and sending the audio signal to the opposite call end through the back end BE.
3. The method of claim 1, wherein transmitting the audio signal to the call partner comprises:
invoking a first connection path between a second FE of the terminal and a rear end BE, and acquiring environmental audio data acquired by a microphone of the terminal by adopting the second FE;
performing superposition and mixing processing on the environmental audio data and the audio signal to obtain superposition and mixing mixed audio data;
and transmitting the mixed sound data to the opposite call end.
4. The method of claim 3, wherein performing the superposition mixing process on the environmental audio data and the audio signal to obtain the mixed data after the superposition mixing comprises:
acquiring synchronous frame data of a first connecting path and a second connecting path, wherein the first connecting path comprises the environmental audio data, and the second connecting path comprises the audio signal;
and performing superposition and mixing processing on the synchronous frame data to obtain superposition and mixing mixed data.
5. The method of claim 1, wherein obtaining the voice signal transmitted by the call partner comprises:
the digital-to-analog converter of the terminal is adopted to convert the initial voice digital signal transmitted by the opposite terminal into the voice analog signal;
a microphone of the terminal is adopted to acquire a voice analog signal output by the digital-to-analog converter;
and carrying out analog-to-digital conversion on the voice analog signal by adopting an analog-to-digital converter of the microphone to obtain a target voice digital signal, wherein the input end of the analog-to-digital converter of the microphone is connected with the output end of the digital-to-analog converter.
6. The method of claim 1, wherein obtaining the voice signal transmitted by the call partner comprises:
creating a loop interface;
and calling a sound card reading function corresponding to the loop interface to acquire the voice signal transmitted by the opposite call end.
7. The method of claim 1, wherein obtaining the first text information input by the terminal comprises:
acquiring a commonly used term instruction selected by the terminal;
and taking the target commonly used term corresponding to the commonly used term instruction as the first text information.
8. Auxiliary communication device based on human-computer interaction, characterized by comprising:
the terminal comprises an acquisition module, a communication module and a communication module, wherein the acquisition module is used for acquiring first text information input by the terminal after the terminal establishes communication connection with a communication opposite terminal based on a telephone network;
the transmission module is used for converting the first text information into an audio signal and transmitting the audio signal to the opposite call end;
the display module is used for acquiring the voice signal transmitted by the opposite terminal of the call, converting the voice signal into second text information and displaying the second text information on the terminal.
9. The electronic equipment is characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus; wherein:
a memory for storing a computer program;
a processor for executing the method steps of any one of claims 1 to 7 by running a program stored on a memory.
10. A storage medium comprising a stored program, wherein the program when run performs the method steps of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311113593.8A CN116847023A (en) | 2023-08-31 | 2023-08-31 | Auxiliary call method and device based on man-machine interaction |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311113593.8A CN116847023A (en) | 2023-08-31 | 2023-08-31 | Auxiliary call method and device based on man-machine interaction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116847023A true CN116847023A (en) | 2023-10-03 |
Family
ID=88160234
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311113593.8A Pending CN116847023A (en) | 2023-08-31 | 2023-08-31 | Auxiliary call method and device based on man-machine interaction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116847023A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6240392B1 (en) * | 1996-08-29 | 2001-05-29 | Hanan Butnaru | Communication device and method for deaf and mute persons |
KR20100063849A (en) * | 2008-11-28 | 2010-06-14 | 주식회사 케이티 | Device for converting message, call method and system using the same |
CN108650419A (en) * | 2018-05-09 | 2018-10-12 | 深圳市知远科技有限公司 | Telephone interpretation system based on smart mobile phone |
CN113194203A (en) * | 2020-09-30 | 2021-07-30 | 杭州方得智能科技有限公司 | Communication system, answering and dialing method and communication system for hearing-impaired people |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20231003