WO2020051852A1

WO2020051852A1 - Method for recording and displaying information in communication process, and terminals

Info

Publication number: WO2020051852A1
Application number: PCT/CN2018/105571
Authority: WO
Inventors: 王骅
Original assignee: 华为技术有限公司
Priority date: 2018-09-13
Filing date: 2018-09-13
Publication date: 2020-03-19
Also published as: CN111819830B; CN111819830A

Abstract

Disclosed are a method for recording and displaying information in a communication process, and terminals, which relate to the technical field of communications and can improve the relevance between speech data recorded during the process of speech communication and a call record of the speech communication. The specific solution involves: during the process of a first terminal performing speech communication with a second terminal, the first terminal identifying speech data captured by a microphone of the first terminal; if text corresponding to first speech data captured by the microphone matches a preset startup word, the first terminal starting to record speech data of the second terminal; upon completion of speech data recording, the first terminal displaying a first interface comprising the text corresponding to the speech data; and the first terminal receiving a first operation of a user on the text corresponding to the speech data, and playing the speech data in response to the first operation.

Description

Method and terminal for recording and displaying information in communication process

Technical field

Embodiments of the present application relate to the field of communication technologies, and in particular, to a method and terminal for recording and displaying information during communication.

Background technique

With the development of electronic technology, electronic terminals (such as mobile phones) have more and more functions, and users are increasingly dependent on mobile phones. For example, the mobile phone can not only be used as a communication or entertainment tool, but also include a memo function and a voice recorder, that is, the user can record some text or picture information in the mobile phone's memo and record some voice information in the mobile phone's recorder.

To make it convenient for a user to record information related to a voice communication while using a mobile phone for that communication, entrances to the memo and the recorder are generally integrated into the voice communication interface of the mobile phone. As shown in FIG. 1, the voice communication interface 101 of the mobile phone 100 includes a start button 103 for the memo and a start button 102 for the recorder. The user can thus start the memo or the recorder directly through these entrances during the voice communication and use it to record information related to the voice communication.

However, after the voice call ends, if the user wants to view the information recorded through the memo or the recorder, the user still needs to start the memo or the recorder in the mobile phone to view the corresponding information. In addition, when viewing the information recorded by the memo or the recorder, the user may need to browse or listen to multiple records to find the information related to the voice call, which results in a poor user experience.

Summary of the Invention

The embodiments of the present application provide a method and a terminal for recording and displaying information during a communication process, which can improve the correlation between the voice data recorded during the voice communication and the call record of the voice communication.

In a first aspect, an embodiment of the present application provides a method for recording and displaying information during a communication process, which can be applied to a process in which a first terminal performs voice communication with a second terminal. During the voice communication between the first terminal and the second terminal, the first terminal may identify the voice data captured by the microphone of the first terminal; if the text corresponding to first voice data captured by the microphone matches a preset start word, the first terminal starts recording the voice data of the second terminal, where the voice data of the second terminal is converted from the audio electrical signal received from the second terminal; after the voice data recording ends, the first terminal displays a first interface that includes the text corresponding to the recorded voice data; the first terminal receives a first operation performed by the user on the text corresponding to the voice data, and plays the recorded voice data in response to the first operation.

In the embodiment of the present application, during the voice communication between the first terminal and the second terminal, when the text corresponding to the first voice data captured by the microphone of the first terminal matches the text of the preset start word, the first terminal may automatically start recording the voice data corresponding to the voice communication. In other words, after receiving the voice command issued by the user (that is, the first voice data whose text matches the text of the preset start word), the first terminal can automatically start recording the voice data corresponding to the voice communication. The first terminal may display a first interface including the text corresponding to the voice data after the voice data recording ends. That is, the first terminal can intuitively display to the user the text information corresponding to the voice data that it recorded. In addition, the first terminal may play the voice data in response to the user's first operation on the text corresponding to the voice data, thereby improving the relevance of the text information and the voice data.

In summary, the first terminal may automatically record corresponding voice data in response to a voice command issued by a user during a voice call. This solution makes the terminal more intelligent, improves the interaction performance between the terminal and the user, and improves the user experience.

With reference to the first aspect, in a possible design manner, the first terminal may automatically display the first interface after the voice data recording ends. Alternatively, the first terminal may display the first interface in response to the end of the voice communication, where the voice data recording has already ended when the voice communication ends, or the first terminal ends the voice data recording when the voice communication ends. Alternatively, after the voice data recording ends, the first terminal may display the first interface in response to a second operation input by the user. The second operation is used to instruct the first terminal to display the call record interface of the first terminal. In this case the first interface is the call record interface. The call record interface includes a call record item of the voice communication. The call record item is used to record the call record information of the voice communication and includes the text corresponding to the recorded voice data. Alternatively, after the voice data recording ends, the first terminal may display the first interface in response to a third operation performed by the user on the call record item of the voice communication in the call record interface. The third operation is used to instruct the first terminal to display a record detail interface of the voice communication. In this case the first interface is the record detail interface. The record detail interface is used to display the call record information of the voice communication and includes the text corresponding to the recorded voice data.

With reference to the first aspect, in another possible design manner, the first terminal can record not only the voice data of the second terminal, but also the voice data captured by the microphone of the first terminal. The method in the embodiment of the present application may further include: if the text corresponding to the first voice data captured by the microphone matches the preset start word, the first terminal starts recording the voice data captured by the microphone. In the embodiment of the present application, the first terminal can record not only the voice data of the second terminal but also the voice data captured by the microphone of the first terminal. That is, the first terminal can record the conversation between the calling party and the called party.

With reference to the first aspect, in another possible design manner, the text corresponding to the recorded voice data includes at least two pieces of text information. The recorded voice data includes at least two pieces of voice data. The at least two pieces of text information correspond to at least two pieces of voice data. The first terminal may receive a user's first operation on the first text information in at least two pieces of text information; in response to the first operation, the first terminal plays the first voice data segment. The first text information is a piece of text information among at least two pieces of text information. The first text information corresponds to a first voice data segment.

The text corresponding to the recorded voice data may include at least two pieces of text information. The first terminal may receive a user's first operation on at least two pieces of text information and play a corresponding voice data segment. That is, the user can selectively control the first terminal to play any one of the at least two pieces of voice data recorded by the first terminal.
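Purely as an illustration of the one-to-one correspondence described above (this sketch is not part of the application), the pairing of text information and voice data segments can be modeled as a lookup keyed by a segment identifier. The names `entries` and `segment_for_text`, the integer identifiers, and the byte strings standing in for audio are all assumptions made for the example.

```python
from typing import Dict, Tuple

# Hypothetical store: segment id -> (text information, recorded audio bytes).
Entry = Tuple[str, bytes]


def segment_for_text(entries: Dict[int, Entry], tapped_id: int) -> bytes:
    """Return the voice data segment paired with the tapped text information."""
    _text, audio = entries[tapped_id]
    return audio


# Two pieces of text information, each paired with exactly one segment.
entries: Dict[int, Entry] = {
    1: ("Meet at nine", b"\x01\x02"),
    2: ("Room 305", b"\x03\x04"),
}

# A first operation on the second piece of text selects only its own segment.
selected_audio = segment_for_text(entries, 2)
```

A real terminal would hand the selected segment to its audio player; here the function only returns the bytes so that the selection logic stays visible.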

With reference to the first aspect, in another possible design manner, when the first terminal converts the recorded voice data into text information, the converted text information may not be completely consistent with the text of the voice data recorded by the first terminal. That is, some errors may occur in the text information converted by the first terminal. Based on this situation, the first terminal may receive a fourth operation (ie, a modification operation) performed by the user on the second text information, and the fourth operation is used to modify the second text information into the third text information. The second text information is a piece of text information among at least two pieces of text information, and the second text information corresponds to the second voice data segment. In response to the fourth operation, the first terminal modifies the second text information into the third text information. The user can control the first terminal to play the second voice data segment corresponding to the second text information, and compare the second voice data segment played by the first terminal with the second text information displayed by the first terminal. When the second text information is inconsistent with the second voice data segment played by the first terminal, the first terminal is operated to modify the second text information. After the second text information is modified, the first terminal may display the third text information on the first interface. The first terminal receives a user's first operation on the third text information, and plays a second voice data segment (that is, a voice data segment corresponding to the second text information).

In the embodiment of the present application, the first terminal may respond to a user's operation of modifying text information, and replace the text information before modification with the modified text information. In this way, the user may modify the text information obtained by the first terminal converting the voice data according to the voice data stored in the first terminal, and may correct an error that occurs when the first terminal converts the voice data to obtain the text information.
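The modification described above can be sketched as follows, under the assumption that each displayed text item carries an identifier of its voice data segment. `TranscriptItem` and `modify_text` are hypothetical names introduced only for this example. The point shown is that the corrected text replaces the old text while the link to the same voice data segment is kept.

```python
from dataclasses import dataclass


@dataclass
class TranscriptItem:
    text: str        # text information shown on the first interface
    segment_id: int  # identifies the corresponding voice data segment


def modify_text(item: TranscriptItem, corrected: str) -> TranscriptItem:
    """Replace the text but keep the link to the same voice data segment."""
    return TranscriptItem(text=corrected, segment_id=item.segment_id)


second = TranscriptItem(text="Room 350", segment_id=2)  # conversion error
third = modify_text(second, "Room 305")                 # fourth operation
```

Because `third.segment_id` is unchanged, a first operation on the third text information would still play the second voice data segment.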

With reference to the first aspect, in another possible design manner, the text corresponding to the recorded voice data may include at least two pieces of text information. When the first terminal displays the text corresponding to the recorded voice data on the first interface, the first terminal may display the at least two pieces of text information on the first interface in accordance with the chronological order in which the voice data segments corresponding to the text information were recorded and the source information of the voice data segments corresponding to the text information. The source information is used to indicate whether a voice data segment is voice data captured by the microphone or voice data of the second terminal.

The first terminal displays at least two pieces of text information on the first interface in accordance with the chronological order of the voice data segments corresponding to the recorded text information and the source information of the voice data segments corresponding to the text information. In this way, the content of the conversation between the calling party and the called party can be clearly displayed to the user.
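One possible way to picture this ordering is sketched below. It is an illustration under stated assumptions, not the application's implementation: the `Source` enum, the `Item` fields, the `start_time` timestamp, and the bracketed label format are all invented for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Source(Enum):
    MICROPHONE = "local"        # voice data captured by the microphone
    SECOND_TERMINAL = "remote"  # voice data of the second terminal


@dataclass
class Item:
    start_time: float  # seconds since the call began (assumed timestamp)
    source: Source
    text: str


def render(items: List[Item]) -> List[str]:
    """Order text by recording time and label each line with its source."""
    ordered = sorted(items, key=lambda item: item.start_time)
    return [f"[{item.source.value}] {item.text}" for item in ordered]


lines = render([
    Item(5.0, Source.SECOND_TERMINAL, "Room 305"),
    Item(1.0, Source.MICROPHONE, "Which room?"),
])
```

Sorting by time and tagging by source yields a dialogue-like listing in which the user's question precedes the other party's answer.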

With reference to the first aspect, in another possible design manner, when the first terminal starts to record the voice data, the first terminal may prompt the user that it has started recording. Specifically, if the text corresponding to the first voice data captured by the microphone matches the preset start word, the first terminal may issue first prompt information. The first prompt information is used to prompt the user that the first terminal has started recording voice data. The first prompt information is a prompt tone or a vibration prompt.

In the embodiment of the present application, the first terminal may issue the first prompt information when the text corresponding to the first voice data matches the preset start word, that is, when the first terminal starts recording the voice data, to prompt the user that the first terminal has started recording voice data. In this way, the user can learn through the first prompt information that the first terminal has started recording voice data, which increases the direct interaction between the first terminal and the user, improves the interaction performance between the first terminal and the user, and improves the user experience.

With reference to the first aspect, in another possible design manner, before the first terminal displays the first interface, the method in the embodiment of the present application may further include: during the recording of the voice data, if the text corresponding to second voice data captured by the microphone matches a preset end word, the first terminal stops recording the voice data.

In the process of recording the voice data by the first terminal, when the text corresponding to the second voice data captured by the microphone of the first terminal matches the text of the preset ending word, the first terminal may automatically stop recording the voice data. In other words, the first terminal can automatically stop recording voice data after receiving a voice command (that is, second voice data whose text matches a preset end word text) issued by the user.

In summary, the first terminal can automatically record and stop recording voice data in response to a voice command issued by a user during a voice call. This solution makes the terminal more intelligent, improves the interaction performance between the terminal and the user, and improves the user experience.
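The start-word and end-word behavior summarized above can be pictured as a small state machine. The following is a minimal sketch under stated assumptions, not the method as claimed: the phrases in `START_WORD` and `END_WORD`, the `CallRecorder` class, the exact-match comparison, and the use of strings as stand-ins for audio chunks are all hypothetical, and speech recognition is assumed to have already produced the text.

```python
from dataclasses import dataclass, field
from typing import List

START_WORD = "start recording"  # hypothetical preset start word
END_WORD = "stop recording"     # hypothetical preset end word


@dataclass
class CallRecorder:
    """Tracks whether the first terminal is recording during a call."""
    recording: bool = False
    recorded: List[str] = field(default_factory=list)

    def on_microphone_text(self, text: str) -> None:
        """Handle text recognized from voice data captured by the microphone."""
        normalized = text.strip().lower()
        if not self.recording and normalized == START_WORD:
            self.recording = True   # start word matched: begin recording
        elif self.recording and normalized == END_WORD:
            self.recording = False  # end word matched: stop recording

    def on_remote_voice(self, chunk: str) -> None:
        """Keep voice data of the second terminal only while recording."""
        if self.recording:
            self.recorded.append(chunk)


recorder = CallRecorder()
recorder.on_remote_voice("before")             # ignored: not recording yet
recorder.on_microphone_text("Start recording")
recorder.on_remote_voice("address details")    # kept
recorder.on_microphone_text("stop recording")
recorder.on_remote_voice("after")              # ignored: recording stopped
```

Only the chunk received between the two voice commands ends up in `recorder.recorded`, which mirrors the automatic start and stop described in the text.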

With reference to the first aspect, in another possible design manner, when the first terminal stops recording voice data, the first terminal may prompt the user that it has stopped recording. Specifically, during the recording of the voice data, if the text corresponding to the second voice data captured by the microphone matches the preset end word, the first terminal may issue second prompt information. The second prompt information is used to prompt the user that the first terminal has stopped recording the voice data. The second prompt information is a prompt tone or a vibration prompt.

In the embodiment of the present application, the first terminal may issue the second prompt information when the text corresponding to the second voice data matches the preset end word, that is, when the first terminal stops recording voice data, to prompt the user that the first terminal has stopped recording voice data. In this way, the user can learn through the second prompt information that the first terminal has stopped recording voice data, which increases the direct interaction between the first terminal and the user, improves the interaction performance between the first terminal and the user, and improves the user experience.

With reference to the first aspect, in another possible design manner, when the first terminal uses a speaker to play voice data and the first prompt information or the second prompt information is a prompt tone, the prompt tone issued by the first terminal may be captured by the microphone of the first terminal. In this case, when the text corresponding to the first voice data matches the preset start word, if the first terminal determines that it is using the speaker to play the voice data, the first terminal may perform echo suppression on the voice data collected by the microphone according to the voice data played by the speaker.

In this way, the voice data after the echo suppression will not include the above-mentioned prompt tone. The voice data sent by the first terminal to the second terminal is voice data after echo suppression. The voice data sent by the first terminal to the second terminal does not include the above-mentioned prompt tone. The user of the second terminal will not hear the above-mentioned prompt tone.

Further, after playing the second prompt information, the first terminal may stop performing echo suppression on the voice data collected by the microphone. In this way, the power consumption caused by the first terminal continuously performing echo suppression can be avoided, and the battery life of the first terminal can be extended.
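The window during which echo suppression runs can be sketched as a simple gate. This is only an illustration of the timing described above: the `EchoGate` class and its method names are hypothetical, and the echo suppression algorithm itself is deliberately left out.

```python
class EchoGate:
    """Decides when echo suppression should run on microphone voice data."""

    def __init__(self, speaker_in_use: bool) -> None:
        self.speaker_in_use = speaker_in_use
        self.active = False

    def on_start_word(self) -> None:
        # Enable suppression only if the speaker is playing the voice data,
        # since only then can the prompt tone leak into the microphone.
        self.active = self.speaker_in_use

    def on_second_prompt_played(self) -> None:
        # Stop suppression once the second prompt has been played.
        self.active = False


gate = EchoGate(speaker_in_use=True)
gate.on_start_word()
active_during_recording = gate.active
gate.on_second_prompt_played()
active_after_recording = gate.active
```

Limiting suppression to this window reflects the power-saving rationale given in the text: the extra processing runs only while a prompt tone could reach the second terminal.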

With reference to the first aspect, in another possible design manner, the first terminal may save at least two pieces of voice data in accordance with the chronological order of recording each piece of voice data and the source information of each piece of voice data. The source information is used to indicate that the voice data segment is captured by the microphone, or the source information is used to indicate that the voice data segment is the voice data of the second terminal.

With reference to the first aspect, in another possible design manner, the above-mentioned first interface may include not only the at least two pieces of text information but also at least two player plug-ins. The at least two player plug-ins are used to play the at least two voice data segments, and the at least two player plug-ins correspond one-to-one to the at least two pieces of text information.

In a second aspect, an embodiment of the present application provides a terminal, and the terminal is a first terminal. The terminal includes: one or more processors, a memory, a touch screen, a microphone, a communication interface, a receiver, and a speaker. The memory, the touch screen, and the communication interface are coupled to the processor; the touch screen is used to display images generated by the processor; the microphone is used to capture voice data; the memory is used to store computer program code; the computer program code includes computer instructions. When the processor executes the computer instructions, the processor is configured to: perform voice communication with the second terminal through the communication interface; identify the voice data captured by the microphone; and, if it is recognized that the text corresponding to first voice data captured by the microphone matches a preset start word, start recording the voice data of the second terminal, where the voice data of the second terminal is converted from the audio electrical signal received from the second terminal. The processor is further configured to store the recorded voice data in the memory. The processor is further configured to control the touch screen to display a first interface after the voice data recording ends, where the first interface includes the text corresponding to the recorded voice data. The processor is further configured to receive a first operation performed by the user on the text displayed on the touch screen, and to control the receiver or the speaker to play the voice data in response to the first operation.

With reference to the second aspect, in a possible design manner, that the processor is configured to control the touch screen to display the first interface after the voice data recording ends includes one of the following. The processor is configured to automatically control the touch screen to display the first interface after the voice data recording ends. Or, the processor is configured to control the touch screen to display the first interface in response to the end of the voice communication, where the voice data recording ends when the voice communication ends. Or, the processor is configured to control the touch screen to display the first interface in response to a second operation input by the user after the voice data recording ends, where the second operation is used to instruct the terminal to display the call record interface of the terminal, the first interface is the call record interface, the call record interface includes a call record item of the voice communication, the call record item is used to record the call record information of the voice communication, and the call record item includes the text corresponding to the recorded voice data. Or, the processor is configured to control the touch screen to display the first interface in response to a third operation performed by the user, after the voice data recording ends, on the call record item of the voice communication in the call record interface, where the third operation is used to instruct the terminal to display a record detail interface of the voice communication, the first interface is the record detail interface, the record detail interface is used to display the call record information of the voice communication, and the record detail interface includes the text corresponding to the recorded voice data.

With reference to the second aspect, in another possible design manner, the processor is further configured to: if it is recognized that the text corresponding to the first voice data captured by the microphone matches the preset start word, start recording the voice data captured by the microphone.

With reference to the second aspect, in another possible design manner, the text corresponding to the recorded voice data includes at least two pieces of text information, the voice data includes at least two voice data segments, and the at least two pieces of text information correspond one-to-one to the at least two voice data segments. That the processor is configured to receive a first operation performed by the user on the text displayed on the touch screen, and to control the receiver or the speaker to play the voice data in response to the first operation, includes: the processor is configured to receive a first operation performed by the user on first text information among the at least two pieces of text information, and, in response to the first operation, to control the receiver or the speaker to play the first voice data segment. The first text information is one of the at least two pieces of text information, and the first text information corresponds to the first voice data segment.

With reference to the second aspect, in another possible design manner, the foregoing processor is further configured to: receive a fourth operation performed by the user on second text information displayed on the touch screen, where the fourth operation is used to modify the second text information into third text information, the second text information is one of the at least two pieces of text information, and the second text information corresponds to the second voice data segment; modify the second text information into the third text information in response to the fourth operation; and store the third text information, and the correspondence between the third text information and the second voice data segment, in the memory. The processor is further configured to control the touch screen to display the third text information on the first interface, to receive a first operation performed by the user on the third text information displayed on the touch screen, and to control the receiver or the speaker to play the second voice data segment.

With reference to the second aspect, in another possible design manner, that the processor is configured to control the touch screen to display the first interface includes: the processor is configured to control the touch screen to display the at least two pieces of text information on the first interface in accordance with the chronological order in which the voice data segments corresponding to the text information were recorded and the source information of the voice data segments corresponding to the text information. The source information is used to indicate whether a voice data segment is voice data captured by the microphone or voice data of the second terminal.

With reference to the second aspect, in another possible design manner, the processor is further configured to issue first prompt information if it is recognized that the text corresponding to the first voice data captured by the microphone matches the preset start word. The first prompt information is used to prompt the user that the terminal has started recording voice data, and the first prompt information is a prompt tone or a vibration prompt.

With reference to the second aspect, in another possible design manner, the processor is further configured to, before controlling the touch screen to display the first interface, stop recording the voice data if, during the recording of the voice data, it is recognized that the text corresponding to second voice data captured by the microphone matches a preset end word.

With reference to the second aspect, in another possible design manner, the foregoing processor is further configured to issue second prompt information if, during the recording of the voice data, it is recognized that the text corresponding to the second voice data captured by the microphone matches the preset end word. The second prompt information is used to prompt the user that the first terminal has stopped recording the voice data, and the second prompt information is a prompt tone or a vibration prompt.

With reference to the second aspect, in another possible design manner, the first prompt information and the second prompt information are prompt tones, and the processor is further configured to: when the text corresponding to the first voice data matches the preset start word, determine that the speaker is being used to play the voice data; perform echo suppression on the voice data collected by the microphone according to the voice data played by the speaker; and stop the echo suppression of the voice data collected by the microphone after the receiver or the speaker plays the second prompt information.

With reference to the second aspect, in another possible design manner, the foregoing memory stores at least two pieces of voice data in accordance with the chronological order of recording each piece of voice data and the source information of each piece of voice data. The source information is used to indicate that the voice data segment is captured by a microphone, or the source information is used to indicate that the voice data segment is voice data of the second terminal.

With reference to the second aspect, in another possible design manner, the first interface displayed on the touch screen further includes at least two player plug-ins. The at least two player plug-ins correspond one-to-one to the at least two pieces of text information, and the at least two player plug-ins correspond one-to-one to the at least two voice data segments. The processor is further configured to receive a click operation performed by the user on a first player plug-in among the at least two player plug-ins, and to control the receiver or the speaker to play the voice data segment corresponding to the first player plug-in.

In a third aspect, an embodiment of the present application provides a computer storage medium, where the computer storage medium includes computer instructions, and when the computer instructions are executed on a terminal, the terminal is caused to execute the method for recording and displaying information in a communication process according to the first aspect and any one of its possible design manners.

According to a fourth aspect, an embodiment of the present application provides a computer program product. When the computer program product runs on a computer, the computer is caused to execute the method for recording and displaying information in a communication process according to the first aspect and any one of its possible design manners.

In addition, for the technical effects brought by the terminal described in the second aspect and any one of its design manners, the computer storage medium described in the third aspect, and the computer program product described in the fourth aspect, refer to the technical effects brought by the foregoing first aspect and its different design manners. Details are not repeated here.

Brief description of the drawings

FIG. 1 is a first schematic diagram of a display interface example provided by an embodiment of the present application;

FIG. 2 is a schematic diagram of a hardware structure of a terminal according to an embodiment of the present application;

FIG. 3 is a schematic diagram of an example communication scenario according to an embodiment of the present application;

FIG. 4 is a first flowchart of a method for recording and displaying information in a communication process according to an embodiment of the present application;

FIG. 5 is a second schematic diagram of a display interface example provided by an embodiment of the present application;

FIG. 6 is a third schematic diagram of a display interface example provided by an embodiment of the present application;

FIG. 7 is a fourth schematic diagram of a display interface example provided by an embodiment of the present application;

FIG. 8 is a fifth schematic diagram of a display interface example provided by an embodiment of the present application;

FIG. 9 is a sixth schematic diagram of a display interface example provided by an embodiment of the present application;

FIG. 10A is a second flowchart of a method for recording and displaying information in a communication process according to an embodiment of the present application;

FIG. 10B is a third flowchart of a method for recording and displaying information in a communication process according to an embodiment of the present application;

FIG. 10C is a seventh schematic diagram of a display interface example provided by an embodiment of the present application;

FIG. 11 is an eighth schematic diagram of a display interface example provided by an embodiment of the present application;

FIG. 12 is a ninth schematic diagram of a display interface example provided by an embodiment of the present application;

FIG. 13 is a tenth schematic diagram of a display interface example provided by an embodiment of the present application;

FIG. 14 is a first schematic structural diagram of a terminal according to an embodiment of the present application;

FIG. 15 is a second schematic structural diagram of a terminal according to an embodiment of the present application.

Detailed description

The embodiments of the present application provide a method and a terminal for recording and displaying information in a communication process, which can be applied to a process in which a first terminal performs voice communication with a second terminal. Specifically, during the voice communication between the first terminal and the second terminal, when the text corresponding to the first voice data captured by the microphone of the first terminal matches the text of the preset start word, the first terminal may automatically start recording the voice data corresponding to the voice communication. The voice data captured by the microphone of the first terminal is voice data uttered by the user of the first terminal. In other words, after receiving the voice command issued by the user of the first terminal (that is, the first voice data whose text matches the text of the preset start word), the first terminal can automatically start recording the voice data corresponding to the voice communication.

Of course, when the text corresponding to the second voice data captured by the microphone of the first terminal matches the text of the preset ending word, the first terminal may automatically stop recording the voice data.

In summary, during the voice communication between the first terminal and the second terminal, the first terminal can automatically record and stop recording voice data in response to a user's voice command. This solution makes the terminal more intelligent, improves the interaction performance between the terminal and the user, and improves the user experience.
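The start and stop behavior summarized above can be sketched as a small state machine. This is a minimal illustration, not the claimed implementation: the class `CallRecorder`, its method names, and the phrases in `START_WORDS` and `END_WORDS` are hypothetical, and the speech captured by the microphone is assumed to have already been converted to text.

```python
from dataclasses import dataclass, field

# Hypothetical phrases; the application only says the words are "preset".
START_WORDS = {"start recording"}
END_WORDS = {"stop recording"}


@dataclass
class CallRecorder:
    """Tracks whether call audio is being recorded during one voice call."""

    recording: bool = False
    segments: list = field(default_factory=list)

    def on_local_speech(self, text: str) -> None:
        """Handle text recognized from the first terminal's own microphone."""
        phrase = text.strip().lower()
        if not self.recording and phrase in START_WORDS:
            self.recording = True
        elif self.recording and phrase in END_WORDS:
            self.recording = False

    def on_call_audio(self, chunk: bytes) -> None:
        """Keep call audio only while recording is active."""
        if self.recording:
            self.segments.append(chunk)


recorder = CallRecorder()
recorder.on_call_audio(b"before")            # dropped: not recording yet
recorder.on_local_speech("Start recording")  # start word matched
recorder.on_call_audio(b"address")           # kept
recorder.on_local_speech("Stop recording")   # end word matched
recorder.on_call_audio(b"after")             # dropped: recording stopped
```

Only the audio that arrives between the start word and the end word is retained, which mirrors the automatic start and stop described in the summary.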

The terminal in the embodiments of the present application may be a portable computer (such as a mobile phone), a notebook computer, a personal computer (PC), a wearable electronic device (such as a smart watch), a tablet computer, an augmented reality (AR) or virtual reality (VR) device, an on-board computer, or the like. The following embodiments do not limit the specific form of the terminal.

Please refer to FIG. 2, which illustrates a structural block diagram of a terminal 200 provided by an embodiment of the present application. The terminal 200 may include a processor 210, an external memory interface 220, an internal memory 221, a universal serial bus (USB) interface 230, a charge management module 240, a power management module 241, a battery 242, an antenna 1, an antenna 2, a radio frequency module 250, a communication module 260, an audio module 270, a speaker 270A, a receiver 270B, a microphone 270C, a headset interface 270D, a sensor module 280, keys 290, a motor 291, an indicator 292, a camera 293, a display screen 294, a subscriber identification module (SIM) card interface 295, and the like. The sensor module 280 may include a pressure sensor 280A, a gyro sensor 280B, an air pressure sensor 280C, a magnetic sensor 280D, an acceleration sensor 280E, a distance sensor 280F, a proximity light sensor 280G, a fingerprint sensor 280H, a temperature sensor 280J, a touch sensor 280K, an ambient light sensor 280L, a bone conduction sensor 280M, and the like.

The structure illustrated in the embodiment of the present application does not limit the terminal 200. The terminal 200 may include more or fewer components than shown, some components may be combined or split, or the components may be arranged differently. The illustrated components can be implemented in hardware, software, or a combination of software and hardware.

The processor 210 may include one or more processing units. For example, the processor 210 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors.

The above-mentioned controller may be a decision maker that instructs each component of the terminal 200 to coordinate work according to instructions. It is the nerve center and command center of the terminal 200. The controller generates operation control signals according to the instruction operation code and timing signals, and completes the control of fetching and executing the instructions.

The processor 210 may further include a memory for storing instructions and data. In some embodiments, the memory in the processor 210 is a cache memory, which can store instructions or data that the processor 210 has just used or uses repeatedly. If the processor 210 needs to use the instructions or data again, it can fetch them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 210, thereby improving the efficiency of the system.

In some embodiments, the processor 210 may include an interface. The interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a SIM interface, and/or a USB interface.

The I2C interface is a two-way synchronous serial bus that includes a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor 210 may include multiple sets of I2C buses. The processor 210 may be coupled to the touch sensor 280K, a charger, a flash, the camera 293, and the like through different I2C bus interfaces. For example, the processor 210 may be coupled to the touch sensor 280K through the I2C interface, so that the processor 210 and the touch sensor 280K communicate through the I2C bus interface to implement the touch function of the terminal 200.

The I2S interface can be used for audio communication. In some embodiments, the processor 210 may include multiple sets of I2S buses. The processor 210 may be coupled to the audio module 270 through an I2S bus to implement communication between the processor 210 and the audio module 270. In some embodiments, the audio module 270 can transmit audio signals to the communication module 260 through the I2S interface, so as to implement the function of answering calls through a Bluetooth headset.

The PCM interface can also be used for audio communication, and for sampling, quantizing, and encoding analog signals. In some embodiments, the audio module 270 and the communication module 260 may be coupled through a PCM bus interface. In some embodiments, the audio module 270 may also transmit audio signals to the communication module 260 through the PCM interface, so as to implement the function of answering a call through a Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication, and the sampling rates of the two interfaces are different.
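The three PCM steps named above (sampling, quantizing, encoding) can be illustrated in a few lines. This is a sketch under stated assumptions, not a description of the PCM interface hardware: the function `pcm_encode`, the 8 kHz sample rate, the 16-bit depth, and the 1 kHz test tone are all chosen here for illustration and do not come from the application.

```python
import math


def pcm_encode(signal, sample_rate_hz, duration_s, bits=16):
    """Sample a continuous signal and quantize it to signed integer codes.

    signal: function mapping time in seconds to an amplitude in [-1.0, 1.0].
    """
    max_code = 2 ** (bits - 1) - 1
    count = round(sample_rate_hz * duration_s)
    codes = []
    for i in range(count):
        t = i / sample_rate_hz                    # sampling instant
        x = max(-1.0, min(1.0, signal(t)))        # clip to the valid range
        codes.append(round(x * max_code))         # quantize to an integer level
    return codes


def tone(t):
    """A 1 kHz sine wave standing in for the analog input (assumed)."""
    return math.sin(2 * math.pi * 1000 * t)


samples = pcm_encode(tone, sample_rate_hz=8000, duration_s=0.001)
# Encoding: pack each 16-bit code as two little-endian bytes.
frame = b"".join(s.to_bytes(2, "little", signed=True) for s in samples)
```

One millisecond of audio at 8 kHz yields eight samples, so `frame` holds sixteen bytes.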

The UART interface is a universal serial data bus for asynchronous communication. This bus is a two-way communication bus. It converts the data to be transferred between serial and parallel communications. In some embodiments, a UART interface is typically used to connect the processor 210 and the communication module 260. For example, the processor 210 communicates with a Bluetooth module through a UART interface to implement a Bluetooth function. In some embodiments, the audio module 270 may transmit audio signals to the communication module 260 through a UART interface, so as to implement a function of playing music through a Bluetooth headset.
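The serial/parallel conversion mentioned above can be sketched for a single byte. The application does not specify a frame format; the example assumes the common 8N1 framing (one start bit, eight data bits sent least significant bit first, no parity, one stop bit), and the function names `uart_frame` and `uart_unframe` are hypothetical.

```python
def uart_frame(byte: int) -> list:
    """Parallel to serial: turn one byte into an assumed 8N1 bit sequence."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    data_bits = [(byte >> i) & 1 for i in range(8)]  # least significant bit first
    return [0] + data_bits + [1]                     # start bit, data, stop bit


def uart_unframe(bits: list) -> int:
    """Serial to parallel: rebuild the byte and check start and stop bits."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))


line_bits = uart_frame(0x41)          # the ASCII letter "A"
restored = uart_unframe(line_bits)
```

The receiver recovers the original byte only if the start and stop bits are where it expects them, which is why asynchronous links need both ends configured with the same framing.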

The MIPI interface can be used to connect the processor 210 with peripheral devices such as the display 294 and the camera 293. The MIPI interface includes a camera serial interface (CSI), a display serial interface (DSI), and the like. In some embodiments, the processor 210 and the camera 293 communicate through a CSI interface to implement a shooting function of the terminal 200. The processor 210 and the display screen 294 communicate through a DSI interface to implement a display function of the terminal 200.

The GPIO interface can be configured by software. The GPIO interface can be configured as a control signal or as a data signal. In some embodiments, the GPIO interface may be used to connect the processor 210 with the camera 293, the display screen 294, the communication module 260, the audio module 270, the sensor module 280, and the like. The GPIO interface can also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, and the like.

The USB interface 230 may be a Mini USB interface, a Micro USB interface, a USB Type C interface, and the like. The USB interface 230 may be used to connect a charger to charge the terminal 200, and may also be used to transfer data between the terminal 200 and a peripheral device. It can also be used to connect headphones and play audio through headphones. It can also be used to connect other electronic devices, such as AR devices.

The interface connection relationship between the modules illustrated in the embodiments of the present application is only a schematic description, and does not constitute a limitation on the structure of the terminal 200. The terminal 200 may adopt different interface connection modes or a combination of multiple interface connection modes in the embodiments of the present application.

The charge management module 240 is configured to receive a charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charge management module 240 may receive the charging input of the wired charger through the USB interface 230. In some wireless charging embodiments, the charge management module 240 may receive a wireless charging input through a wireless charging coil of the terminal 200. While the charge management module 240 is charging the battery 242, the terminal 200 can also be powered through the power management module 241.

The power management module 241 is used to connect the battery 242, the charge management module 240 and the processor 210. The power management module 241 receives the input from the battery 242 and / or the charge management module 240, and supplies power to the processor 210, the internal memory 221, the external memory interface 220, the display screen 294, the camera 293, and the communication module 260. The power management module 241 can also be used to monitor battery capacity, battery cycle times, battery health status (leakage, impedance) and other parameters. In some embodiments, the power management module 241 may also be disposed in the processor 210. In some embodiments, the power management module 241 and the charge management module 240 may also be provided in the same device.

The wireless communication function of the terminal 200 may be implemented through the antenna 1, the antenna 2, the radio frequency module 250, the communication module 260, a modem, and a baseband processor.

The antenna 1 and the antenna 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in the terminal 200 may be used to cover a single or multiple communication frequency bands. Different antennas can also be multiplexed to improve antenna utilization. For example, a cellular network antenna can be multiplexed into a wireless LAN diversity antenna. In some embodiments, the antenna may be used in conjunction with a tuning switch.

The radio frequency module 250 may provide a communication processing module for a wireless communication solution including 2G / 3G / 4G / 5G and the like applied on the terminal 200. The radio frequency module 250 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The radio frequency module 250 receives electromagnetic waves from the antenna 1, and performs filtering, amplification, and other processing on the received electromagnetic waves, and transmits them to the modem for demodulation. The radio frequency module 250 can also amplify the signal modulated by the modem and turn it into electromagnetic wave radiation through the antenna 1. In some embodiments, at least part of the functional modules of the radio frequency module 250 may be disposed in the processor 210. In some embodiments, at least part of the functional modules of the radio frequency module 250 may be provided in the same device as at least part of the modules of the processor 210.

The modem may include a modulator and a demodulator. The modulator is used for modulating the low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then transmits the demodulated low-frequency baseband signal to a baseband processor for processing. The low-frequency baseband signal is processed by the baseband processor and then passed to the application processor. The application processor outputs a sound signal through an audio device (not limited to the speaker 270A, the receiver 270B, etc.), or displays an image or video through the display screen 294. In some embodiments, the modem may be a separate device. In some embodiments, the modem may be independent of the processor 210 and disposed in the same device as the radio frequency module 250 or other functional modules.

The communication module 260 may provide a communication processing module for wireless communication solutions applied to the terminal 200, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR). The communication module 260 may be one or more devices that integrate at least one communication processing module. The communication module 260 receives electromagnetic waves via the antenna 2, frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 210. The communication module 260 may also receive a signal to be transmitted from the processor 210, frequency-modulate and amplify it, and convert it into electromagnetic waves radiated through the antenna 2.

In some embodiments, the antenna 1 of the terminal 200 is coupled to the radio frequency module 250, and the antenna 2 is coupled to the communication module 260, so that the terminal 200 can communicate with the network and other devices through wireless communication technology. The wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite-based augmentation systems (SBAS).

The terminal 200 implements a display function through a GPU, a display screen 294, and an application processor. The GPU is a microprocessor for image processing and is connected to the display 294 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 210 may include one or more GPUs that execute program instructions to generate or change display information.

The display screen 294 is used to display images, videos, and the like. The display screen 294 includes a display panel. The display panel can use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini LED, a Micro LED, a Micro OLED, quantum dot light-emitting diodes (QLED), or the like. In some embodiments, the terminal 200 may include one or N display screens 294, where N is a positive integer greater than 1.

The terminal 200 can implement a shooting function through an ISP, a camera 293, a video codec, a GPU, a display screen, and an application processor.

The ISP is used to process the data fed back by the camera 293. For example, when taking a picture, the shutter is opened, and light is transmitted through the lens to the light receiving element of the camera. The light signal is converted into an electrical signal, and the light receiving element of the camera passes the electrical signal to the ISP, which processes it and converts it into an image visible to the naked eye. The ISP can also optimize the noise, brightness, and skin tone of the image, as well as parameters such as the exposure and color temperature of the shooting scene. In some embodiments, the ISP may be provided in the camera 293.

The camera 293 is used to capture still images or videos. An object generates an optical image through a lens and projects it onto a photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, and then passes the electrical signal to the ISP to convert it into a digital image signal. The ISP outputs digital image signals to the DSP for processing. DSP converts digital image signals into image signals in standard RGB, YUV and other formats. In some embodiments, the terminal 200 may include one or N cameras 293, where N is a positive integer greater than 1.

A digital signal processor is used to process digital signals. In addition to digital image signals, it can also process other digital signals. For example, when the terminal 200 selects a frequency point, the digital signal processor is used to perform a Fourier transform or the like on the energy of the frequency point.
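As one possible reading of the frequency-point example above, the energy at each candidate frequency can be obtained from a discrete Fourier transform and the strongest bin selected. The sketch below uses a direct DFT in plain Python for clarity; the helper `bin_energy`, the 64-sample window, and the synthetic test signal are assumptions made here, and a real DSP would normally use an optimized FFT.

```python
import cmath
import math


def bin_energy(samples, k):
    """Energy |X[k]|^2 of DFT bin k, with X[k] = sum_n x[n] * exp(-2*pi*i*k*n/N)."""
    total = len(samples)
    x_k = sum(
        x * cmath.exp(-2j * cmath.pi * k * n / total)
        for n, x in enumerate(samples)
    )
    return abs(x_k) ** 2


N = 64
# Synthetic input: a cosine that completes exactly 5 cycles in the window.
signal = [math.cos(2 * math.pi * 5 * n / N) for n in range(N)]

energies = [bin_energy(signal, k) for k in range(N // 2)]
strongest = max(range(N // 2), key=energies.__getitem__)
```

For this input the energy concentrates in a single bin, so comparing bin energies is enough to pick out the frequency point of interest.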

Video codecs are used to compress or decompress digital video. The terminal 200 may support one or more video codecs. In this way, the terminal 200 can play or record videos in multiple encoding formats, such as: Moving Picture Experts Group (MPEG) 1, MPEG2, MPEG3, MPEG4, and so on.

The NPU is a neural-network (NN) computing processor. By drawing on the structure of a biological neural network, such as the transfer mode between neurons in the human brain, the NPU can quickly process input information and continuously learn. Through the NPU, applications such as intelligent cognition of the terminal 200 can be implemented, such as: image recognition, face recognition, speech recognition, text understanding, and the like.

The external memory interface 220 can be used to connect an external memory card, such as a Micro SD card, to realize the expansion of the storage capacity of the terminal 200. The external memory card communicates with the processor 210 through the external memory interface 220 to implement a data storage function. For example, save music, videos and other files on an external memory card.

The internal memory 221 may be used to store computer executable program code, where the executable program code includes instructions. The processor 210 executes various functional applications and data processing of the terminal 200 by executing instructions stored in the internal memory 221. The internal memory 221 may include a storage program area and a storage data area. The storage program area may store an operating system, at least one application required by a function (such as a sound playback function, an image playback function, etc.), and the like. The storage data area may store data (such as audio data, a phone book, etc.) created during the use of the terminal 200. In addition, the internal memory 221 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one disk storage device, a flash memory device, other volatile solid-state storage devices, universal flash storage (UFS), and the like.

The terminal 200 can implement audio functions, such as music playback and recording, through the audio module 270, the speaker 270A, the receiver 270B, the microphone 270C, the headset interface 270D, and the application processor.

The audio module 270 is used for converting digital audio information into an analog audio signal and outputting, and also for converting an analog audio input into a digital audio signal. The audio module 270 may also be used to encode and decode audio signals. In some embodiments, the audio module 270 may be disposed in the processor 210, or some functional modules of the audio module 270 may be disposed in the processor 210.

The speaker 270A, also called a "horn", is used to convert audio electrical signals into sound signals. The user can listen to music or to a hands-free call through the speaker 270A of the terminal 200.

The receiver 270B, also known as the "earpiece", is used to convert audio electrical signals into sound signals. When the terminal 200 answers a call or plays a voice message, the user can listen to the voice by holding the receiver 270B close to the ear.

The microphone 270C is used to convert sound signals into electrical signals. When making a call or sending a voice message, the user can speak with the mouth close to the microphone 270C to input a sound signal into the microphone 270C. The terminal 200 may be provided with at least one microphone 270C. In some embodiments, the terminal 200 may be provided with two microphones 270C, which can implement a noise reduction function in addition to collecting sound signals. In some embodiments, the terminal 200 may further be provided with three, four, or more microphones 270C to collect sound signals, reduce noise, identify the source of a sound, and implement a directional recording function.

The headset interface 270D is used to connect a wired headset. The headset interface 270D may be the USB interface 230, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a Cellular Telecommunications Industry Association of the USA (CTIA) standard interface.

The pressure sensor 280A is used to sense a pressure signal, and can convert the pressure signal into an electrical signal. In some embodiments, the pressure sensor 280A may be disposed on the display screen 294. There are many types of pressure sensors 280A, such as resistive pressure sensors, inductive pressure sensors, and capacitive pressure sensors. A capacitive pressure sensor may include at least two parallel plates made of conductive material. When a force is applied to the pressure sensor, the capacitance between the electrodes changes. The terminal 200 determines the intensity of the pressure according to the change in capacitance. When a touch operation is performed on the display screen 294, the terminal 200 detects the intensity of the touch operation by using the pressure sensor 280A. The terminal 200 may also calculate the touched position based on the detection signal of the pressure sensor 280A. In some embodiments, touch operations that act on the same touch position but have different intensities may correspond to different operation instructions. For example, when a touch operation with an intensity lower than the first pressure threshold is applied to the short message application icon, an instruction for viewing the short message is executed. When a touch operation with an intensity greater than or equal to the first pressure threshold is applied to the short message application icon, an instruction for creating a short message is executed.
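The short message example above amounts to a threshold comparison on the touch intensity. A minimal sketch follows; the function name, the instruction strings, and the normalized threshold value 0.5 are hypothetical, since the application does not give a value for the first pressure threshold.

```python
# Assumed value on a normalized 0.0..1.0 intensity scale.
FIRST_PRESSURE_THRESHOLD = 0.5


def sms_icon_instruction(intensity: float,
                         threshold: float = FIRST_PRESSURE_THRESHOLD) -> str:
    """Map the intensity of a touch on the short message icon to an instruction."""
    if intensity < threshold:
        return "view_short_message"    # lighter press: view the short message
    return "create_short_message"      # press at or above the threshold: create one


light_press = sms_icon_instruction(0.2)
firm_press = sms_icon_instruction(0.8)
```

Note that an intensity exactly equal to the threshold falls into the "create" branch, matching the "greater than or equal to" wording of the example.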

The gyro sensor 280B may be used to determine a motion posture of the terminal 200. In some embodiments, the angular velocity of the terminal 200 around three axes (i.e., the x, y, and z axes) may be determined by the gyro sensor 280B. The gyro sensor 280B can be used for image stabilization. Exemplarily, when the shutter is pressed, the gyro sensor 280B detects the angle of the shake of the terminal 200, and calculates the distance to be compensated by the lens module according to the angle, so that the lens can cancel the shake of the terminal 200 through reverse movement to achieve image stabilization. The gyro sensor 280B can also be used for navigation and somatosensory gaming scenes.

The air pressure sensor 280C is used to measure air pressure. In some embodiments, the terminal 200 calculates the altitude through the air pressure value measured by the air pressure sensor 280C, and assists in positioning and navigation.
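The application does not say how altitude is derived from the measured pressure. One widely used approximation is the international barometric formula, shown here purely as an assumed example; the function name `altitude_m` and the default sea-level pressure of 1013.25 hPa are choices made for this sketch.

```python
def altitude_m(pressure_hpa: float, sea_level_hpa: float = 1013.25) -> float:
    """Approximate altitude in meters from air pressure (international barometric formula).

    h = 44330 * (1 - (p / p0) ** (1 / 5.255))
    """
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))


at_sea_level = altitude_m(1013.25)
on_a_hill = altitude_m(898.75)   # roughly the standard-atmosphere pressure near 1000 m
```

Because actual sea-level pressure varies with the weather, a result like this is best treated as an aid to positioning rather than an exact height, which is consistent with the "assists in positioning and navigation" wording above.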

The magnetic sensor 280D includes a Hall sensor. The terminal 200 may use the magnetic sensor 280D to detect the opening and closing of a flip holster. In some embodiments, when the terminal 200 is a flip phone, the terminal 200 may detect the opening and closing of the flip cover by using the magnetic sensor 280D. Further, features such as automatic unlocking upon opening the flip cover may be set according to the detected open or closed state of the holster or of the flip cover.

The acceleration sensor 280E can detect the magnitude of the acceleration of the terminal 200 in various directions (generally three axes). The magnitude and direction of gravity can be detected when the terminal 200 is stationary. It can also be used to identify the posture of the terminal, and is used in applications such as switching between horizontal and vertical screens, and pedometers.

The distance sensor 280F is used to measure distance. The terminal 200 can measure distance by infrared or laser. In some embodiments, when shooting a scene, the terminal 200 may use the distance sensor 280F to measure distance to achieve fast focusing.

The proximity light sensor 280G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode. The light emitting diode may be an infrared light emitting diode. The terminal 200 emits infrared light outward through the light emitting diode and uses the photodiode to detect infrared light reflected from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the terminal 200. When insufficient reflected light is detected, it may be determined that there is no object near the terminal 200. The terminal 200 can use the proximity light sensor 280G to detect that the user is holding the terminal 200 close to the ear to talk, so as to automatically turn off the screen to save power. The proximity light sensor 280G can also be used to automatically unlock and lock the screen in holster mode and pocket mode.
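The "sufficient reflected light" decision and the resulting screen behavior during a call can be sketched as follows. The threshold `REFLECTION_THRESHOLD`, its units, and both function names are hypothetical; the application does not define what counts as sufficient.

```python
# Assumed photodiode reading (arbitrary units) above which an object is considered near.
REFLECTION_THRESHOLD = 100


def object_nearby(reflected_light: int, threshold: int = REFLECTION_THRESHOLD) -> bool:
    """Return True when enough reflected infrared light is detected."""
    return reflected_light >= threshold


def screen_should_be_on(in_call: bool, reflected_light: int) -> bool:
    """Turn the screen off only while a call is active and the phone is at the ear."""
    return not (in_call and object_nearby(reflected_light))


at_ear_during_call = screen_should_be_on(in_call=True, reflected_light=250)
on_table_during_call = screen_should_be_on(in_call=True, reflected_light=10)
```

Tying the decision to the call state keeps the screen on when an object merely passes over the sensor outside a call.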

The ambient light sensor 280L is used to sense ambient light brightness. The terminal 200 can adaptively adjust the brightness of the display screen according to the perceived ambient light brightness. The ambient light sensor 280L can also be used to automatically adjust white balance when taking pictures. The ambient light sensor 280L can also cooperate with the proximity light sensor 280G to detect whether the terminal 200 is in a pocket to prevent accidental touch.

The fingerprint sensor 280H is used to collect fingerprints. The terminal 200 may use the collected fingerprint characteristics to realize fingerprint unlocking, access application lock, fingerprint photographing, fingerprint answering an incoming call, and the like.

The temperature sensor 280J is used to detect temperature. In some embodiments, the terminal 200 executes a temperature processing strategy using the temperature detected by the temperature sensor 280J. For example, when the temperature reported by the temperature sensor 280J exceeds a threshold, the terminal 200 reduces the performance of a processor located near the temperature sensor 280J, so as to reduce power consumption and implement thermal protection.

The touch sensor 280K is also called a "touch panel". It can be disposed on the display screen 294 and is used to detect touch operations on or near it. The detected touch operation may be passed to the application processor to determine the type of touch event, and a corresponding visual output is provided through the display screen 294.

The bone conduction sensor 280M can acquire vibration signals. In some embodiments, the bone conduction sensor 280M can acquire the vibration signal of the bone mass that vibrates when a person speaks. The bone conduction sensor 280M can also contact the human pulse and receive a blood pressure pulse signal. In some embodiments, the bone conduction sensor 280M may also be provided in a headset. The audio module 270 may parse out a voice signal based on the vibration signal of the vibrating bone mass obtained by the bone conduction sensor 280M to implement a voice function. The application processor may parse heart rate information based on the blood pressure pulse signal acquired by the bone conduction sensor 280M to implement a heart rate detection function.

The keys 290 include a power key, a volume key, and the like. The keys 290 may be mechanical keys or touch keys. The terminal 200 receives input from the keys 290 and generates key signal input related to user settings and function control of the terminal 200.

The motor 291 can generate a vibration alert. The motor 291 can be used for vibration alerts for incoming calls, and can also be used for touch vibration feedback. For example, touch operations applied to different applications (such as taking pictures, playing audio, etc.) can correspond to different vibration feedback effects. Touch operations on different areas of the display screen 294 can also correspond to different vibration feedback effects. Different application scenarios (such as time reminders, receiving information, alarm clocks, games, etc.) can also correspond to different vibration feedback effects. The touch vibration feedback effects can also be customized.

The indicator 292 can be an indicator light, which can be used to indicate the charging status and changes in battery level, and can also be used to indicate messages, missed calls, notifications, and so on.

The SIM card interface 295 is used to connect a SIM card. The SIM card can be inserted into or removed from the SIM card interface 295 to achieve contact with and separation from the terminal 200. The terminal 200 may support one or N SIM card interfaces, where N is a positive integer greater than 1. The SIM card interface 295 can support Nano SIM cards, Micro SIM cards, SIM cards, and the like. Multiple cards can be inserted into the same SIM card interface 295 at the same time. The types of the multiple cards may be the same or different. The SIM card interface 295 can also be compatible with different types of SIM cards and with external memory cards. The terminal 200 interacts with the network through the SIM card to implement functions such as calling and data communication. In some embodiments, the terminal 200 employs an eSIM, that is, an embedded SIM card. The eSIM card can be embedded in the terminal 200 and cannot be separated from the terminal 200.

Please refer to FIG. 3, which is a schematic diagram illustrating an example of a communication scenario to which the method for recording and displaying information in a communication process according to an embodiment of the present application is applied. In the embodiments of the present application, the terminal 200 shown in FIG. 3 is the first terminal and the terminal 300 is the second terminal, and the method for recording and displaying information in a communication process is illustrated by taking voice communication between the terminal 200 and the terminal 300 as an example. As shown in FIG. 3, during the voice communication between the terminal 200 and the terminal 300, the user 210 uses the terminal 200 to make a voice call with the user 310, who uses the terminal 300.

An embodiment of the present application provides a method for recording information during communication. As shown in FIG. 4, the method for recording information during communication may include S401-S404:

S401. The terminal 200 performs voice communication with the terminal 300.

S402. The terminal 200 identifies voice data captured by the microphone 270C of the terminal 200.

During the voice communication between the terminal 200 and the terminal 300, the microphone 270C of the terminal 200 can capture voice data around the terminal 200. The voice data around the terminal 200 may include voice data uttered by the user 210 and ambient noise around the terminal 200. The terminal 200 may convert the voice data captured by the microphone 270C into text, and then determine whether the converted text matches a preset start word. If they match, the terminal 200 can start recording the voice data corresponding to the voice communication.

Exemplarily, the above preset start words may be "Let me record", "Let me record", "Start recording", "Start recording now", and the like.

The preset start word in the embodiment of the present application may be a default start word configured in the terminal 200 when the terminal 200 is shipped from the factory. Alternatively, the preset start word may be a custom start word set by the user in the terminal 200.

In this embodiment of the present application, the terminal 200 (i.e., the mobile phone 200) shown in FIG. 5 is used as an example to describe the process in which the terminal receives the start word set by the user:

The mobile phone 200 may receive a user's click operation (such as a single click) on the "Settings" application icon on the desktop of the mobile phone 200. In response to the user's click operation on the "Settings" application icon, the mobile phone 200 may display a mobile phone settings interface. The mobile phone settings interface can include an "Airplane mode" option, a "WLAN" option, a "Bluetooth" option, a "Mobile network" option, a "System application" option, and the like. For specific functions of the "Airplane mode" option, the "WLAN" option, the "Bluetooth" option, and the "Mobile network" option, reference may be made to descriptions in conventional technologies, which are not repeated in the embodiment of the present application. The mobile phone 200 may display a system application setting interface in response to a user's click operation on the "System application" option. The system application setting interface includes a "Phone" option, a "Contact" option, and a "Text message" option. The mobile phone 200 may display the phone setting interface 501 shown in (a) of FIG. 5 in response to a user's click operation on the "Phone" option in the system application setting interface. Optionally, the mobile phone settings interface itself may include a "Phone" option. In that case, the mobile phone 200 may display the phone setting interface 501 shown in (a) of FIG. 5 in response to a user's click operation on the "Phone" option in the mobile phone settings interface.

As shown in (a) of FIG. 5, the phone setting interface 501 includes a "Call forwarding" option, a "Call waiting" option, an "Incoming call blocking" option, and a "Call recording" option 502. For the specific functions of the "Call forwarding" option, the "Call waiting" option, and the "Incoming call blocking" option, reference may be made to descriptions in conventional technologies, which are not repeated in the embodiment of the present application. In response to a click operation on the "Call recording" option 502, the mobile phone 200 may display the call recording interface 503 shown in (b) of FIG. 5. The call recording interface 503 includes a "Recording notification" option, an "Automatic recording on call" option, and a "Voice control recording" option 504. After the "Recording notification" option is turned on, the mobile phone 200 can show a prompt in the notification bar after recording. After the "Automatic recording on call" option is turned on, the mobile phone 200 can automatically start recording when a call is connected. After the "Voice control recording" option 504 is turned on, the mobile phone 200 can automatically start recording when, during a call, it receives the start word (such as "start recording" or "memory") uttered by the user; and the mobile phone 200 can automatically end the recording when it receives the end word (such as "end recording") uttered by the user, when the call ends, or when the mobile phone 200 has recorded for a preset time (such as 1 minute, 2 minutes, or 5 minutes). In response to the user's click operation on the "Voice control recording" option 504, the mobile phone 200 may display the voice control recording interface 505 shown in (c) of FIG. 5. The voice control recording interface 505 includes a voice control recording switch 506, a "start word" option 507, and an "end word" option 510. The "start word" option 507 includes the start word 508 currently configured in the mobile phone 200 and a "custom start word" option 509. The "end word" option 510 includes the end word 511 currently configured in the mobile phone 200 and a "custom end word" option 512. Before the user has set a start word and an end word in the mobile phone 200, the "start word" option 507 indicates to the user the default start word "start recording", and the "end word" option 510 indicates to the user the default end word "end recording".

In response to the user's click operation on the "custom start word" option 509, the mobile phone 200 may display the start word customization interface 601 shown in FIG. 6. The start word customization interface 601 may include a "Cancel" button 604, an "OK" button 605, a "start word input box" 602, and a start word suggestion 603. The "Cancel" button 604 is used to trigger the mobile phone to cancel the setting of the custom start word and display the voice control recording interface 505 shown in (c) of FIG. 5. The "start word input box" 602 is used to receive a custom start word input by the user. The "OK" button 605 is used to save the custom start word input by the user in the "start word input box" 602. The start word suggestion 603 is used to inform the user of the mobile phone's requirements for a custom start word. Assume that the user enters the custom start word "Let me remember" in the "start word input box" 602 shown in FIG. 6. In response to the user's click operation (such as a single click) on the "OK" button 605 shown in FIG.

S403. The terminal 200 determines whether the text corresponding to the first voice data captured by the microphone 270C matches a preset start word.

Specifically, if the text corresponding to the first voice data captured by the microphone 270C matches the preset start word, the terminal 200 performs S404; if the text corresponding to the first voice data captured by the microphone 270C does not match the preset start word, the terminal 200 continues to perform S402.

S404. The terminal 200 records voice data of the terminal 300.

In the embodiment of the present application, the text corresponding to the first voice data matching the preset start word specifically means that the text corresponding to the first voice data includes the preset start word.

Exemplarily, in combination with the above example, the preset start word of the terminal 200 is "let me remember". As shown in FIG. 3, during the voice communication between the terminal 200 and the terminal 300, the user 210 utters voice data 1, "Can you provide your address and phone number?" The microphone 270C of the terminal 200 can capture the voice data 1. The terminal 200 can recognize the voice data 1, obtain its text "Can you provide your address and phone number?", and determine that this text does not match the preset start word "let me remember". Later in the voice communication, the user 210 utters voice data 2, "Let me remember" (that is, the first voice data). The microphone 270C of the terminal 200 can capture the voice data 2. The terminal 200 can recognize the voice data 2, obtain its text "Let me remember", and determine that this text matches the preset start word "let me remember". Upon determining the match, the terminal 200 can start recording the voice data corresponding to the voice communication. As shown in FIG. 3, the terminal 200 may start recording the voice data corresponding to the voice communication at time t1.
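The matching rule in this example (the recognized text matches when it includes the preset start word) can be illustrated with a minimal sketch. The function name `matches_preset_word` is hypothetical, and ignoring letter case is an assumption made here for illustration; the embodiment does not specify how the comparison is implemented.

```python
from typing import Sequence


def matches_preset_word(recognized_text: str, preset_words: Sequence[str]) -> bool:
    """Return True if the recognized text includes any of the preset words."""
    # Assumption: the comparison ignores letter case; the source does not say so.
    normalized = recognized_text.casefold()
    return any(word.casefold() in normalized for word in preset_words)


start_words = ["let me remember"]
for utterance in ("Can you provide your address and phone number?",
                  "Let me remember"):
    # Voice data 1 does not contain the start word; voice data 2 does.
    print(utterance, "->", matches_preset_word(utterance, start_words))
```

The same containment check could be reused for a preset ending word, since the embodiment defines that match in the same way.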

In the embodiment of the present application, the voice data of the terminal 300 is converted from audio electrical signals received from the terminal 300.

Optionally, the terminal 200 may also record voice data captured by the microphone 270C of the terminal 200. If the text corresponding to the first voice data captured by the microphone 270C matches the preset start word, the terminal 200 may perform not only S404 but also S404':

S404'. The terminal 200 records voice data captured by the microphone 270C of the terminal 200.

That is, the voice data recorded by the terminal 200 may include: voice data captured by the microphone 270C of the terminal 200, and voice data converted from audio electrical signals received from the terminal 300.

The microphone 270C of the terminal 200 can receive voice data (i.e., a first sound signal) uttered by the user 210. The terminal 200 may convert the first sound signal into a first audio electrical signal. Then, the terminal 200 sends the first audio electrical signal to the terminal 300. The terminal 300 may convert the first audio electrical signal from the terminal 200 into the first sound signal, and play the first sound signal through a receiver (also referred to as a "handset") or through a speaker (also referred to as a "horn") of the terminal 300. The "voice data captured by the microphone 270C" is specifically the first sound signal captured by the microphone 270C.

Similarly, a microphone of the terminal 300 can receive voice data (i.e., a second sound signal) uttered by the user 310. The terminal 300 may convert the second sound signal into a second audio electrical signal. Then, the terminal 300 sends the second audio electrical signal to the terminal 200. The terminal 200 may convert the second audio electrical signal from the terminal 300 into the second sound signal, and play the second sound signal through the receiver 270B (also referred to as a "handset") or through the speaker 270A (also referred to as a "horn"). The "voice data converted from the audio electrical signals received from the terminal 300" is specifically the second sound signal obtained by the terminal 200 converting the second audio electrical signal received from the terminal 300.

Optionally, as shown in FIG. 4, after S404, the method in the embodiment of the present application may further include S405-S406:

S405. During the recording of the voice data, the terminal 200 determines whether the text corresponding to the second voice data captured by the microphone 270C matches a preset ending word.

Specifically, if the text corresponding to the second voice data captured by the microphone 270C matches the preset ending word, the terminal 200 performs S406; if the text corresponding to the second voice data captured by the microphone 270C does not match the preset ending word, the terminal 200 continues to record voice data.

S406. The terminal 200 stops recording voice data.

Exemplarily, the preset ending word in the embodiment of the present application may be a default ending word configured in the terminal 200 when the terminal 200 is shipped from the factory. For example, the preset ending word in the mobile phone 200 may be the default ending word "end recording" shown in (c) of FIG. 5. Alternatively, the preset ending word may be a custom ending word set by the user in the terminal 200. For example, the mobile phone 200 may display an end word customization interface in response to a user's click operation on the "custom end word" option 512 shown in (c) of FIG. 5. The end word customization interface is used to set the ending word. For specific content of the end word customization interface, refer to the start word customization interface 601 shown in FIG. 6. Assume that the user sets the custom ending word "I remember well" for the mobile phone 200 (i.e., the terminal 200) in the end word customization interface.

For example, the preset ending words in the embodiment of the present application may be "end recording", "I remembered", "end recording", "remembered", "recording completed", and the like. In the embodiment of the present application, the method is described by taking "I remember well" as the preset ending word of the terminal 200.

As shown in FIG. 3, during the voice communication between the terminal 200 and the terminal 300, after the terminal 200 starts recording voice data (that is, from time t1), the user 310 utters voice data 3, "No. 1, Kefa Road, Nanshan District, Shenzhen". The microphone of the terminal 300 can capture the voice data 3 and convert it (i.e., a sound signal) into an audio electrical signal 1. The terminal 300 sends the audio electrical signal 1 to the terminal 200. The terminal 200 converts the audio electrical signal 1 from the terminal 300 into a sound signal (i.e., the voice data 3) and stores the voice data 3. The microphone 270C of the terminal 200 can capture voice data 4, "um", and the terminal 200 records the voice data 4. The terminal 200 then receives an audio electrical signal 2 sent by the terminal 300, converts the audio electrical signal 2 into a sound signal, that is, voice data 5, "the telephone number is 88767655", and stores the voice data 5.

In the embodiment of the present application, the text corresponding to the second voice data matching the preset ending word specifically means that the text corresponding to the second voice data includes the preset ending word.

The microphone 270C of the terminal 200 can capture voice data 6, "OK, I remember well". The terminal 200 may recognize the voice data 6, obtain its text "OK, I remember well", and determine that this text includes the preset ending word "I remember well". That is, the terminal 200 may determine that the text of the voice data 6 matches the preset ending word "I remember well". Upon determining the match, the terminal 200 can end recording the voice data corresponding to the voice communication. As shown in FIG. 3, the terminal 200 may end (that is, stop) recording the voice data corresponding to the voice communication at time t2.

In the embodiment of the present application, during the voice communication between the terminal 200 and the terminal 300, when the text corresponding to the first voice data captured by the microphone 270C of the terminal 200 matches the preset start word, the terminal 200 may automatically start recording the voice data corresponding to the voice communication. In other words, after receiving the voice command issued by the user 210 (i.e., the first voice data whose text matches the preset start word), the terminal 200 can automatically start recording the voice data corresponding to the voice communication.

In addition, while the terminal 200 is recording the voice data, when the text corresponding to the second voice data captured by the microphone 270C of the terminal 200 matches the preset ending word, the terminal 200 may automatically stop recording the voice data. In other words, the terminal 200 can automatically stop recording the voice data after receiving the voice command issued by the user 210 (i.e., the second voice data whose text matches the preset ending word).

In summary, the terminal 200 can automatically record and stop recording voice data in response to a voice command issued by a user during a voice call. This solution makes the terminal more intelligent, improves the interaction performance between the terminal and the user, and improves the user experience.

S405-S406 in the embodiments of the present application are optional. The terminal 200 may instead stop recording voice data when the voice communication ends. Alternatively, the terminal 200 may automatically stop recording the voice data after recording for a preset time (such as 1 minute, 2 minutes, or 5 minutes).
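The start and stop conditions described for S402-S406, together with the two alternative stop conditions (end of the call and a preset recording time), can be sketched as a small state machine. This is only an illustrative sketch: the class `CallRecorder`, its method names, and the 2-minute default are hypothetical, and speech recognition is assumed to have already produced the text passed in.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CallRecorder:
    start_word: str = "let me remember"
    end_word: str = "I remember well"
    max_seconds: float = 120.0           # hypothetical preset time (2 minutes)
    recording: bool = False
    started_at: Optional[float] = None
    log: List[str] = field(default_factory=list)

    def on_local_text(self, text: str, now: float) -> None:
        """Handle text recognized from the local microphone (S402, S403, S405)."""
        lowered = text.casefold()
        if not self.recording and self.start_word.casefold() in lowered:
            self.recording, self.started_at = True, now      # S404
            self.log.append(f"{now:.0f}s start")
        elif self.recording and self.end_word.casefold() in lowered:
            self._stop(now, "end word")                       # S406

    def on_tick(self, now: float) -> None:
        """Stop once the preset recording time has elapsed."""
        if self.recording and now - self.started_at >= self.max_seconds:
            self._stop(now, "preset time")

    def on_call_ended(self, now: float) -> None:
        """Stop when the voice communication ends."""
        if self.recording:
            self._stop(now, "call ended")

    def _stop(self, now: float, reason: str) -> None:
        self.recording, self.started_at = False, None
        self.log.append(f"{now:.0f}s stop ({reason})")


recorder = CallRecorder()
recorder.on_local_text("Can you provide your address and phone number?", now=5)
recorder.on_local_text("Let me remember", now=10)
recorder.on_local_text("OK, I remember well", now=40)
print(recorder.log)
```

Keeping the three stop conditions in separate handlers mirrors the text, where the ending word, the end of the call, and the preset time are presented as alternatives.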

Optionally, when the terminal 200 starts recording the voice data corresponding to the voice communication, the terminal 200 may also prompt the user that recording has started. Specifically, after S403, if the text corresponding to the first voice data captured by the microphone 270C matches the preset start word, the method in this embodiment of the present application may further include S701:

S701. The terminal 200 issues first prompt information. The first prompt information is used to prompt the user that the terminal 200 has started recording voice data.

Exemplarily, the first prompt information in the embodiment of the present application may be a prompt tone or a vibration prompt. For example, the prompt tone may be a single-syllable prompt tone, such as "ding", "beep", or "whoosh". Alternatively, the prompt tone may be a ring tone N seconds in length, where N is, for example, 2, 3, or 5. Alternatively, the prompt tone may be a spoken prompt, such as "Start recording".

In the embodiment of the present application, the terminal 200 may issue the first prompt information when the text corresponding to the first voice data matches the preset start word, that is, when the terminal 200 starts recording voice data, to prompt the user that the terminal 200 has started recording. In this way, the user can learn from the first prompt information that the terminal 200 has started recording voice data, which increases the direct interaction between the terminal 200 and the user, improves the interaction performance between the terminal 200 and the user, and improves the user experience.

Optionally, when the terminal 200 stops recording the voice data, the terminal 200 may also prompt the user that recording has stopped. Specifically, after S405, if the text corresponding to the second voice data captured by the microphone 270C matches the preset ending word, the method in this embodiment of the present application may further include S702:

S702. The terminal 200 issues second prompt information. The second prompt information is used to prompt the user that the terminal 200 has stopped recording voice data.

Exemplarily, the second prompt information in the embodiment of the present application may be a prompt tone or a vibration prompt. For example, the prompt tone may be a single-syllable prompt tone, such as "ding", "beep", or "whoosh". Alternatively, the prompt tone may be a ring tone N seconds in length, where N is, for example, 2, 3, or 5. Alternatively, the prompt tone may be a spoken prompt, such as "End recording".

The second prompt information and the first prompt information in the embodiment of the present application may be the same or different. When the second prompt information is different from the first prompt information, the following three cases are possible:

Case (1): The first prompt information is a prompt tone, and the second prompt information is a vibration prompt. Case (2): The first prompt information and the second prompt information are both prompt tones, but the prompt tone corresponding to the first prompt information differs from the prompt tone corresponding to the second prompt information. For example, the prompt tone corresponding to the first prompt information is "ding" and the prompt tone corresponding to the second prompt information is "beep"; or, the prompt tone corresponding to the first prompt information is "Start recording" and the prompt tone corresponding to the second prompt information is "End recording". Case (3): The first prompt information and the second prompt information are both vibration prompts, but the two vibration prompts have different vibration modes. For example, the vibration mode of the vibration prompt corresponding to the first prompt information is a single vibration, and the vibration mode of the vibration prompt corresponding to the second prompt information is two consecutive vibrations.

In the embodiment of the present application, the terminal 200 may issue the second prompt information when the text corresponding to the second voice data matches the preset ending word, that is, when the terminal 200 stops recording voice data, to prompt the user that the terminal 200 has stopped recording. In this way, the user can learn from the second prompt information that the terminal 200 has stopped recording voice data, which increases the direct interaction between the terminal 200 and the user, improves the interaction performance between the terminal 200 and the user, and improves the user experience.

It can be understood that, when the terminal 200 uses the speaker 270A to play voice data and the first prompt information or the second prompt information is a prompt tone, the prompt tone emitted by the terminal 200 may be captured by the microphone 270C of the terminal 200. In this case, the terminal 200 sends the prompt tone captured by the microphone 270C to the terminal 300. As a result, the user 310 can also hear the prompt tone, which affects the user's call experience.

In the embodiment of the present application, in order to prevent the peer user of the voice communication (i.e., the user 310) from hearing the prompt tone from the terminal 200, when the text corresponding to the first voice data captured by the microphone 270C matches the preset start word, the terminal 200 may determine whether it is playing voice data through the speaker 270A or through the receiver 270B. When the terminal 200 uses the speaker 270A to play voice data, the terminal 200 may perform echo suppression on the voice data collected by the microphone 270C according to the voice data played by the speaker 270A. In this way, the voice data after echo suppression does not include the above prompt tone. Because the voice data sent by the terminal 200 to the terminal 300 is the voice data after echo suppression, it does not include the prompt tone, and the user 310 will not hear the prompt tone.

Further, the terminal 200 may stop performing echo suppression on the voice data collected by the microphone 270C after playing the second prompt information (that is, the prompt sound). In this way, the power consumption caused by the terminal 200 continuously performing echo suppression can be avoided, and the battery life of the terminal 200 can be extended.
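The decision of when echo suppression is active can be sketched as a simple gate. This is a hedged illustration of the control logic only, not of echo suppression itself: the names `AudioOutput` and `echo_suppression_enabled` are hypothetical, and returning False for the receiver reflects the fact that the embodiment only describes suppression for the speaker case.

```python
from enum import Enum


class AudioOutput(Enum):
    SPEAKER = "speaker 270A"
    RECEIVER = "receiver 270B"


def echo_suppression_enabled(output: AudioOutput,
                             start_word_matched: bool,
                             second_prompt_played: bool) -> bool:
    """Decide whether microphone audio should pass through echo suppression."""
    if output is not AudioOutput.SPEAKER:
        # Assumption: suppression is only applied when the speaker is in use,
        # which is the only case the embodiment describes.
        return False
    # Active from the start-word match until the second prompt has been played.
    return start_word_matched and not second_prompt_played


print(echo_suppression_enabled(AudioOutput.SPEAKER, True, False))
print(echo_suppression_enabled(AudioOutput.SPEAKER, True, True))
print(echo_suppression_enabled(AudioOutput.RECEIVER, True, False))
```

Limiting suppression to this window is what allows the terminal to avoid the power consumption of running echo suppression continuously.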

The voice data recorded by the terminal 200 may include one piece of voice data or at least two pieces of voice data. For example, in the example corresponding to FIG. 3, the terminal 200 can record three pieces of voice data: "No. 1, Kefa Road, Nanshan District, Shenzhen", "um", and "the telephone number is 88767655".

In the embodiment of the present application, the terminal 200 may save the recorded voice data according to the chronological order in which each piece of voice data was recorded and the source information of each piece of voice data. The source information indicates whether the voice data was captured by the microphone 270C or converted from audio electrical signals received from the terminal 300.

For example, the terminal 200 records the above three pieces of voice data in the following chronological order: voice data 3 "No. 1, Kefa Road, Nanshan District, Shenzhen", voice data 4 "um", and voice data 5 "the telephone number is 88767655". The voice data 3 and the voice data 5 are voice data converted from the audio electrical signals received from the terminal 300. The voice data 4 "um" is voice data captured by the microphone 270C of the terminal 200.

Exemplarily, the terminal 200 may save the voice data recorded during the voice communication in the form of a table. Assume that the call time of the voice communication between the terminal 200 and the terminal 300 shown in FIG. 3 is 08:06 on August 8, 2018, and the call duration is 21 minutes. Table 1 shows an example of a voice data table in this embodiment of the present application:

Table 1

As shown in Table 1, the voice data table stores the voice data recorded during the call between the terminal 200 and the terminal 300 (whose phone number is 138 **** 5678). Table 1 lists, in the chronological order of recording and together with the source of each piece, the voice data recorded by the terminal 200 during the voice call with the phone number 138 **** 5678 at 08:06 on August 8, 2018: voice data 3 "No. 1, Kefa Road, Nanshan District, Shenzhen", voice data 4 "um", and voice data 5 "the telephone number is 88767655". In addition, Table 1 also indicates the source of each piece of voice data. For example, the source of the voice data 3 and the voice data 5 is the other party. That is, the voice data 3 and the voice data 5 are voice data converted from audio electrical signals received from the terminal 300. The source of the voice data 4 is the local machine. That is, the voice data 4 is voice data captured by the microphone 270C of the terminal 200.
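One possible in-memory representation of the voice data table described above is sketched here. The type names `Source`, `VoiceRecord`, and `VoiceDataTable` and the field layout are assumptions for illustration; the embodiment only states that each piece of voice data is saved in recording order together with its source.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Source(Enum):
    LOCAL = "local"              # captured by the microphone 270C
    OTHER_PARTY = "other party"  # converted from signals received from terminal 300


@dataclass(frozen=True)
class VoiceRecord:
    label: str
    source: Source


@dataclass
class VoiceDataTable:
    phone_number: str
    call_time: str
    records: List[VoiceRecord] = field(default_factory=list)  # recording order

    def add(self, label: str, source: Source) -> None:
        self.records.append(VoiceRecord(label, source))


table = VoiceDataTable(phone_number="138****5678", call_time="2018-08-08 08:06")
table.add("voice data 3", Source.OTHER_PARTY)
table.add("voice data 4", Source.LOCAL)
table.add("voice data 5", Source.OTHER_PARTY)

for index, record in enumerate(table.records, start=1):
    print(index, record.label, record.source.value)
```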

In one implementation, the terminal 200 may segment the recorded voice data according to the source of the voice data. Specifically, in the example shown in FIG. 3, after the terminal 200 records the voice data 3 (voice data converted from the audio electrical signals received from the terminal 300), the microphone 270C of the terminal 200 captures the voice data 4. Because the source of the voice data 3 is different from that of the voice data 4, the terminal 200 can save the voice data 3 and the voice data 4 as separate segments. After the voice data 4 is captured by the microphone 270C of the terminal 200, the terminal 200 receives an audio electrical signal (the audio electrical signal corresponding to the voice data 5) from the terminal 300. Because the sources of the voice data 4 and the voice data 5 are different, the terminal 200 can save the voice data 4 and the voice data 5 as separate segments.

In some application scenarios, during the voice communication between the terminal 200 and the terminal 300, after the user 210 utters a piece of voice data (such as voice data a), the user 210 and the user 310 are both silent for a period of time, and then the user 210 utters another piece of voice data (such as voice data b). Alternatively, after the user 310 utters a piece of voice data, the user 210 and the user 310 are both silent for a period of time, and then the user 310 utters another piece of voice data. In such scenarios, when segmenting the recorded voice data, the terminal 200 may refer not only to the source of the voice data but also to the time interval between pieces of voice data. For example, during the voice communication between the terminal 200 and the terminal 300, the user 210 utters the voice data a, and the microphone 270C of the terminal 200 captures the voice data a. If, within a certain period of time (such as 1 minute, 2 minutes, or 5 minutes) after the voice data a is captured, the microphone 270C captures no new voice data and the terminal 200 receives no audio electrical signal from the terminal 300, and the microphone 270C captures the voice data b only after that period, the terminal 200 may save the voice data a and the voice data b as separate segments. After the microphone 270C of the terminal 200 captures the voice data b, the terminal 200 receives an audio electrical signal (the audio electrical signal corresponding to voice data c) from the terminal 300. Because the sources of the voice data b and the voice data c are different, the terminal 200 may save the voice data b and the voice data c as separate segments.
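The two segmentation criteria described above (a change of source, or a silence longer than a certain period) can be sketched as follows. The `Chunk` type, the `segment` function, the timestamps, and the 60-second threshold are all hypothetical choices for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    label: str
    source: str    # "local" or "other party"
    start: float   # seconds since recording began
    end: float


def segment(chunks: List[Chunk], max_gap: float = 60.0) -> List[List[Chunk]]:
    """Group chunks; open a new segment on a source change or a long silence."""
    segments: List[List[Chunk]] = []
    for chunk in chunks:
        if segments:
            last = segments[-1][-1]
            same_source = last.source == chunk.source
            short_gap = chunk.start - last.end <= max_gap
            if same_source and short_gap:
                segments[-1].append(chunk)
                continue
        segments.append([chunk])
    return segments


chunks = [
    Chunk("voice data a", "local", 0.0, 5.0),
    Chunk("voice data b", "local", 70.0, 75.0),        # 65 s of silence after a
    Chunk("voice data c", "other party", 76.0, 80.0),  # source changes after b
]
for number, group in enumerate(segment(chunks), start=1):
    print(number, [c.label for c in group])
```

With these inputs, a and b are separated by the silence rule and b and c by the source rule, matching the scenario in the text.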

Exemplarily, the terminal 200 may save the voice data a, voice data b, and voice data c recorded in the foregoing voice communication in the form of a table. Table 2 shows an example of a voice data table in this embodiment of the present application:

Table 2

Table 2 lists, in the chronological order of recording and together with the source of each piece, the voice data a, voice data b, and voice data c recorded by the terminal 200 during the voice call with the phone number 138 **** 5678 at 08:06 on August 8, 2018. In addition, Table 2 also indicates the source of each piece of voice data. For example, the source of the voice data a is the local machine, the source of the voice data b is the local machine, and the source of the voice data c is the other party. That is, the voice data a and the voice data b are voice data captured by the microphone 270C of the terminal 200, and the voice data c is voice data converted from audio electrical signals received from the terminal 300.

The above voice data may be stored together with the call record information of the corresponding voice communication. That is, the terminal 200 may save the voice data in the storage area used for storing call record information. Alternatively, the voice data may be stored in a separate storage area, which is not described in detail in this embodiment of the present application. If the terminal 200 stores the voice data together with the call record information, then, because the terminal 200 records voice data during some voice communications and not during others, the information corresponding to some call records includes voice data recorded by the terminal 200, while the information corresponding to other call records does not. For example, Table 3 shows an example of a call record information table in this embodiment of the present application:

Table 3

In the call record information table shown in Table 3, the call record information of the voice call with the phone number 138 **** 5678 at 08:06 on August 8, 2018 includes the voice data 3 to voice data 5 recorded during that call. The terminal 200 did not record voice data during the voice call with the phone number 180 **** 1234 at 10:01 on August 8, 2018; therefore, no voice data is stored in the corresponding call record information. As shown in Table 3, the voice data and source fields of the call record information for that voice call may be empty (such as NULL).

Compared with the voice data table shown in Table 1, the call record information table shown in Table 3 can include not only the phone number, the call time, the voice data, and the source information of the voice data corresponding to the voice call, but also other information such as the call duration of the voice call.
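Storing voice data together with call record information, where some entries have recordings and others do not, can be sketched as follows. The `CallLogEntry` type and its fields are hypothetical; an empty list stands in here for the empty (NULL) voice data and source fields mentioned for Table 3.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CallLogEntry:
    phone_number: str
    call_time: str
    # (label, source) pairs in recording order; empty if nothing was recorded.
    voice_data: List[Tuple[str, str]] = field(default_factory=list)

    @property
    def has_recording(self) -> bool:
        return bool(self.voice_data)


call_log = [
    CallLogEntry("138****5678", "2018-08-08 08:06", [
        ("voice data 3", "other party"),
        ("voice data 4", "local"),
        ("voice data 5", "other party"),
    ]),
    CallLogEntry("180****1234", "2018-08-08 10:01"),  # no voice data recorded
]

for entry in call_log:
    print(entry.phone_number, entry.call_time, entry.voice_data or None)
```

A flag such as `has_recording` could then decide whether a call record item shows a player plug-in, although the embodiment does not prescribe how that check is made.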

An embodiment of the present application provides a method for displaying information in a communication process. After the voice data recording ends, the terminal 200 displays a first interface including a player plug-in for playing the voice data. The terminal 200 may receive a first operation performed by the user on the player plug-in, and play the corresponding voice data in response to the first operation.

In implementation manner (1), after the voice data recording ends, the terminal 200 may automatically display the first interface in response to the end of the voice communication.

For example, as shown in FIG. 7, the terminal 200 may display the first interface 701 shown in FIG. 7 in response to the end of the voice communication. The first interface 701 may include a player plug-in for playing voice data recorded by the terminal 200. For example, the player plug-in 702, the player plug-in 703, and the player plug-in 704 in the first interface 701. The player plug-in 702 is configured to play the voice data 3 described above. The player plug-in 703 is configured to play the voice data 4 described above. The player plug-in 704 is configured to play the voice data 5 described above. Of course, the first interface 701 may also include only one player plug-in, and the player plug-in may be used to play all the voice data recorded by the terminal 200 during the voice communication, such as voice data 3, voice data 4, and voice data 5.

In implementation manner (2), the terminal 200 may automatically display the first interface after the voice data recording ends. After the recording of the voice data of the terminal 200 ends, the terminal 200 may still be performing voice communication with the terminal 300. In this case, after the voice data recording ends, the terminal 200 displays the first interface during the voice communication.

In implementation manner (3), after the voice data recording ends, the terminal 200 displays the first interface in response to a second operation input by the user. The second operation is used to trigger the terminal 200 to display the call history interface of the terminal 200, that is, the first interface. The terminal 200 may receive the second operation input by the user and, in response, display the call history interface. The call history interface may include one or more call record items, and each call record item may correspond to one call record. A call record can record the peer number of a voice communication, contact information (such as a contact name or remark name), the start or end time of the call, and the duration of the call. For example, the call history interface in the embodiment of the present application includes a call record item for the voice communication between the terminal 200 and the terminal 300. The call record item is used to record call record information of that voice communication, such as the phone number 138 **** 5678 of the terminal 300.

Exemplarily, the second operation may be a user's tap operation (such as a single-click operation) on the "phone" icon 802 on the mobile phone desktop 801 shown in (a) of FIG. 8.

In the embodiment of the present application, the call record items of the voice communication between the terminal 200 and the terminal 300 may further include a player plug-in. The player plug-in is used to play voice data recorded by the terminal 200.

It is assumed that three call records are maintained in the terminal 200, that is, a call record corresponding to the call record item 804 shown in (b) of FIG. 8, a call record corresponding to the call record item 805, and a call record corresponding to the call record item 806. In response to the user's tap operation on the "phone" icon 802, the terminal 200 may display the call history interface 803 shown in (b) of FIG. 8. The call history interface 803 includes the call record item 804, the call record item 805, and the call record item 806. As shown in Table 1, the terminal 200 stores voice data recorded during the voice communication between the terminal 200 and the terminal corresponding to the phone number 138 **** 5678. Therefore, the call record item 804 may include a player plug-in 807 for playing the voice data 3, the voice data 4, and the voice data 5 in Table 2. In response to a user's tap operation (such as a single-click operation) on the player plug-in 807, the terminal 200 can sequentially play the voice data 3, the voice data 4, and the voice data 5 in Table 1. As shown in Table 1, the terminal 200 also stores voice data recorded during the voice communication between the terminal 200 and the terminal corresponding to the phone number 159 **** 7986. Therefore, the call record item 805 may include a player plug-in 808 for playing the voice data 7 in Table 2. In response to the user's tap operation (such as a single-click operation) on the player plug-in 808, the terminal 200 can play the voice data 7 in Table 1.

It can be understood that no voice data is recorded during the voice communication between the terminal 200 and the terminal corresponding to the phone number 180 **** 1234. Therefore, no player plug-in is displayed in the call record item 806.
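The rule that a call record item carries a player plug-in only when voice data was recorded during that call can be sketched as follows. This is an illustrative sketch, not the implementation of the embodiment: the function name build_call_history and the dictionary keys are hypothetical.

```python
from typing import Dict, List


def build_call_history(numbers: List[str],
                       recordings: Dict[str, List[str]]) -> List[dict]:
    """Return one display item per call record, in the given order."""
    items = []
    for number in numbers:
        clips = recordings.get(number, [])
        items.append({
            "number": number,
            # A player plug-in is shown only when voice data was recorded.
            "show_player": len(clips) > 0,
            # Tapping the plug-in plays these clips in sequence.
            "playlist": list(clips),
        })
    return items


recordings = {"138****5678": ["voice data 3", "voice data 4", "voice data 5"]}
history = build_call_history(["138****5678", "180****1234"], recordings)
for item in history:
    print(item["number"], "player" if item["show_player"] else "no player")
```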

In the present application, a call record item in the call history interface includes a player plug-in for playing the voice data recorded during the corresponding voice communication. In this way, the terminal 200 can play the corresponding voice data in response to a user's tap operation on the player plug-in. In addition, the terminal 200 can show the user, on the call history interface, the relationship between the voice data played by the player plug-in and the call record, thereby improving the correlation between the voice data recorded by the terminal 200 and the call record.

In implementation manner (4), the call record item in the call history interface may not include the above-mentioned player plug-in. In this implementation manner, after the recording of the voice data ends, the terminal 200 displays the first interface in response to the user's third operation on the call record item of the voice call in the call history interface. The third operation is used to trigger the terminal 200 to display a record details interface corresponding to the call record item, that is, the first interface. The record details interface includes at least two player plug-ins, which correspond one-to-one with the at least two pieces of voice data. The player plug-ins are displayed in the record details interface according to the chronological order in which the corresponding voice data was recorded and the source information of the corresponding voice data.

For example, in response to a user's tap operation on the "phone" icon 802 on the mobile phone desktop 801 shown in (a) of FIG. 8, the terminal 200 may display the call history interface 901 shown in (a) of FIG. 9. The call history interface 901 includes a call record item 902, a call record item 903, and a call record item 904, none of which includes a player plug-in. However, in response to the user's third operation on the call record item 902, the terminal 200 may display the record details interface 905 shown in (b) of FIG. 9. The record details interface 905 includes a player plug-in 906, a player plug-in 907, and a player plug-in 908, which are arranged in the record details interface 905 in the order in which the voice data that they play was recorded. In response to the user's single-click operation on the player plug-in 906, the terminal 200 can play the voice data 3 "No. 1, Kefa Road, Nanshan District, Shenzhen" shown in Table 1. In response to the user's single-click operation on the player plug-in 907, the terminal 200 can play the voice data 4 "um" shown in Table 1. In response to the user's single-click operation on the player plug-in 908, the terminal 200 can play the voice data 5 "phone number is 88767655" shown in Table 1.
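The arrangement of player plug-ins by recording order and source can be sketched as follows. This is a sketch under stated assumptions: the class VoiceSegment, the field recorded_at, and the timestamp values are made up for illustration, and the "peer"/"local" labels are hypothetical names for the two sources.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VoiceSegment:
    label: str
    recorded_at: float   # seconds since the call started (illustrative values)
    source: str          # "peer" (other party) or "local" (this terminal)


def layout_players(segments: List[VoiceSegment]) -> List[str]:
    """Return one player row per segment, earliest recording first."""
    ordered = sorted(segments, key=lambda s: s.recorded_at)
    return [f"[{s.source}] play {s.label}" for s in ordered]


# The segments are deliberately given out of order to show the sorting step.
segments = [
    VoiceSegment("voice data 5", 41.0, "peer"),
    VoiceSegment("voice data 3", 12.5, "peer"),
    VoiceSegment("voice data 4", 30.2, "local"),
]
rows = layout_players(segments)
for row in rows:
    print(row)
```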

In this implementation manner, the record details interface corresponding to a call record item includes player plug-ins for playing the voice data recorded during the corresponding voice communication. In this way, the terminal 200 can play the corresponding voice data in response to a user's tap operation on a player plug-in. In addition, the terminal 200 can intuitively show the user, on the record details interface, the relationship between the voice data played by the player plug-ins and the call record, thereby improving the correlation between the voice data recorded by the terminal 200 and the call record.

In the embodiment of the present application, the terminal 200 may convert the recorded voice data into text information, that is, the text of the recorded voice data. The voice data recorded by the terminal 200 may be at least two pieces of voice data, and the text of the recorded voice data may include at least two pieces of text information that correspond to the at least two pieces of voice data. The terminal 200 may store the at least two pieces of text information according to the chronological order in which the corresponding voice data was recorded, together with the source information of that voice data.

Exemplarily, the terminal 200 may save, in the form of a table, the voice data recorded during the above-mentioned voice communication and the text information corresponding to each piece of voice data. With reference to Table 1, Table 4 shows an example of a voice data and text information table provided in this embodiment of the present application:

Table 4

As shown in Table 4, the voice data and text information table stores the voice data recorded during the call between the terminal 200 and the terminal 300 (whose phone number is 138 **** 5678) and the corresponding text information.
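A voice data and text information table of this kind could be built as sketched here. The sketch is illustrative only: VoiceTextRow, build_rows, and the source labels are hypothetical names, and the speech recognizer is replaced by a fixed lookup table (FAKE_ASR) because the recognition engine itself is not specified.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class VoiceTextRow:
    voice_data: str   # identifier of the stored clip
    source: str       # "peer" or "local"
    text: str         # text converted from the clip


def build_rows(clips: List[Tuple[str, str]],
               transcribe: Callable[[str], str]) -> List[VoiceTextRow]:
    # Clips are assumed to arrive in recording order, so the rows keep it.
    return [VoiceTextRow(clip, source, transcribe(clip)) for clip, source in clips]


# Stand-in for speech recognition: a fixed lookup, not a real recognizer.
FAKE_ASR = {
    "voice data 3": "No. 1, Kefa Road, Nanshan District, Shenzhen",
    "voice data 4": "um",
    "voice data 5": "phone number is 88767655",
}

rows = build_rows(
    [("voice data 3", "peer"), ("voice data 4", "local"), ("voice data 5", "peer")],
    FAKE_ASR.__getitem__,
)
for row in rows:
    print(row.source, row.voice_data, row.text)
```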

As shown in FIG. 10A, a method for displaying information during a communication process provided by an embodiment of the present application may include S1001-S1002. As shown in FIG. 10B, a method for recording and displaying information in a communication process provided by the embodiment of the present application may include S401-S404, S404', and S1001-S1002.

S1001. After the recording of the voice data ends, the terminal 200 displays a first interface including text corresponding to the recorded voice data. The voice data is voice data recorded by the terminal 200 during the voice communication between the terminal 200 and the terminal 300.

For the first interface and the manner in which the terminal 200 displays the first interface in response to the first event, reference may be made to the detailed descriptions in implementation manner (1) to implementation manner (4), which are not repeated in the embodiment of the present application. The difference is that, in this embodiment, the first interface includes the text corresponding to the voice data. In this embodiment, the first interface may or may not include a player plug-in.

The voice data includes at least two pieces of voice data. For example, the voice data may include the aforementioned voice data 3, voice data 4, and voice data 5. The text corresponding to the voice data includes at least two pieces of text information. At least two pieces of voice data correspond to at least two pieces of text information. In the embodiment of the present application, for a manner in which the terminal 200 records at least two pieces of voice data, and stores at least two pieces of voice data and at least two pieces of text information, reference may be made to the detailed description in the foregoing embodiment, which is not repeatedly described in this embodiment of the application.

Exemplarily, take the case in which the first interface is the foregoing record details interface. The terminal 200 may receive a user's third operation on the call record item of the voice communication between the terminal 200 and the terminal 300. The third operation is used to trigger the terminal 200 to display the record details interface corresponding to the call record item. The record details interface includes at least two pieces of text information, which correspond one-to-one to the at least two pieces of voice data. For example, in response to the user's third operation on the call record item 902, the terminal 200 may display the record details interface 1001 shown in FIG. 10C. The record details interface 1001 includes the text information "No. 1, Kefa Road, Nanshan District, Shenzhen" 1002 corresponding to the voice data 3 shown in Table 2, the text information "um" 1003 corresponding to the voice data 4, and the text information "phone number is 88767655" 1004 corresponding to the voice data 5. The text information 1002, the text information 1003, and the text information 1004 are arranged in the order in which their corresponding voice data was recorded. In addition, the record details interface 1001 also indicates the source of each piece of text information. For example, the text information "No. 1, Kefa Road, Nanshan District, Shenzhen" 1002 comes from the other party; that is, it is the text information corresponding to voice data converted from the audio and electrical signals received from the terminal 300. The text information "um" 1003 comes from the local machine; that is, it is the text information corresponding to the voice data captured by the microphone 270C. The text information "phone number is 88767655" 1004 comes from the other party; that is, it is the text information corresponding to voice data converted from the audio and electrical signals received from the terminal 300.

In the embodiment of the present application, the terminal 200 may display text information corresponding to the voice data recorded by the terminal 200 in a record detail interface of the call record. That is, the terminal 200 can intuitively display the text information corresponding to the voice data recorded by the terminal 200 and the relationship between the text information and the call record on the call record interface, thereby improving the relevance of the text information and the call record.

S1002. The terminal 200 may receive a user's first operation on the text corresponding to the voice data, and play the corresponding voice data in response to the first operation.

Specifically, the terminal 200 may receive a user's first operation on any piece of text information (such as first text information) of the at least two pieces of text information in the record details interface. The first operation is used to trigger the terminal 200 to play the voice data segment corresponding to the first text information. For example, the first operation may be any operation such as a user's single-click operation, double-click operation, or long-press operation on the first text information. In response to the user's first operation on the first text information, the terminal 200 may play a first voice data segment corresponding to the first text information. The first voice data segment is one of the at least two pieces of voice data.

Exemplarily, the terminal 200 may receive a user's first operation on the first text information "No. 1, Kefa Road, Nanshan District, Shenzhen" 1002 in the record details interface 1001 shown in FIG. 10C. In response to the user's first operation on the first text information 1002, the terminal 200 may play the voice data segment "No. 1, Kefa Road, Nanshan District, Shenzhen".
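The mapping from a piece of displayed text to the voice data segment that it plays can be sketched as follows. This is a minimal sketch, not the implementation of the embodiment: the class RecordDetails, the method on_first_operation, and the played list (which stands in for real audio output) are hypothetical.

```python
from typing import List, Tuple


class RecordDetails:
    """Record details view model: each text row is linked to one voice clip."""

    def __init__(self, rows: List[Tuple[str, str]]) -> None:
        # Each row pairs displayed text with the identifier of its voice clip.
        self.rows = list(rows)
        self.played: List[str] = []   # stands in for the audio output

    def on_first_operation(self, index: int) -> str:
        """Play the voice clip that belongs to the text at position index."""
        _text, clip = self.rows[index]
        self.played.append(clip)
        return clip


details = RecordDetails([
    ("No. 1, Kefa Road, Nanshan District, Shenzhen", "voice data 3"),
    ("um", "voice data 4"),
    ("phone number is 88767655", "voice data 5"),
])
# The user taps the first piece of text; only its own clip is played.
details.on_first_operation(0)
```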

Optionally, the above record details interface may include not only at least two pieces of text information, but also at least two player plug-ins. The at least two player plug-ins correspond one-to-one with the at least two pieces of text information, and are used to play the voice data corresponding to the at least two pieces of text information. For example, as shown in FIG. 11, the record details interface 1101 may include not only the text information "No. 1, Kefa Road, Nanshan District, Shenzhen" 1102, the text information "um" 1103, and the text information "phone number is 88767655" 1104, but also a player plug-in 1105, a player plug-in 1106, and a player plug-in 1107. The player plug-in 1105 is used to play the voice data "No. 1, Kefa Road, Nanshan District, Shenzhen" corresponding to the text information 1102. The player plug-in 1106 is used to play the voice data "um" corresponding to the text information 1103. The player plug-in 1107 is used to play the voice data "phone number is 88767655" corresponding to the text information 1104.

In some cases, when the terminal 200 converts the recorded voice data into text information, the converted text information may not be completely consistent with what was said in the voice data. That is, the text information converted by the terminal 200 may contain errors. In the embodiment of the present application, the terminal 200 may receive a fourth operation (that is, a modification operation) performed by a user on any piece of text information (such as second text information) of the at least two pieces of text information, and modify the second text information stored in the terminal 200. For example, the fourth operation may be any operation such as a user's single-click operation, double-click operation, or long-press operation on the second text information. This fourth operation is different from the second operation described above. The second text information is one of the at least two pieces of text information and corresponds to a second voice data segment of the at least two pieces of voice data. In this way, the user can control the terminal 200 to play the second voice data segment corresponding to the second text information, and compare the played second voice data segment with the second text information displayed by the terminal 200. When the second text information is inconsistent with the content of the second voice data segment played by the terminal 200, the user can operate the terminal 200 to modify the second text information.

In response to the fourth operation, the terminal 200 may modify the second text information to third text information and display the third text information on the first interface. The terminal 200 may then receive the user's first operation on the third text information and, in response, play the second voice data segment.

Exemplarily, as shown in (a) of FIG. 12, the record details interface 1201 may include not only the text information "phone number is 8877665" 1202, but also a player plug-in 1203. The player plug-in 1203 is used to play the voice data 5 "phone number is 88767655" shown in Table 2. In response to the user's tap operation on the player plug-in 1203, the terminal 200 plays the voice data 5 "phone number is 88767655", and the user 210 finds that the text information "phone number is 8877665" 1202 differs from the voice data 5. At this time, the user can modify the text information "phone number is 8877665" 1202 according to the voice data 5.

For example, the terminal 200 may receive a fourth operation (such as a double-click operation) performed by the user on the second text information "phone number is 8877665" 1202. In response to the fourth operation, the terminal 200 may display the record details interface 1204 shown in (b) of FIG. 12. The record details interface 1204 includes a modification box 1205 displaying the second text information, and a keyboard 1206. As shown in (c) of FIG. 12, the user enters the correct text information "phone number is 88767655" (i.e., the third text information) in the modification box 1205. In response to the user's tap operation on the "OK" button in the modification box 1205, the terminal 200 may display the record details interface 1208 shown in (d) of FIG. 12. The record details interface 1208 includes the text information "phone number is 88767655" 1207 (i.e., the third text information). Thereafter, when the terminal 200 receives the user's first operation on the text information "phone number is 88767655" 1207, it can play the second voice data segment (that is, the voice data segment corresponding to the second text information).

In the embodiment of the present application, the terminal 200 may, in response to a user's modification operation on the text information, replace the text information before the modification with the modified text information. In this way, the user can check the text information that the terminal 200 obtained by converting the voice data against the voice data stored in the terminal 200, and correct any error introduced by the conversion.
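The key property of the modification is that only the stored text changes, while its link to the voice data segment stays intact. A minimal sketch follows, assuming a hypothetical layout in which text is keyed by the identifier of the clip it was converted from; the function name modify_text is also an assumption.

```python
from typing import Dict


def modify_text(texts: Dict[str, str], clip_id: str, corrected: str) -> None:
    """Replace the stored text for one clip; the clip itself is untouched."""
    if clip_id not in texts:
        raise KeyError(f"no text stored for {clip_id}")
    texts[clip_id] = corrected


# Text keyed by the clip it was converted from (hypothetical layout).
texts = {"voice data 5": "phone number is 8877665"}   # recognition error
modify_text(texts, "voice data 5", "phone number is 88767655")
print(texts["voice data 5"])
```

Because the key is the clip identifier, a later first operation on the corrected text still resolves to the same voice data segment.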

It can be understood that the display space of a call record item in the call history interface is limited, and may not be sufficient to fully display the above at least two pieces of text information. In this case, the call record item may include keywords of the at least two pieces of text information. The terminal 200 may display the at least two pieces of text information and the corresponding player plug-ins in response to a user's tap operation (such as a single-click operation) on the keywords. For example, the terminal 200 may display the call history interface 1301 shown in (a) of FIG. 13. The call history interface 1301 includes a call record item 1302. The call record item 1302 includes the keywords "address, phone" 1303. In response to a user's tap operation on the keywords "address, phone" 1303, the terminal 200 may display the at least two pieces of text information shown in (b) of FIG. 13 and their corresponding player plug-ins 1304.

Optionally, in an implementation manner of the embodiment of the present application, the at least two pieces of voice data recorded by the terminal 200 include only voice data converted from the audio and electrical signals received from the terminal 300, and do not include voice data captured by the microphone 270C. In other words, the terminal 200 records only voice data from the other party. The terminal 200 may save the at least two pieces of voice data in sequence according to the time at which each piece of voice data was recorded.

Or, in another implementation manner, the at least two pieces of voice data recorded by the terminal 200 include only voice data captured by the microphone 270C, and do not include voice data converted from audio and electrical signals received from the terminal 300. In other words, the terminal 200 records only the voice data sent by the owner of the local machine. The terminal 200 may save the at least two pieces of voice data in sequence according to the time sequence of recording each piece of voice data.
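The two source-restricted implementations above, together with the earlier case in which both sources are recorded, amount to a filter over the recorded segments. This is an illustrative sketch only: the function select_segments and the mode names "peer", "local", and "both" are hypothetical.

```python
from typing import List, Tuple


def select_segments(segments: List[Tuple[str, str]], mode: str) -> List[str]:
    """Keep clips by source; mode is "peer", "local", or "both"."""
    if mode not in ("peer", "local", "both"):
        raise ValueError(f"unknown mode: {mode}")
    # The input is assumed to be in recording order, which filtering preserves.
    return [clip for clip, source in segments if mode in ("both", source)]


segments = [
    ("voice data 3", "peer"),
    ("voice data 4", "local"),
    ("voice data 5", "peer"),
]
peer_only = select_segments(segments, "peer")
local_only = select_segments(segments, "local")
```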

In the above two implementation manners, for the manner in which the terminal 200 displays the text information of the voice data or the player plug-in in the call record item in the call history interface, and the manner in which the terminal 200 displays the text information of the voice data or the player plug-in in the record details interface corresponding to the call record item, reference may be made to the detailed description in the foregoing embodiments, which is not repeated here.

It can be understood that, in order to implement the foregoing functions, the terminal 200 includes a hardware structure and / or a software module corresponding to each function. Those skilled in the art should easily realize that, in combination with the units and algorithm steps of each example described in the embodiments disclosed herein, the embodiments of the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is performed by hardware or computer software-driven hardware depends on the specific application of the technical solution and design constraints. Professional technicians can use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of the embodiments of the present application.

In the embodiment of the present application, functional modules of the terminal 200 may be divided according to the foregoing method example. For example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The above integrated modules may be implemented in the form of hardware or software functional modules. It should be noted that the division of the modules in the embodiments of the present application is schematic, and is only a logical function division. In actual implementation, there may be another division manner.

Exemplarily, in a case where each functional module is divided according to each function, FIG. 14 shows a possible structural diagram of the terminal involved in the foregoing embodiments. The terminal 1400 includes a control module 1401, a monitoring module 1402, an automatic speech recognition module 1403, a monitoring module 1404, an echo suppression module 1405, a recording module 1406, a recording module 1407, a text recording module 1408, a playback module 1409, and a display module 1410.

The control module 1401 is configured to control the monitoring module 1402 to monitor the voice data sent by the owner of the local machine when the terminal 1400 performs voice communication with another terminal. The monitoring module 1402 is configured to transmit the monitored voice data to the automatic speech recognition module 1403. The automatic speech recognition module 1403 converts the voice data monitored by the monitoring module 1402 into text information.

The control module 1401 is further configured to determine whether the text information recognized by the automatic speech recognition module 1403 (that is, the text of the voice data monitored by the monitoring module 1402) matches the preset start word. The control module 1401 is further configured to start the monitoring module 1404, the recording module 1406, the recording module 1407, and the text recording module 1408 when the text of the voice data monitored by the monitoring module 1402 matches the preset start word.

The monitoring module 1404 is configured to monitor the voice data from the peer end and transmit the monitored voice data to the automatic speech recognition module 1403. The automatic speech recognition module 1403 is configured to convert the voice data monitored by the monitoring module 1404 into text information. The control module 1401 is further configured to control the automatic speech recognition module 1403 to transmit the converted text information and its source information to the text recording module 1408 when the text of the voice data monitored by the monitoring module 1402 matches the preset start word. The text recording module 1408 is used to record the text information and its source information from the automatic speech recognition module 1403. The recording module 1406 is used to record the voice data sent by the owner of the local machine. The recording module 1407 is configured to record the voice data from the peer end.

Further, the control module 1401 is further configured to turn off the monitoring module 1404, the recording module 1406, the recording module 1407, and the text recording module 1408 when the text of the voice data monitored by the monitoring module 1402 matches a preset ending word.
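The start and stop behaviour of the control module can be sketched as a small state machine. This is a sketch under stated assumptions, not the module design of FIG. 14: the class RecorderControl is hypothetical, the start and ending words used here are invented examples, "matches" is simplified to substring containment, and the utterances that contain the start or ending word are themselves not recorded.

```python
from typing import List, Tuple


class RecorderControl:
    """Starts and stops recording based on words spoken by the local user."""

    def __init__(self, start_word: str, end_word: str) -> None:
        self.start_word = start_word
        self.end_word = end_word
        self.recording = False
        self.log: List[Tuple[str, str]] = []   # (source, text) in recording order

    def feed(self, source: str, text: str) -> None:
        # Only speech from the local microphone can start or stop recording.
        if source == "local":
            if not self.recording and self.start_word in text:
                self.recording = True
                return
            if self.recording and self.end_word in text:
                self.recording = False
                return
        # While recording, text from both sides is kept with its source.
        if self.recording:
            self.log.append((source, text))


ctl = RecorderControl(start_word="please say", end_word="got it")
ctl.feed("peer", "hello")                        # ignored: not recording yet
ctl.feed("local", "please say the address")      # start word: recording begins
ctl.feed("peer", "No. 1, Kefa Road, Nanshan District, Shenzhen")
ctl.feed("local", "um")
ctl.feed("local", "ok, got it")                  # ending word: recording stops
ctl.feed("peer", "goodbye")                      # ignored: recording has ended
```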

Optionally, the control module 1401 is further configured to control the playback module 1409 to play the second prompt information described in the foregoing embodiment, such as a prompt sound, when the text of the voice data monitored by the monitoring module 1402 matches the preset ending word. The control module 1401 is further configured to start the echo suppression module 1405 when the text of the voice data monitored by the monitoring module 1402 matches the preset ending word. The echo suppression module 1405 is configured to suppress the echo of the voice data monitored by the monitoring module 1402 according to the prompt sound played by the playback module 1409. The control module 1401 is further configured to close the playback module 1409 after the playback module 1409 plays the second prompt information.

The control module 1401 is further configured to control the display module 1410 to display an incoming call reminder interface, a call history interface, and the like according to the embodiment.

Of course, the terminal 1400 includes, but is not limited to, the unit modules listed above. For example, the terminal 1400 may further include a receiving module and a sending module. The receiving module is used to receive data or instructions sent by other terminals. The sending module is used to send data or instructions to other terminals. In addition, the specific functions that can be implemented by the above functional units also include, but are not limited to, the functions corresponding to the method steps described in the above examples. For detailed descriptions of other units of the terminal 1400, refer to the detailed description of the corresponding method steps, which is not repeated here.

In the case where an integrated unit is used, FIG. 15 shows a possible structural diagram of the terminal involved in the foregoing embodiments. The terminal 1500 includes a processing module 1501, a storage module 1502, a display module 1503, a communication module 1504, and an audio module 1505. The processing module 1501 is configured to control and manage the actions of the terminal 1500. For example, the processing module 1501 may be used to support the terminal 1500 in executing S402, S403, S404, S404', S405, S406, S701, and S702 in the above method embodiments, generating the first interface in S1001, "receiving a first operation" in S1002, and/or other processes for the techniques described herein. The display module 1503 is configured to display an image generated by the processing module 1501. For example, the display module 1503 is configured to support the terminal 1500 in executing S1001 in the foregoing method embodiments, and/or other processes for the techniques described herein. The storage module 1502 is configured to store program code and data of the terminal. For example, the storage module 1502 is configured to store the voice data recorded by the processor in S404 and S404' and the text information of the recorded voice data. The communication module 1504 is configured to support communication between the terminal 200 and other network entities (such as the terminal 300). For example, the communication module 1504 is used to support the terminal 1500 in performing S401 in the foregoing method embodiments, and/or other processes for the techniques described herein. The audio module 1505 is configured to collect voice data sent by a user of the terminal 200 and to play voice data. For example, the audio module 1505 (such as a microphone in the audio module 1505) is used to support the terminal 1500 in performing the operation of "capturing voice data". The audio module 1505 (such as a speaker and a receiver in the audio module 1505) is used to support the terminal 1500 in performing the operation of "playing voice data". For a detailed description of each unit included in the terminal 1500, reference may be made to the description in the foregoing method embodiments, and details are not described herein again.

The processing module 1501 may be a processor or a controller. For example, the processing module 1501 may be a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various exemplary logical blocks, modules, and circuits described in connection with the present disclosure. The processor may also be a combination that implements computing functions, such as a combination including one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module may be a transceiver, a transceiver circuit, or a communication interface. The storage module 1502 may be a memory.

For example, the processing module 1501 is a processor (such as the processor 210 shown in FIG. 2), and the communication module 1504 includes a radio frequency module (such as the radio frequency module 250 shown in FIG. 2). The communication module may also include a Wi-Fi module and a Bluetooth module. Communication modules such as the radio frequency module 250, the Wi-Fi module, and the Bluetooth module may be collectively referred to as a communication interface. The storage module 1502 is a memory (such as the internal memory 221 shown in FIG. 2). The display module 1503 is a touch screen (including the display screen 294 shown in FIG. 2, in which a display panel and a touch panel are integrated). The audio module 270 may include a microphone (such as the microphone 270C shown in FIG. 2), a speaker (such as the speaker 270A shown in FIG. 2), a receiver (such as the receiver 270B shown in FIG. 2), and a headphone interface (such as the headphone jack 270D shown in FIG. 2). The terminal provided in this embodiment of the present application may be the terminal 200 shown in FIG. 2. The processor, the communication interface, the touch screen, the memory, the microphone, the receiver, and the speaker may be coupled together through a bus.

An embodiment of the present application further provides a computer storage medium. The computer storage medium stores computer program code. When the processor executes the computer program code, the terminal performs the related method steps in any one of FIG. 4, FIG. 10A, or FIG. 10B to implement the method in the foregoing embodiments.
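
As a non-limiting illustration of the kind of computer program code such a storage medium might hold, the following Python sketch models the start-word and end-word control of recording described in the foregoing embodiments. It is a sketch under stated assumptions, not the claimed implementation: the names `CallRecorder`, `Segment`, `Source`, and `on_utterance` are hypothetical, speech recognition is assumed to have already produced the text of each utterance, and "matching" a preset word is simplified to a case-insensitive exact comparison.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Source(Enum):
    LOCAL_MIC = auto()   # voice data captured by the first terminal's microphone
    REMOTE = auto()      # voice data received from the second terminal


@dataclass
class Segment:
    source: Source
    text: str
    audio: bytes


@dataclass
class CallRecorder:
    start_word: str
    end_word: str
    record_local: bool = False   # optionally also keep microphone voice data
    recording: bool = False
    segments: list = field(default_factory=list)

    def on_utterance(self, source: Source, text: str, audio: bytes) -> None:
        """Process one utterance whose text came from a recognizer (not shown)."""
        spoken = text.strip().lower()
        if source is Source.LOCAL_MIC:
            if not self.recording and spoken == self.start_word.lower():
                self.recording = True    # start word heard: begin recording
                return
            if self.recording and spoken == self.end_word.lower():
                self.recording = False   # end word heard: stop recording
                return
        if self.recording and (source is Source.REMOTE or self.record_local):
            self.segments.append(Segment(source, text, audio))


# Hypothetical call: only the remote utterance between the two words is kept.
recorder = CallRecorder(start_word="start recording", end_word="stop recording")
recorder.on_utterance(Source.REMOTE, "hello", b"\x00")
recorder.on_utterance(Source.LOCAL_MIC, "Start recording", b"\x01")
recorder.on_utterance(Source.REMOTE, "my address is 12 Elm Street", b"\x02")
recorder.on_utterance(Source.LOCAL_MIC, "stop recording", b"\x03")
print([segment.text for segment in recorder.segments])
```

Only the local microphone is checked for the start and end words in this sketch, so speech from the second terminal cannot start or stop the recording. Setting `record_local=True` additionally keeps the microphone voice data while recording is active.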

An embodiment of the present application further provides a computer program product. When the computer program product is run on a computer, the computer is caused to perform the related method steps in any one of FIG. 4, FIG. 10A, or FIG. 10B to implement the method in the foregoing embodiments.
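
Among the steps such a program product would carry out are displaying the recognized text in chronological order together with its source, and playing the voice data segment tied to the text the user operates on. The Python sketch below illustrates one way to keep that text-to-audio correspondence, including after the user edits the text. All names (`RecordedSegment`, `recorded_at`, `LABELS`, and the helper functions) are hypothetical and chosen only for illustration, and real playback is replaced by returning the stored bytes.

```python
from dataclasses import dataclass


@dataclass
class RecordedSegment:
    recorded_at: float   # seconds into the call (hypothetical field)
    source: str          # "local" (microphone) or "remote" (second terminal)
    text: str            # recognized text, editable by the user
    audio: bytes         # recorded voice data segment


LABELS = {"local": "Me", "remote": "Other party"}   # illustrative labels only


def order_for_display(segments):
    """Return segments in the chronological order in which they were recorded."""
    return sorted(segments, key=lambda s: s.recorded_at)


def render_lines(segments):
    """Build one display line per segment, tagged with its source."""
    return [f"{LABELS[s.source]}: {s.text}" for s in order_for_display(segments)]


def edit_text(segment, new_text):
    """Replace the displayed text; the linked audio is left untouched."""
    segment.text = new_text


def audio_for_tap(segments, line_index):
    """Return the voice data segment tied to the tapped display line."""
    return order_for_display(segments)[line_index].audio


# Hypothetical data: the user corrects a recognition error, then taps the line.
segments = [
    RecordedSegment(7.5, "remote", "meet at tree o'clock", b"remote-audio"),
    RecordedSegment(3.0, "local", "when shall we meet", b"local-audio"),
]
edit_text(segments[0], "meet at three o'clock")
print(render_lines(segments))
```

Because the text and the audio live in the same record, correcting the text does not disturb which voice data segment is played when the user operates on that line.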

The terminal 1400, the terminal 1500, the computer storage medium, and the computer program product provided in the embodiments of the present application are all used to execute the corresponding methods provided above. Therefore, for the beneficial effects that they can achieve, refer to the beneficial effects of the corresponding methods provided above. Details are not repeated here.

Through the description of the above embodiments, those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional modules is used as an example. In practical applications, the above functions can be allocated to different functional modules as required, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.

In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the device embodiments described above are only schematic. The division of the modules or units is only a logical function division, and there may be another division manner in actual implementation. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not implemented. In addition, the displayed or discussed mutual coupling, direct coupling, or communication connection may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.

The units described as separate components may or may not be physically separated, and the components displayed as units may be one physical unit or multiple physical units, that is, they may be located in one place or distributed to multiple different places. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of software functional unit.

When the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. Based on such an understanding, the technical solution of the embodiments of the present application essentially, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the method described in the embodiments of the present application. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.

The above description is only a specific implementation of this application, but the scope of protection of this application is not limited to this. Any changes or replacements within the technical scope disclosed in this application shall be covered by the scope of protection of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims

A method for recording and displaying information in a communication process, which is characterized in that the method is applied to a process in which a first terminal performs voice communication with a second terminal, and the method includes:

The first terminal recognizes voice data captured by a microphone of the first terminal;

If the text corresponding to the first voice data captured by the microphone matches a preset start word, the first terminal starts recording the voice data of the second terminal, wherein the voice data of the second terminal is converted from the audio and electrical signals received from the second terminal;

After recording of the voice data, the first terminal displays a first interface; the first interface includes text corresponding to the recorded voice data;

The first terminal receives a user's first operation on the text corresponding to the recorded voice data, and plays the recorded voice data in response to the first operation.
The method according to claim 1, wherein after the recording of the voice data is finished, the first terminal displaying the first interface comprises:

After the voice data recording ends, the first terminal automatically displays the first interface;

or,

In response to the end of the voice communication, the first terminal displays the first interface;

or,

After the voice data recording ends, in response to a user inputting a second operation, the first terminal displays the first interface, wherein the second operation is used to instruct the first terminal to display a call record interface of the first terminal, the first interface is the call record interface, the call record interface includes a call record item of the voice communication, the call record item is used to record call record information of the voice communication, and the call record item includes text corresponding to the recorded voice data;

or,

After the recording of the voice data is completed, in response to a user's third operation on the call record item of the voice communication in the call record interface, the first terminal displays the first interface, wherein the third operation is used to instruct the first terminal to display a record detail interface of the voice communication, the first interface is the record detail interface, the record detail interface is used to display call record information of the voice communication, and the record detail interface includes text corresponding to the recorded voice data.
The method according to claim 1 or 2, further comprising:

If the text corresponding to the first voice data captured by the microphone matches the preset start word, the first terminal starts recording the voice data captured by the microphone.
The method according to any one of claims 1-3, wherein the text corresponding to the recorded voice data includes at least two pieces of text information, the recorded voice data includes at least two pieces of voice data, and the at least two pieces of text information correspond one-to-one with the at least two pieces of voice data;

Receiving, by the first terminal, a first operation performed by a user on text corresponding to the recorded voice data, and playing the voice data in response to the first operation includes:

Receiving, by the first terminal, the user's first operation on the first text information; in response to the first operation, the first terminal plays a first voice data segment;

The first text information is a piece of text information in the at least two pieces of text information; the first text information corresponds to the first voice data segment.
The method according to claim 4, further comprising:

Receiving, by the first terminal, a fourth operation performed by the user on the second text information, the fourth operation being used to modify the second text information into third text information; the second text information is a piece of text information in the at least two pieces of text information, and the second text information corresponds to a second voice data segment;

In response to the fourth operation, the first terminal modifies the second text information into the third text information;

Displaying, by the first terminal, the third text information on the first interface;

Receiving, by the first terminal, the user's first operation on the third text information, and playing the second voice data segment.
The method according to claim 4 or 5, wherein the first terminal displaying the first interface, the first interface including text corresponding to the recorded voice data, comprises:

Displaying, by the first terminal, the at least two pieces of text information on the first interface in accordance with the chronological order in which the voice data segments corresponding to the text information were recorded and the source information of the voice data segments corresponding to the text information;

The source information is used to indicate that the voice data segment is voice data captured by the microphone or voice data of the second terminal.
The method according to any one of claims 1-6, wherein the method further comprises:

If the text corresponding to the first voice data matches the preset start word, the first terminal issues first prompt information, wherein the first prompt information is used to prompt the user that the first terminal has started recording voice data, and the first prompt information is a prompt sound or a vibration prompt.
The method according to any one of claims 1 to 7, wherein before the first terminal displays a first interface, the method further comprises:

During the recording of the voice data, if the text corresponding to the second voice data captured by the microphone matches a preset end word, the first terminal stops recording the voice data.
The method according to claim 8, further comprising:

During the recording of the voice data, if the text corresponding to the second voice data captured by the microphone matches the preset end word, the first terminal issues second prompt information, wherein the second prompt information is used to prompt the user that the first terminal has stopped recording voice data, and the second prompt information is a prompt sound or a vibration prompt.
The method according to claim 9, wherein the first prompt information and the second prompt information are prompt sounds, and the method further comprises:

Determining, by the first terminal when the text corresponding to the first voice data matches the preset start word, that the first terminal uses a speaker to play voice data;

The first terminal performs echo suppression on the voice data collected by the microphone according to the voice data played by the speaker;

After the first terminal plays the second prompt information, the first terminal stops performing echo suppression on the voice data collected by the microphone.
The method according to any one of claims 4 to 6, wherein the first terminal saves the at least two pieces of voice data in accordance with the chronological order in which each piece of voice data was recorded and the source information of each piece of voice data;

The source information is used to indicate that the voice data segment is captured by the microphone, or the source information is used to indicate that the voice data segment is voice data of the second terminal.
The method according to any one of claims 4-6 or 11, wherein the first interface further includes at least two player plug-ins;

The at least two player plug-ins are used to play the at least two pieces of voice data, and the at least two player plug-ins correspond to the at least two pieces of text information on a one-to-one basis.
A terminal, wherein the terminal is a first terminal, and the terminal includes: one or more processors, a memory, a touch screen, a microphone, a communication interface, a receiver, and a speaker; the memory, the touch screen, and the communication interface are coupled to the processor; the touch screen is used to display an image generated by the processor; the microphone is used to capture voice data; the memory is used to store computer program code; the computer program code includes computer instructions, and when the processor executes the computer instructions:

The processor is configured to perform voice communication with a second terminal through the communication interface; identify voice data captured by the microphone; and, if it is recognized that the text corresponding to first voice data captured by the microphone matches a preset start word, start recording the voice data of the second terminal, wherein the voice data of the second terminal is converted from the audio and electrical signals received from the second terminal; the processor is further configured to store the recorded voice data in the memory;

The processor is further configured to control the touch screen to display a first interface after the recording of the voice data is finished; the first interface includes text corresponding to the recorded voice data;

The processor is further configured to receive a first operation performed by the user on the text displayed on the touch screen; and control the receiver or the speaker to play the voice data in response to the first operation.
The terminal according to claim 13, wherein the processor, configured to control the touch screen to display a first interface after the recording of the voice data is completed, comprises:

The processor is configured to automatically control the touch screen to display the first interface after the recording of the voice data is finished;

or,

The processor is configured to control the touch screen to display the first interface in response to the end of the voice communication;

or,

The processor is configured to control the touch screen to display the first interface in response to a user inputting a second operation after the recording of the voice data is finished, wherein the second operation is used to instruct the terminal to display a call record interface of the terminal, the first interface is the call record interface, the call record interface includes a call record item of the voice communication, the call record item is used to record call record information of the voice communication, and the call record item includes text corresponding to the recorded voice data;

or,

The processor is configured to control the touch screen to display the first interface in response to a user's third operation on the call record item of the voice communication in the call record interface after the recording of the voice data ends, wherein the third operation is used to instruct the terminal to display a record detail interface of the voice communication, the first interface is the record detail interface, the record detail interface is used to display call record information of the voice communication, and the record detail interface includes text corresponding to the recorded voice data.
The terminal according to claim 13 or 14, wherein the processor is further configured to: if it is recognized that the text corresponding to the first voice data captured by the microphone matches the preset start word, start recording the voice data captured by the microphone.
The terminal according to any one of claims 13-15, wherein the text corresponding to the recorded voice data includes at least two pieces of text information, the recorded voice data includes at least two pieces of voice data, and the at least two pieces of text information correspond one-to-one with the at least two pieces of voice data;

The processor, configured to receive a user's first operation on the text displayed on the touch screen, and control the receiver or the speaker to play the voice data in response to the first operation, includes:

The processor is configured to receive the user's first operation on the first text information displayed on the touch screen; and in response to the first operation, control the receiver or the speaker to play a first voice data segment;

The first text information is a piece of text information among the at least two pieces of text information; the first text information corresponds to the first voice data segment.
The terminal according to claim 16, wherein the processor is further configured to: receive a fourth operation performed by the user on the second text information displayed on the touch screen, wherein the fourth operation is used to modify the second text information into third text information, the second text information is a piece of text information in the at least two pieces of text information, and the second text information corresponds to a second voice data segment; in response to the fourth operation, modify the second text information into the third text information; and save, in the memory, the third text information and a correspondence between the third text information and the second voice data segment;

The processor is further configured to control the touch screen to display the third text information on the first interface; receive the user's first operation on the third text information displayed on the touch screen; and control the receiver or the speaker to play the second voice data segment.
The terminal according to claim 16 or 17, wherein the processor, configured to control the touch screen to display the first interface, comprises:

The processor is configured to control the touch screen to display the at least two pieces of text information on the first interface in accordance with the chronological order in which the voice data segments corresponding to the text information were recorded and the source information of the voice data segments corresponding to the text information;

The source information is used to indicate that the voice data segment is voice data captured by the microphone or voice data of the second terminal.
The terminal according to any one of claims 13 to 18, wherein the processor is further configured to: if it is recognized that the text corresponding to the first voice data captured by the microphone matches the preset start word, issue first prompt information, wherein the first prompt information is used to prompt the user that the terminal has started recording voice data, and the first prompt information is a prompt sound or a vibration prompt.
The terminal according to any one of claims 13 to 19, wherein the processor is further configured to: before controlling the touch screen to display the first interface, during the recording of the voice data, if it is recognized that the text corresponding to second voice data captured by the microphone matches a preset end word, stop recording the voice data.
The terminal according to claim 20, wherein the processor is further configured to: during the recording of the voice data, if it is recognized that the text corresponding to the second voice data captured by the microphone matches the preset end word, issue second prompt information, wherein the second prompt information is used to prompt the user that the first terminal has stopped recording voice data, and the second prompt information is a prompt sound or a vibration prompt.
The terminal according to claim 21, wherein the first prompt information and the second prompt information are prompt sounds, and the processor is further configured to: when the text corresponding to the first voice data matches the preset start word, determine that the speaker is used to play voice data; perform echo suppression on the voice data collected by the microphone according to the voice data played by the speaker; and, after the second prompt information is played by the receiver or the speaker, stop performing echo suppression on the voice data collected by the microphone.
The terminal according to any one of claims 16 to 18, wherein the memory stores the at least two pieces of voice data according to the chronological order in which each piece of voice data was recorded and the source information of each piece of voice data;

The source information is used to indicate that the voice data segment is captured by the microphone, or the source information is used to indicate that the voice data segment is voice data of the second terminal.
The terminal according to any one of claims 16-18 or 23, wherein the first interface displayed on the touch screen further includes at least two player plug-ins, the at least two player plug-ins correspond one-to-one with the at least two pieces of text information, and the at least two player plug-ins correspond one-to-one with the at least two pieces of voice data;

The processor is further configured to receive a user's click operation on a first player plug-in of the at least two player plug-ins, and control the receiver or the speaker to play a voice data segment corresponding to the first player plug-in.
A computer storage medium, characterized in that the computer storage medium includes computer instructions, and when the computer instructions are run on a terminal, the terminal is caused to perform the method for recording and displaying information in a communication process according to any one of claims 1-12.
A computer program product, characterized in that when the computer program product is run on a computer, the computer is caused to execute a method for recording and displaying information during a communication process according to any one of claims 1-12.