CN113808592A - Method and device for transcribing call recording, electronic equipment and storage medium - Google Patents

Method and device for transcribing call recording, electronic equipment and storage medium

Info

Publication number
CN113808592A
Authority
CN
China
Prior art keywords
call
recording
transcription
confirmation instruction
transfer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110944899.2A
Other languages
Chinese (zh)
Inventor
王婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Original Assignee
Baidu Online Network Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baidu Online Network Technology Beijing Co Ltd
Priority to CN202110944899.2A
Publication of CN113808592A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/64 Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
    • H04M1/65 Recording arrangements for recording a message from the calling party
    • H04M1/656 Recording arrangements for recording a message from the calling party for recording conversations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/64 Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
    • H04M1/65 Recording arrangements for recording a message from the calling party
    • H04M1/658 Means for redirecting recorded messages to other extensions or equipment
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2250/00 Details of telephonic subscriber devices
    • H04M2250/74 Details of telephonic subscriber devices with voice recognition means

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephone Function (AREA)

Abstract

The disclosure provides a call recording transcription method and device, an electronic device, and a storage medium, and relates to the technical field of electronic devices, in particular to Bluetooth headsets. In response to a first confirmation instruction for starting call recording transcription, the time at which the first confirmation instruction is triggered is taken as the starting point of call audio transcription, and transcription of the call recording starts. Each path of audio in the call recording is transcribed into corresponding text information in real time, and the text information is displayed separately for the different audio paths. The client monitors whether a second confirmation instruction for stopping call recording transcription is received; in response to the second confirmation instruction, the time at which it is triggered is taken as the end point of call audio transcription, and transcription of the call recording ends. The transcription is completed in the client without relying on a third party, so that the real-time performance of call recording transcription and the efficiency of data transmission between the wireless headset and the client can both be ensured.

Description

Method and device for transcribing call recording, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of electronic devices, in particular to Bluetooth headset technology, and more specifically to a method and an apparatus for transcribing call recordings, an electronic device, and a storage medium.
Background
In recent years, Bluetooth headsets have developed rapidly; in particular, the advent of TWS (True Wireless Stereo) headsets has allowed users to enjoy music in both ears without being tethered by headset wires.
Besides music playback, a TWS headset can be used to transcribe the recorded voice of a call. A commonly used transcription method is as follows: during the call, two paths of audio (uplink and downlink) are captured at the same time, encoded, and transmitted to a client connected to the TWS headset; the client decodes the data back into the two paths of audio and passes them to a Software Development Kit (SDK); after the SDK recognizes the two paths of audio, it returns the recognition results to the client in order of recognition time, and the client then sorts the results. However, encoding and decoding the audio, and the resulting data transmission, introduce delay; in addition, the call itself occupies bandwidth, which further reduces the efficiency of data transmission.
Disclosure of Invention
The disclosure provides a method and a device for transcribing call records, electronic equipment and a storage medium.
According to an aspect of the present disclosure, a method for transcribing a call recording is provided, where the method is applied to a client paired with a wireless headset, and includes:
in response to a first confirmation instruction for starting call recording transcription, taking the time at which the first confirmation instruction is triggered as the starting point of call audio transcription, and starting transcription of the call recording;
transcribing each path of audio in the call recording into corresponding text information in real time, and displaying the corresponding text information separately for the different audio paths;
monitoring whether a second confirmation instruction for stopping call recording transcription is received;
and in response to the second confirmation instruction for stopping call recording transcription, taking the time at which the second confirmation instruction is triggered as the end point of call audio transcription, and ending transcription of the call recording.
According to another aspect of the present disclosure, there is provided a device for transcribing a call recording, the device being applied to a client paired with a wireless headset, including:
the first processing module is used for responding to a first confirmation instruction for starting call recording transcription, taking the time at which the first confirmation instruction is triggered as the starting point of call audio transcription, and starting transcription of the call recording;
the transcription module is used for transcribing each path of audio in the call recording into corresponding text information in real time;
the display module is used for displaying the corresponding text information separately for the different audio paths;
the first monitoring module is used for monitoring whether a second confirmation instruction for stopping call recording transcription is received;
and the second processing unit is used for responding to the second confirmation instruction for stopping call recording transcription, taking the time at which the second confirmation instruction is triggered as the end point of call audio transcription, and ending transcription of the call recording.
According to another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the preceding aspect.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of the preceding aspect.
According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method according to the preceding aspect.
With the call recording transcription method and device, the electronic device, and the storage medium provided by the present disclosure, in response to a first confirmation instruction for starting call recording transcription, the time at which the first confirmation instruction is triggered is taken as the starting point of call audio transcription, and transcription of the call recording starts. Each path of audio in the call recording is transcribed into corresponding text information in real time, and the text information is displayed separately for the different audio paths. The client monitors whether a second confirmation instruction for stopping call recording transcription is received and, in response to it, takes the time at which the second confirmation instruction is triggered as the end point of call audio transcription and ends transcription of the call recording. The transcription can thus be completed in the client without the help of a third party, which ensures both the real-time performance of call recording transcription and the efficiency of data transmission between the wireless headset and the client.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present application, nor do they limit the scope of the present application. Other features of the present application will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
fig. 1 is a schematic flowchart of a call record transcription method according to an embodiment of the present disclosure;
fig. 2 is a schematic flow chart of another call record transcription method according to the embodiment of the present disclosure;
fig. 3A is a schematic diagram of a first push prompt message provided in the embodiment of the present disclosure;
fig. 3B is a schematic diagram of a second push prompt message provided by the embodiment of the present disclosure;
fig. 3C is a schematic diagram of a third push prompt provided by the embodiment of the present disclosure;
fig. 3D is a schematic diagram of a fourth push prompt provided by the embodiment of the present disclosure;
fig. 4 is a schematic diagram of transcribing each audio channel into corresponding text information according to an embodiment of the present disclosure;
fig. 5 is a schematic diagram of a recording review interface according to an embodiment of the disclosure;
fig. 6 is a schematic structural diagram of a transfer device for call recording according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of another call record transcription apparatus according to an embodiment of the present disclosure;
fig. 8 is a schematic block diagram of an example electronic device 800 provided by embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
A call recording transcription method and apparatus, an electronic device, and a storage medium according to embodiments of the present disclosure are described below with reference to the drawings.
In the related art, to transcribe call audio, the two paths of audio are recognized by a voice SDK, the recognition results are returned to the client in order of recognition time, and the client then sorts the recognition results.
In the present disclosure, to avoid the bandwidth occupation problem associated with encoding and decoding, the transcription of the call recording is completed directly in the client, which ensures both the real-time performance of the transcription and the real-time performance of data transmission.
Fig. 1 is a schematic flow chart of a call record transcription method according to an embodiment of the present disclosure.
As shown in fig. 1, the method comprises the following steps:
Step 101, in response to a first confirmation instruction for starting call recording transcription, taking the time at which the first confirmation instruction is triggered as the starting point of call audio transcription, and starting transcription of the call recording.
When a user wears a wireless headset for a call, the headset establishes a connection with an application (APP) client installed on the electronic device, and the client monitors in real time whether the electronic device has a call in progress. The call may be answered or dialed, and may be an in-APP call or an ordinary telephone call; in the embodiments of the present application these are collectively referred to as a call.
After it is determined that the call is being answered through the wireless headset, the client displays prompt information about call recording transcription. When the user decides to enable the client's transcription function, the user clicks the confirmation key in the prompt information, which triggers the call audio transcription function. From the user's perspective, clicking the confirmation key is all that is needed to start transcription; on the implementation side, once the client detects that the user has triggered the confirmation key, it responds to the first confirmation instruction for starting call recording transcription, takes the time at which the first confirmation instruction is triggered as the starting point of call audio transcription, and starts transcription of the call recording.
As an implementation manner of the embodiment of the present invention, the electronic device may be any electronic device that can communicate with a wireless headset, such as a smart phone, an iPad, a notebook computer, a personal computer, and a smart wearable device.
Step 102, transcribing each path of audio in the call recording into corresponding text information in real time, and displaying the corresponding text information separately for the different audio paths.
During the call, the client can obtain two paths of audio: one path of uplink audio sent by the smartphone and one path of downlink audio received by the smartphone.
As an implementation of the embodiment of the application, the downlink audio collected by the client may be a single voice input, that is, a call from one person, for example, user A places a voice call to the client owner B. As another implementation of the embodiment of the present application, the downlink audio collected by the client may contain multiple voice inputs, that is, multiple users participate in the same call, as in a voice conference or a video conference.
When the audio is transcribed, the audio input to the wireless headset and the audio output from it are transcribed separately. Note that when input audio is transcribed, the number of call participants is not distinguished; transcription depends only on the audio input to the headset. For example, if three persons, user A, user C and user D, contribute input audio in the same call, the client transcribes the audio of user A, user C and user D together as input audio and does not distinguish between the different users. The above is only an exemplary description given for ease of understanding, and the number of participants in a particular call does not limit the embodiments of the present application.
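The per-path display described above can be illustrated with a minimal Python sketch. It assumes the recognizer has already produced timestamped text segments for each path; the `Segment` type, the `build_display` function, and the labels "Me" and "Other party" are hypothetical names chosen for illustration and do not come from the disclosure.

```python
from typing import Iterable, List, NamedTuple


class Segment(NamedTuple):
    time: float  # seconds since transcription started (assumed unit)
    path: str    # "uplink" or "downlink"
    text: str    # text already produced by some recognizer (not modelled here)


def label_for(path: str) -> str:
    # Only the audio path is distinguished, never the individual speaker.
    return "Me" if path == "uplink" else "Other party"


def build_display(uplink: Iterable[Segment], downlink: Iterable[Segment]) -> List[str]:
    # Merge both paths in time order and prefix each line with its path label.
    merged = sorted(list(uplink) + list(downlink), key=lambda s: s.time)
    return [f"{label_for(s.path)}: {s.text}" for s in merged]


# Three remote participants (users A, C and D) all arrive on the same downlink path.
uplink = [Segment(0.5, "uplink", "Shall we start?")]
downlink = [
    Segment(1.2, "downlink", "Yes."),       # spoken by user A
    Segment(2.0, "downlink", "Ready."),     # spoken by user C
    Segment(2.8, "downlink", "Go ahead."),  # spoken by user D
]
display = build_display(uplink, downlink)
```

In this sketch the three remote speakers share one label, mirroring the statement that different users on the same path are not distinguished.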
Step 103, monitoring whether a second confirmation instruction for stopping call recording transcription is received.
In a specific implementation, the call recording transcription interface also includes a stop-recording button, which the user clicks to end the transcription.
Step 104, in response to the second confirmation instruction for stopping call recording transcription, taking the time at which the second confirmation instruction is triggered as the end point of call audio transcription, and ending transcription of the call recording.
In the call recording transcription method of this embodiment, in response to the first confirmation instruction for starting call recording transcription, the time at which the first confirmation instruction is triggered is taken as the starting point of call audio transcription, and transcription of the call recording starts. Each path of audio in the call recording is transcribed into corresponding text information in real time, and the text information is displayed separately for the different audio paths. The client monitors whether a second confirmation instruction for stopping call recording transcription is received and, in response to it, takes the time at which the second confirmation instruction is triggered as the end point of call audio transcription and ends transcription of the call recording.
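Steps 101 to 104 can be pictured as a small session object whose start and end points are set by the two confirmation instructions. The following Python sketch is only an illustration under stated assumptions: the class name `TranscriptionSession` and its method names are hypothetical, and speech recognition itself is not modelled (already-recognized text is passed in).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TranscriptionSession:
    """Hypothetical client-side session bounded by two confirmation instructions."""
    start_time: Optional[float] = None  # set by the first confirmation instruction
    end_time: Optional[float] = None    # set by the second confirmation instruction
    lines: List[Tuple[float, str, str]] = field(default_factory=list)

    def is_active(self) -> bool:
        return self.start_time is not None and self.end_time is None

    def on_first_confirmation(self, now: float) -> None:
        # Step 101: the trigger time becomes the starting point of transcription.
        self.start_time = now
        self.end_time = None

    def on_audio_text(self, now: float, path: str, text: str) -> None:
        # Step 102: text is kept only while transcription is active.
        if self.is_active():
            self.lines.append((now, path, text))

    def on_second_confirmation(self, now: float) -> None:
        # Steps 103 and 104: the trigger time becomes the end point.
        if self.is_active():
            self.end_time = now


session = TranscriptionSession()
session.on_audio_text(1.0, "uplink", "spoken before transcription started")
session.on_first_confirmation(2.0)
session.on_audio_text(2.5, "uplink", "Hello")
session.on_audio_text(3.1, "downlink", "Hi, who is this?")
session.on_second_confirmation(4.0)
session.on_audio_text(4.2, "uplink", "spoken after transcription stopped")
```

Only the two utterances between the first and second confirmation instructions end up in `session.lines`; audio outside that window is ignored in this sketch.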
In order to improve the real-time performance of transcribing the recorded audio, an embodiment of the present invention further provides a method for transcribing a call record, where as shown in fig. 2, the method includes:
Step 201, detecting whether the wireless headset is being used to answer the call.
In the embodiment of the present application, the wireless headset serves as the transmission source of the call audio, so detecting whether the wireless headset is used to answer the call is a prerequisite for transcription. The method of the embodiments of the present application is performed only when the wireless headset is used to answer the call; the headset may be worn in one ear or in both ears.
When a user uses the wireless earphone to answer a call, the user can answer the call without aligning with a microphone of the smart phone, so that the user can release both hands thoroughly, and the use experience of the user is improved.
Step 202, if the wireless earphone is determined to be used for answering a call, pushing prompt information whether to start call recording transfer, wherein the prompt information can be pushed to at least one of a call dialing interface, a home page of the client, any function page of the client and a recording transfer list page.
The purpose of pushing the prompt information is to make the call recording transcription function provided by the client easy for the user to find and use. The function suits application scenarios such as interview recording, note taking, and recording daily events.
To explain the push prompt information clearly, the embodiment of the present application uses illustrations, as shown in fig. 3A to 3D: fig. 3A illustrates a schematic diagram of a first type of push prompt information provided by the embodiment of the present application, fig. 3B illustrates a schematic diagram of a second type of push prompt information, fig. 3C illustrates a schematic diagram of a third type of push prompt information, and fig. 3D illustrates a schematic diagram of a fourth type of push prompt information. It should be noted that fig. 3A to fig. 3D are only examples and do not limit the smartphone or the content displayed in the interface.
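Steps 201 and 202 amount to a guard followed by a choice of target pages. The Python sketch below is one possible reading, assuming the client knows which of the four candidate pages are currently open; the `Page` enum, the `prompt_targets` function, and the "push to every open page" policy are hypothetical and are not stated in the disclosure.

```python
from enum import Enum
from typing import List


class Page(Enum):
    DIAL = "call dialing interface"
    HOME = "client home page"
    FUNCTION = "client function page"
    TRANSCRIPTION_LIST = "recording transcription list page"


def prompt_targets(answered_with_headset: bool, open_pages: List[Page]) -> List[Page]:
    """Return the pages that should show the 'start transcription?' prompt."""
    if not answered_with_headset:
        # Step 201 not satisfied: the method is not performed, nothing is pushed.
        return []
    # Step 202: push the prompt to the candidate pages that are open (assumed policy).
    return list(open_pages)


no_prompt = prompt_targets(False, [Page.HOME])
with_prompt = prompt_targets(True, [Page.DIAL, Page.HOME])
```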
Step 203, responding to a first confirmation instruction for starting the transfer of the call record, taking the time for triggering the first confirmation instruction as a starting point of the transfer of the call audio, and starting the transfer of the call record.
With reference to fig. 3A, when the user clicks the prompt, a first confirmation instruction for starting the transfer of the call record is triggered.
Regarding step 203 and step 205, please refer to the detailed description of fig. 1, which will not be repeated herein.
Step 204, transcribing each path of audio in the call recording into corresponding text information in real time, and displaying the corresponding text information separately for the different audio paths.
For easy understanding, reference may be made to fig. 4, where fig. 4 shows a schematic diagram of transcribing audio channels into corresponding text information according to an embodiment of the present application, and fig. 4 only shows one form of text information presentation, that is, presentation in a dialog form (bidirectional dialog recording). As another display mode that can be realized, the text information may also be sequentially output, and in particular, the display mode of the text information is not limited in this application embodiment.
Step 205, monitoring whether a second confirmation instruction for stopping the transfer of the call record is received.
If it is determined that the second confirmation instruction is received, step 206 is executed, and if it is determined that the second confirmation instruction is not received, step 207 is executed.
As shown in fig. 4, when the user clicks the "stop recording" button, the client triggers the second confirmation command to end the transcription.
Step 206, in response to the second confirmation instruction for stopping call recording transcription, taking the time at which the second confirmation instruction is triggered as the end point of call audio transcription, and ending transcription of the call recording.
It should be noted that after the user clicks the stop-recording button, the current call on the electronic device is not ended; only the recording transcription function ends. The method then proceeds to step 209.
To allow the user to restart transcription of the call recording, the recording review interface shown in fig. 5 also pushes prompt information asking whether to restart call recording transcription, which further meets the user's actual needs and improves the user's experience of the transcription function.
Step 207, monitoring whether a third confirmation instruction of the call end is received.
The numbering of steps 205 and 207 is only for reference and does not mean that step 205 must be executed first; if the third confirmation instruction for ending the call is triggered before any second confirmation instruction is detected in step 205, step 207 is executed directly.
Step 208, in response to the third confirmation instruction for ending the call, taking the time at which the third confirmation instruction is triggered as the end point of call audio transcription, and ending transcription of the call recording.
Because the wireless headset is the audio input, the transcription function ends automatically once the third confirmation instruction for ending the call is received, which saves bandwidth and avoids wasting resources.
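The relationship between steps 205 to 208 can be summarized as "whichever ending event arrives first sets the end point". The Python sketch below illustrates this under assumptions: events are modelled as `(time, kind)` tuples, and the constant and function names are hypothetical.

```python
from typing import Iterable, Optional, Tuple

STOP_TRANSCRIPTION = "second_confirmation"  # user clicked the stop-recording button
CALL_ENDED = "third_confirmation"           # the call itself ended


def find_end_point(events: Iterable[Tuple[float, str]]) -> Optional[Tuple[float, str, bool]]:
    """Return (end_time, cause, call_still_active) for the first ending event."""
    for time, kind in events:
        if kind == STOP_TRANSCRIPTION:
            # Step 206: transcription ends, the call continues (step 209).
            return time, kind, True
        if kind == CALL_ENDED:
            # Step 208: the call has ended, so transcription ends with it.
            return time, kind, False
    return None  # no ending event yet, transcription continues


stopped_by_user = find_end_point([(1.0, "audio"), (3.0, STOP_TRANSCRIPTION), (9.0, CALL_ENDED)])
stopped_by_hangup = find_end_point([(1.0, "audio"), (5.0, CALL_ENDED)])
still_running = find_end_point([(1.0, "audio")])
```

The third element of the result records whether the call is still active, which is what decides between keeping the call answered and simply finishing.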
Step 209, keeping the call answered through the wireless headset.
The user can keep answering the call continuously, so that the original call experience of the user using the electronic equipment is maintained.
Step 210, jumping to a recording review interface, where the recording review interface includes at least one of: the recording time of the call recording, the caller, a recording player, the text records corresponding to the different audio paths, and a text export button.
It should be noted that if an in-call state is detected after jumping to the recording review interface, use of the recording player is suspended, and prompt information asking whether to restart call recording transcription is pushed in the recording review interface. Pushing this prompt further meets the user's actual needs and improves the user's experience of the transcription function.
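The behaviour of the recording review interface while a call is still in progress can be sketched as a two-flag state. This Python sketch is illustrative only: the `ReviewInterface` class is hypothetical, and re-enabling the player once the call is over is an assumption that the disclosure does not state.

```python
from dataclasses import dataclass


@dataclass
class ReviewInterface:
    """Hypothetical state of the recording review interface."""
    player_enabled: bool = True
    restart_prompt_shown: bool = False

    def on_call_state(self, in_call: bool) -> None:
        if in_call:
            # In-call state detected: suspend the player and offer to restart transcription.
            self.player_enabled = False
            self.restart_prompt_shown = True
        else:
            # Assumption: once the call is over, playback is available again.
            self.player_enabled = True
            self.restart_prompt_shown = False


during_call = ReviewInterface()
during_call.on_call_state(in_call=True)

after_call = ReviewInterface()
after_call.on_call_state(in_call=True)
after_call.on_call_state(in_call=False)
```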
Step 211, if the connection between the wireless headset and the client paired with it is interrupted during call recording transcription, re-establishing the connection between the wireless headset and the client;
if the connection succeeds within the predetermined time period, execution continues from step 204; if the connection fails, or reconnection is exited, within the predetermined time period, step 210 is executed.
A connection interruption between the wireless headset and the client is of two types: one is disconnection of the headset itself, which may occur because the distance between the wireless headset and the client exceeds the connection range; the other is disconnection of the wireless headset's Bluetooth Advanced Audio Distribution Profile (A2DP) link.
The predetermined time period is an empirical value. So that the transcription function can resume automatically after the connection is re-established, the period should not be set too long, so as not to affect the user's experience of the transcription function or the data transmission efficiency. For example, the predetermined time period may be set to 30 seconds or 1 minute, which is not limited by the embodiments of the present invention.
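The reconnection rule of step 211 can be sketched as a retry loop bounded by the predetermined time period. In the Python sketch below, `handle_disconnect`, its callable parameters, and the `FakeClock` helper are hypothetical; the clock is injected so the example runs without real waiting, and treating "exit" as a user-initiated quit is an assumption about the source wording.

```python
from typing import Callable

RESUME = "resume_transcription"  # continue from step 204
REVIEW = "jump_to_review"        # go to step 210


def handle_disconnect(try_reconnect: Callable[[], bool],
                      clock: Callable[[], float],
                      timeout_s: float = 30.0,
                      user_quit: Callable[[], bool] = lambda: False) -> str:
    """Retry the headset connection until it succeeds or the period expires."""
    deadline = clock() + timeout_s
    while clock() < deadline:
        if user_quit():
            return REVIEW   # reconnection was exited within the period
        if try_reconnect():
            return RESUME   # reconnected in time
    return REVIEW           # period expired without a connection


class FakeClock:
    """Test clock that advances one second each time it is read."""
    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        self.now += 1.0
        return self.now


# The third attempt succeeds well inside the 30-second period.
reconnected = handle_disconnect(iter([False, False, True]).__next__, FakeClock())
# Every attempt fails, so the 5-second period expires.
timed_out = handle_disconnect(lambda: False, FakeClock(), timeout_s=5.0)
# Reconnection is exited before any attempt is made.
quit_early = handle_disconnect(lambda: True, FakeClock(), user_quit=lambda: True)
```

A real client would presumably wait between attempts and use a monotonic system clock; both are left out here to keep the sketch short.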
Fig. 6 is a schematic structural diagram of a call recording transcription device according to an embodiment of the present disclosure. As shown in fig. 6, the device is applied to a client paired with a wireless headset, and includes: a first processing module 61, a transcription module 62, a display module 63, a first monitoring module 64 and a second processing unit 65;
the first processing module 61 is configured to respond to a first confirmation instruction for starting call recording transcription, take the time at which the first confirmation instruction is triggered as the starting point of call audio transcription, and start transcription of the call recording;
the transcription module 62 is configured to transcribe each path of audio in the call recording into corresponding text information in real time;
the display module 63 is configured to display the corresponding text information separately for the different audio paths;
the first monitoring module 64 is configured to monitor whether a second confirmation instruction for stopping call recording transcription is received;
and the second processing unit 65 is configured to, in response to the second confirmation instruction for stopping call recording transcription, take the time at which the second confirmation instruction is triggered as the end point of call audio transcription, and end transcription of the call recording.
Further, in a possible implementation of this embodiment, as shown in fig. 7, the apparatus includes a first processing module 71, a transcription module 72, a display module 73, a first monitoring module 74 and a second processing unit 75; for these, reference may be made to the descriptions of the first processing module 61, the transcription module 62, the display module 63, the first monitoring module 64, and the second processing unit 65 corresponding to fig. 6, which are not repeated here. For the case where the second confirmation instruction for stopping call recording transcription is not received, the apparatus further includes:
A second monitoring module 76, configured to monitor whether a third confirmation instruction of ending the call is received;
and the third processing module 77 is configured to, in response to the third confirmation instruction for ending the call, take the time for triggering the third confirmation instruction as an end point of recording call audio transcription, and end the transcription of the call recording.
Further, in a possible implementation manner of this embodiment, as shown in fig. 7, the apparatus further includes:
a detection module 78 for detecting whether to use the wireless headset to answer a call;
the first pushing module 79 is configured to push a prompt message indicating whether to start call recording transfer when it is determined that the wireless headset is used to answer a call, where the prompt message may be pushed to at least one of a call dialing interface, a home page of the client, any function page of the client, and a recording transfer list page.
Further, in a possible implementation manner of this embodiment, as shown in fig. 7, the apparatus further includes:
a maintaining module 710, configured to keep the call answered through the wireless headset after the second processing module 75 takes the time at which the second confirmation instruction is triggered as the end point of call audio transcription and ends the transcription of the call recording;
a skip module 711, configured to jump to a recording review interface, where the recording review interface includes at least one of: the recording time of the call recording, the caller, a recording player, the text records corresponding to the different audio channels, and a text export key.
Further, in a possible implementation manner of this embodiment, as shown in fig. 7, the apparatus further includes:
a suspending module 712, configured to suspend use of the recording player when an in-call state is detected;
a second pushing module 713, configured to push, in the recording review interface, prompt information asking whether to restart call recording transcription.
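By way of example only, the behavior of the suspending module and the second pushing module might be sketched as follows. The class `RecordingReviewInterface`, its fields, and the prompt wording are hypothetical and are introduced purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical prompt wording; the disclosure does not specify the text.
RESTART_PROMPT = "Restart call recording transcription?"


@dataclass
class RecordingReviewInterface:
    """Minimal stand-in for the recording review interface state."""
    player_paused: bool = False
    prompts: List[str] = field(default_factory=list)

    def on_call_state(self, in_call: bool) -> None:
        # When an in-call state is detected, suspend the recording player
        # and push the restart prompt once.
        if in_call:
            self.player_paused = True
            if RESTART_PROMPT not in self.prompts:
                self.prompts.append(RESTART_PROMPT)
```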
Further, in a possible implementation manner of this embodiment, as shown in fig. 7, the apparatus further includes:
an establishing module 714, configured to reestablish the connection between the wireless headset and the client paired with the wireless headset when that connection is interrupted during call recording transcription;
an executing module 715, configured to continue the call recording transcription when the connection is successfully reestablished within a predetermined time period;
a second skipping module 716, configured to jump to the recording review interface when the connection fails, or the reconnection attempt is exited, within the predetermined time period.
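The reconnection handling above might, for instance, be sketched as a polling loop bounded by the predetermined time period. This is a sketch under assumptions rather than the disclosed implementation: `handle_disconnect`, `try_reconnect`, `should_abort`, the polling interval, and the returned strings are all hypothetical, and "exits" is interpreted here as the reconnection attempt being abandoned.

```python
import time
from typing import Callable


def handle_disconnect(
    try_reconnect: Callable[[], bool],
    timeout_s: float,
    poll_interval_s: float = 0.5,
    should_abort: Callable[[], bool] = lambda: False,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry the headset connection until it succeeds or the period elapses.

    Returns "continue_transcription" if the connection is reestablished in
    time, otherwise "show_review_interface".
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if should_abort():
            # The reconnection attempt was exited before the deadline.
            break
        if try_reconnect():
            return "continue_transcription"
        sleep(poll_interval_s)
    return "show_review_interface"
```

Passing `clock` and `sleep` as parameters is a design choice made only so that the loop can be exercised without real delays.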
After responding to a first confirmation instruction for starting call recording transcription, the call recording transcription apparatus takes the time at which the first confirmation instruction is triggered as the starting point of call audio transcription and starts transcribing the call recording. According to each channel of audio in the call recording, the apparatus transcribes each channel into corresponding text information in real time and displays the text information separately for the different audio channels. The apparatus monitors whether a second confirmation instruction for stopping transcription of the call recording is received and, in response to that instruction, takes the time at which the second confirmation instruction is triggered as the end point of call audio transcription and ends the transcription of the call recording.
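The per-channel transcription and display summarized above might be illustrated as follows. This is a minimal sketch under assumptions: `AudioChunk`, `TranscriptLine`, and `transcribe_channels` are hypothetical names, the channel labels "local" and "remote" are illustrative only, and the speech recognizer is injected as a plain callable because the disclosure does not name a recognition engine.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class AudioChunk:
    channel: str      # illustrative label, e.g. "local" or "remote"
    timestamp: float  # seconds, on the same clock as the trigger times
    samples: bytes


@dataclass
class TranscriptLine:
    channel: str
    timestamp: float
    text: str


def transcribe_channels(
    chunks: Iterable[AudioChunk],
    recognize: Callable[[bytes], str],
    start_time: float,
    end_time: Optional[float] = None,
) -> Iterator[TranscriptLine]:
    """Yield one labeled text line per chunk inside the transcription window."""
    for chunk in chunks:
        # Audio before the start point or after the end point is skipped.
        if chunk.timestamp < start_time:
            continue
        if end_time is not None and chunk.timestamp > end_time:
            continue
        text = recognize(chunk.samples)
        if text:
            # Keeping the channel on each line lets the display show the
            # text separately for the different audio channels.
            yield TranscriptLine(chunk.channel, chunk.timestamp, text)
```

Because the function is a generator, lines can be displayed as soon as each chunk is recognized, which loosely mirrors the real-time behavior described.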
It should be noted that the foregoing explanation of the method embodiment also applies to the apparatus of this embodiment; the principle is the same and is not repeated here.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes in accordance with a computer program stored in a ROM (Read-Only Memory) 802 or a computer program loaded from a storage unit 808 into a RAM (Random Access Memory) 803. The RAM 803 can also store various programs and data required for the operation of the device 800. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An I/O (Input/Output) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), various dedicated AI (Artificial Intelligence) computing chips, various computing units running machine learning model algorithms, a DSP (Digital Signal Processor), and any suitable processor, controller, microcontroller, and the like. The computing unit 801 executes the methods and processes described above, such as the method for transcribing a call recording. For example, in some embodiments, the method for transcribing a call recording may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the aforementioned method for transcribing a call recording by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, FPGAs (Field Programmable Gate Arrays), ASICs (Application-Specific Integrated Circuits), ASSPs (Application-Specific Standard Products), SoCs (Systems on a Chip), CPLDs (Complex Programmable Logic Devices), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a RAM, a ROM, an EPROM (Erasable Programmable Read-Only Memory) or flash memory, an optical fiber, a CD-ROM (Compact Disc Read-Only Memory), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (Cathode Ray Tube) or LCD (Liquid Crystal Display) monitor) for displaying information to the user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a LAN (Local Area Network), a WAN (Wide Area Network), the Internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that addresses the high management difficulty and weak service scalability of traditional physical hosts and VPS (Virtual Private Server) services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be noted that artificial intelligence is the discipline that studies how to make computers simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and it includes both hardware-level and software-level technologies. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technology, and the like.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (15)

1. A method for transcribing a call recording, wherein the method is applied to a client paired with a wireless headset and comprises:
in response to a first confirmation instruction for starting call recording transcription, taking the time at which the first confirmation instruction is triggered as a starting point of call audio transcription, and starting the transcription of the call recording;
according to each channel of audio in the call recording, transcribing each channel of audio into corresponding text information in real time, and displaying the corresponding text information according to the different audio channels;
monitoring whether a second confirmation instruction for stopping the transcription of the call recording is received;
and in response to the second confirmation instruction for stopping the transcription of the call recording, taking the time at which the second confirmation instruction is triggered as an end point of call audio transcription, and ending the transcription of the call recording.
2. The method for transcribing a call recording according to claim 1, wherein if the second confirmation instruction for stopping the transcription of the call recording is not received, the method further comprises:
monitoring whether a third confirmation instruction indicating the end of the call is received;
and in response to the third confirmation instruction indicating the end of the call, taking the time at which the third confirmation instruction is triggered as the end point of call audio transcription, and ending the transcription of the call recording.
3. The method for transcribing a call recording according to claim 1, wherein the method further comprises:
detecting whether the wireless headset is used to answer a call;
and if it is determined that the wireless headset is used to answer the call, pushing prompt information asking whether to start call recording transcription, wherein the prompt information can be pushed to at least one of a call dialing interface, a home page of the client, any function page of the client, and a recording transcription list page.
4. The method for transcribing a call recording according to claim 1, wherein after the time at which the second confirmation instruction is triggered is taken as the end point of call audio transcription and the transcription of the call recording is ended, the method further comprises:
keeping the call answered through the wireless headset;
and jumping to a recording review interface, wherein the recording review interface comprises at least one of: the recording time of the call recording, the caller, a recording player, the text records corresponding to the different audio channels, and a text export key.
5. The method for transcribing a call recording according to claim 4, wherein the method further comprises:
if an in-call state is detected, suspending use of the recording player;
and pushing, in the recording review interface, prompt information asking whether to restart call recording transcription.
6. The method for transcribing a call recording according to claim 5, wherein the method further comprises:
if the connection between the wireless headset and the client paired with the wireless headset is interrupted during call recording transcription, reestablishing the connection between the wireless headset and the client;
if the connection is successfully reestablished within a predetermined time period, continuing the call recording transcription;
and if the connection fails, or the reconnection attempt is exited, within the predetermined time period, jumping to the recording review interface.
7. An apparatus for transcribing a call recording, wherein the apparatus is applied to a client paired with a wireless headset and comprises:
a first processing module, configured to, in response to a first confirmation instruction for starting call recording transcription, take the time at which the first confirmation instruction is triggered as a starting point of call audio transcription, and start the transcription of the call recording;
a transcription module, configured to transcribe each channel of audio into corresponding text information in real time according to each channel of audio in the call recording;
a display module, configured to display the corresponding text information according to the different audio channels;
a first monitoring module, configured to monitor whether a second confirmation instruction for stopping the transcription of the call recording is received;
and a second processing module, configured to, in response to the second confirmation instruction for stopping the transcription of the call recording, take the time at which the second confirmation instruction is triggered as an end point of call audio transcription, and end the transcription of the call recording.
8. The apparatus for transcribing a call recording according to claim 7, wherein for the case in which the second confirmation instruction for stopping the transcription of the call recording is not received, the apparatus further comprises:
a second monitoring module, configured to monitor whether a third confirmation instruction indicating the end of the call is received;
and a third processing module, configured to, in response to the third confirmation instruction indicating the end of the call, take the time at which the third confirmation instruction is triggered as the end point of call audio transcription, and end the transcription of the call recording.
9. The apparatus for transcribing a call recording according to claim 7, wherein the apparatus further comprises:
a detection module, configured to detect whether the wireless headset is used to answer a call;
and a first pushing module, configured to push prompt information asking whether to start call recording transcription when it is determined that the wireless headset is used to answer the call, wherein the prompt information can be pushed to at least one of a call dialing interface, a home page of the client, any function page of the client, and a recording transcription list page.
10. The apparatus for transcribing a call recording according to claim 7, wherein the apparatus further comprises:
a maintaining module, configured to keep the call answered through the wireless headset after the second processing module takes the time at which the second confirmation instruction is triggered as the end point of call audio transcription and ends the transcription of the call recording;
and a skip module, configured to jump to a recording review interface, wherein the recording review interface comprises at least one of: the recording time of the call recording, the caller, a recording player, the text records corresponding to the different audio channels, and a text export key.
11. The apparatus for transcribing a call recording according to claim 10, wherein the apparatus further comprises:
a suspending module, configured to suspend use of the recording player when an in-call state is detected;
and a second pushing module, configured to push, in the recording review interface, prompt information asking whether to restart call recording transcription.
12. The apparatus for transcribing a call recording according to claim 11, wherein the apparatus further comprises:
an establishing module, configured to reestablish the connection between the wireless headset and the client paired with the wireless headset when that connection is interrupted during call recording transcription;
an executing module, configured to continue the call recording transcription when the connection is successfully reestablished within a predetermined time period;
and a second skipping module, configured to jump to the recording review interface when the connection fails, or the reconnection attempt is exited, within the predetermined time period.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-6.
15. A computer program product comprising a computer program which, when executed by a processor, carries out the steps of the method according to any one of claims 1-6.
CN202110944899.2A 2021-08-17 2021-08-17 Method and device for transcribing call recording, electronic equipment and storage medium Pending CN113808592A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110944899.2A CN113808592A (en) 2021-08-17 2021-08-17 Method and device for transcribing call recording, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113808592A true CN113808592A (en) 2021-12-17

Family

ID=78893705

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110944899.2A Pending CN113808592A (en) 2021-08-17 2021-08-17 Method and device for transcribing call recording, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113808592A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108564952A (en) * 2018-03-12 2018-09-21 新华智云科技有限公司 The method and apparatus of speech roles separation
CN110648665A (en) * 2019-09-09 2020-01-03 北京左医科技有限公司 Session process recording system and method
CN110708615A (en) * 2019-08-29 2020-01-17 广东思派康电子科技有限公司 Intercommunication system and intercommunication method realized based on TWS earphone
CN111883135A (en) * 2020-07-28 2020-11-03 北京声智科技有限公司 Voice transcription method and device and electronic equipment
CN112119641A (en) * 2018-09-20 2020-12-22 华为技术有限公司 Method and device for realizing automatic translation through multiple TWS (time and frequency) earphones connected in forwarding mode
CN112188011A (en) * 2019-07-04 2021-01-05 北京航天长峰科技工业集团有限公司 Call center quality inspection and assessment method based on voice recognition
CN112562677A (en) * 2020-11-25 2021-03-26 安徽听见科技有限公司 Conference voice transcription method, device, equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117354717A (en) * 2023-12-04 2024-01-05 宁波菊风系统软件有限公司 Real-time position sharing method, device, equipment and storage medium
CN117354717B (en) * 2023-12-04 2024-03-19 宁波菊风系统软件有限公司 Real-time position sharing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination