CN109599115B - Conference recording method and device for audio acquisition equipment and user terminal - Google Patents


Publication number
CN109599115B
Authority
CN
China
Prior art keywords
audio
user terminal
data
text data
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811585400.8A
Other languages
Chinese (zh)
Other versions
CN109599115A (en
Inventor
张蓓蓓
张计锋
赵恒艺
孙岩
周祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sipic Technology Co Ltd
Original Assignee
Sipic Technology Co Ltd

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Abstract

The invention discloses a conference recording method and device for an audio acquisition device and a user terminal. The conference recording method for the audio acquisition device comprises the following steps: the audio acquisition device establishes a connection with the user terminal; the audio acquisition device acquires audio data in real time; the audio data is sent to a cloud transcription service and the text data returned by the cloud transcription service is acquired, wherein the cloud transcription service performs voice-to-text processing on the audio data; and the text data is synchronized to the user terminal in real time via a multi-terminal collaboration service. The scheme provided by the embodiments of the application solves the problem of synchronizing content edited on multiple terminals. It also benefits product design: multi-terminal collaboration supports real-time editing during audio transcription, so that in meetings, interviews, and similar scenarios a user can revise the transcribed text while listening and quickly produce the target document.

Description

Conference recording method and device for audio acquisition equipment and user terminal
Technical Field
The invention belongs to the technical field of voice data, and particularly relates to a conference recording method and device for audio acquisition equipment and a user terminal.
Background
In the related art, some existing conference recording schemes support audio acquisition on a mobile phone with real-time transcription into text, support simultaneous editing on the APP end and the Web end after the recording is completed, and support bidirectional synchronization of the edited content between the APP end and the Web end.
In the process of implementing the present application, the inventors found that such schemes have at least the following defects:
1. Real-time audio transcription does not support editing the text at the same time, which does not match users' actual habit of taking notes while listening.
2. The transcribed text can be synchronized to the Web end only after the recording is finished and saved, so the user cannot conveniently view it during a meeting.
3. Only the mobile phone can collect audio, so sound pickup is poor and the recording is unclear, which in turn degrades the transcription result.
Disclosure of Invention
The embodiment of the invention provides a conference recording method and device for audio acquisition equipment and a user terminal, which are used for solving at least one of the technical problems.
In a first aspect, an embodiment of the present invention provides a conference recording method for an audio acquisition device, including: the audio acquisition equipment is connected with the user terminal; the audio acquisition equipment acquires audio data in real time; sending the audio data to a cloud transcription service and acquiring text data returned by the cloud transcription service, wherein the cloud transcription service is used for performing voice-to-text processing on the audio data; synchronizing the text data to the user terminal in real time via a multi-terminal collaboration service.
In a second aspect, an embodiment of the present invention provides a conference recording method, which is used for a user terminal, and includes: the user terminal establishes connection with the audio acquisition equipment; receiving first text data synchronized via a multi-terminal collaboration service and inserting the first text data to the end of historical text data; and/or responding to the editing of the historical text data by the user, and transmitting the changed historical text data to the multi-terminal cooperation service in real time so as to synchronize the changed historical text data to other user terminals in real time.
In a third aspect, an embodiment of the present invention provides a conference recording apparatus for an audio capture device, including: the first connection module is configured to establish connection between the audio acquisition equipment and the user terminal; the acquisition module is configured to acquire audio data in real time by the audio acquisition equipment; the transfer module is configured to send the audio data to a cloud transfer service and acquire text data returned by the cloud transfer service, wherein the cloud transfer service is used for performing voice-to-text processing on the audio data; and a real-time synchronization module configured to synchronize the text data to the user terminal in real time via a multi-terminal collaboration service.
In a fourth aspect, an embodiment of the present invention provides a conference recording apparatus for a user terminal, including: the second connection module is configured to establish connection between the user terminal and the audio acquisition equipment; a receiving insertion module configured to receive first text data synchronized via a multi-terminal collaboration service and insert the first text data to an end of historical text data; and/or the change synchronization module is configured to respond to editing of historical text data by a user, and transmit the changed historical text data to the multi-terminal cooperation service in real time so as to synchronize the changed historical text data to other user terminals in real time.
In a fifth aspect, an electronic device is provided, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the conference recording method for an audio capture device and a user terminal of any of the embodiments of the present invention.
In a sixth aspect, the present invention further provides a computer program product, where the computer program product includes a computer program stored on a non-volatile computer-readable storage medium, where the computer program includes program instructions, and when the program instructions are executed by a computer, the computer executes the steps of the conference recording method for an audio acquisition device and a user terminal according to any embodiment of the present invention.
The conference recording scheme for the audio acquisition device and the user terminal solves the problem of synchronizing content edited on multiple terminals. It also benefits product design: multi-terminal collaboration supports real-time editing during audio transcription, so that in meetings, interviews, and similar scenarios a user can revise the transcribed text while listening and quickly produce the target document.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a conference recording method for an audio acquisition device according to an embodiment of the present invention;
Fig. 2 is a flowchart of a conference recording method for a user terminal according to an embodiment of the present invention;
Fig. 3 is a flowchart of another conference recording method for a user terminal according to an embodiment of the present invention;
Fig. 4 is a flowchart of a conference recording method for a user terminal according to another embodiment of the present invention;
Fig. 5 is an interaction diagram of each end in a specific embodiment of a conference recording scheme according to an embodiment of the present invention;
Fig. 6 is a block diagram of a conference recording apparatus for an audio acquisition device according to an embodiment of the present invention;
Fig. 7 is a block diagram of a conference recording apparatus for a user terminal according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to Fig. 1, a flowchart of an embodiment of the conference recording method of the present application is shown. The method of this embodiment may be applied to an audio acquisition device such as a voice recorder pen, a call recording device, or a conference recording device, and the present application is not limited in this respect.
As shown in Fig. 1, in step 101, the audio acquisition device establishes a connection with a user terminal;
in step 102, the audio acquisition device acquires audio data in real time;
in step 103, the audio data is sent to a cloud transcription service and the text data returned by the cloud transcription service is acquired, wherein the cloud transcription service performs voice-to-text processing on the audio data;
in step 104, the text data is synchronized to the user terminal in real time via the multi-terminal collaboration service.
In this embodiment, for step 101, the audio acquisition device first establishes a connection with each user terminal. For step 102, the audio acquisition device acquires the user's audio data in real time. For step 103, it sends the audio data acquired in real time to the cloud transcription service and obtains the text data returned by that service, which performs voice-to-text processing on the acquired audio data. For step 104, the audio acquisition device synchronizes the transcribed text data to the user terminal in real time via the multi-terminal collaboration service. For example, the audio acquisition device may be independent hardware such as a conference recording device. Once switched on, it first establishes a connection with the user terminals in range, for example the mobile phone APP end and the Web end; after the ends log in to the same account, the connection can be established over Bluetooth or WiFi based on that account. The conference recording device then acquires the user's audio data in real time and uploads it to the cloud transcription service, which converts the speech into text data. The text data is returned to the conference recording device, which passes it to the multi-terminal collaboration service, and the collaboration service synchronizes it to the APP end and the Web end in real time, completing real-time transcription and synchronization of the conference.
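Steps 101 to 104 can be illustrated with a minimal sketch. The class and function names below (`FakeTranscriptionService`, `CollaborationService`, `run_capture_device`) are hypothetical stand-ins and are not taken from the patent; the transcription stub simply decodes bytes rather than performing speech recognition.

```python
from typing import Callable, Dict, Iterable, List


class FakeTranscriptionService:
    """Hypothetical stand-in for the cloud transcription service."""

    def transcribe(self, audio_chunk: bytes) -> str:
        # A real service would run speech recognition; this stub decodes bytes.
        return audio_chunk.decode("utf-8")


class CollaborationService:
    """Hypothetical stand-in for the multi-terminal collaboration service."""

    def __init__(self) -> None:
        self._terminals: Dict[str, Callable[[str], None]] = {}

    def register(self, terminal_id: str, on_text: Callable[[str], None]) -> None:
        # Step 101: a terminal becomes reachable once it is connected.
        self._terminals[terminal_id] = on_text

    def push_text(self, text: str) -> None:
        # Step 104: deliver the transcribed text to every connected terminal.
        for on_text in self._terminals.values():
            on_text(text)


def run_capture_device(
    chunks: Iterable[bytes],
    transcriber: FakeTranscriptionService,
    collab: CollaborationService,
) -> None:
    for chunk in chunks:                      # step 102: audio arrives in real time
        text = transcriber.transcribe(chunk)  # step 103: voice-to-text
        collab.push_text(text)                # step 104: synchronize to terminals


app_view: List[str] = []
web_view: List[str] = []
collab = CollaborationService()
collab.register("app", app_view.append)
collab.register("web", web_view.append)
run_capture_device([b"hello", b"world"], FakeTranscriptionService(), collab)
```

In this sketch both the APP view and the Web view receive each transcribed segment in the order it was captured.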
The method of this embodiment transcribes the conference content and synchronizes it to each terminal through real-time acquisition, transcription, and synchronization, which allows the conference content to be viewed in real time. Furthermore, the cloud collaboration service can also synchronize a user's modifications on one terminal to the other terminals, so that modifications are synchronized among the terminals.
In some optional embodiments, after the audio acquisition device acquires the audio data in real time, the method further includes synchronizing the audio data to a big data center via the cloud transcription service. Because the audio data is synchronized to the cloud big data center, the user terminal can download the original audio data at any time to confirm and correct the conference record.
In some optional embodiments, after the audio capture device establishes the connection with the user terminal, the method further comprises: account information of the user terminal is acquired via the connection to perform data transmission between the audio acquisition device and the user terminal based on the account information. Therefore, the audio acquisition equipment and the user terminal are associated based on the same account information, and data transmission and data safety are facilitated.
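As a rough illustration of associating endpoints through a shared account, the sketch below keeps a registry that maps an account to the endpoints bound to it. `AccountRegistry` and the endpoint identifiers are assumptions made for this example; the patent does not specify how the association is stored.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class AccountRegistry:
    """Hypothetical registry mapping an account to its bound endpoints."""

    def __init__(self) -> None:
        self._endpoints: DefaultDict[str, Set[str]] = defaultdict(set)

    def bind(self, account: str, endpoint_id: str) -> None:
        # Called once the endpoint has reported the account over the connection.
        self._endpoints[account].add(endpoint_id)

    def peers_of(self, account: str, endpoint_id: str) -> Set[str]:
        # Data is only exchanged among endpoints bound to the same account.
        return self._endpoints.get(account, set()) - {endpoint_id}


registry = AccountRegistry()
registry.bind("alice", "recorder")
registry.bind("alice", "app")
registry.bind("alice", "web")
registry.bind("bob", "app-bob")

alice_peers = registry.peers_of("alice", "recorder")
bob_peers = registry.peers_of("bob", "app-bob")
```

Here the recorder bound to one account sees only the APP and Web endpoints of that same account, never those of another account.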
In other optional embodiments, the audio acquisition device is itself a user terminal, and establishing the connection between the audio acquisition device and the user terminal comprises establishing a connection between that user terminal and another user terminal. In this way a user terminal can also serve as the audio acquisition device: when recording quality requirements are low, or when the user has forgotten to bring dedicated audio acquisition hardware, a portable device such as a mobile phone or a computer can be used directly to acquire audio.
In other optional embodiments, the connection may be a Bluetooth connection or a WiFi connection, so that either Bluetooth or WiFi can be selected to connect the audio acquisition device and the user terminal.
Referring to Fig. 2, a conference recording method for a user terminal according to an embodiment of the present application is shown. The method is suitable for smart user devices such as a mobile phone, a tablet, or a computer, and the present application is not limited in this respect.
As shown in Fig. 2, in step 201, the user terminal establishes a connection with the audio acquisition device;
in step 202, first text data synchronized via the multi-terminal collaboration service is received and inserted at the end of the historical text data; and/or
in step 203, in response to the user editing the historical text data, the changed historical text data is transmitted to the multi-terminal collaboration service in real time so as to synchronize the changed historical text data to other user terminals in real time.
In this embodiment, for step 201, the user terminal and the audio acquisition device establish a connection over Bluetooth, WiFi, or a similar connection mode; the audio acquisition device may itself be another user terminal, and the present application is not limited in this respect. For step 202, the user terminal receives the first text data (the transcribed text data) that was sent by the audio acquisition device and synchronized via the multi-terminal collaboration service, and inserts it at the end of the historical text data. If the historical text data is empty, the first text data is placed directly at its beginning.
In step 203, if the user edits and updates the historical text data, the changed historical text data is transmitted to the multi-terminal collaboration service in real time, so that the changed historical text data is synchronized to other user terminals in real time through the multi-terminal collaboration service.
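Step 203 can be sketched as a simple relay: the editing terminal applies its change locally and hands the changed text to the collaboration service, which forwards it to every other terminal. The names `Terminal`, `CollaborationService`, and `edit_locally` are hypothetical, and sending the whole changed text (rather than a fine-grained operation) is an assumption made to keep the example short.

```python
from typing import Dict


class Terminal:
    """Hypothetical user terminal holding its copy of the historical text."""

    def __init__(self, terminal_id: str, text: str = "") -> None:
        self.terminal_id = terminal_id
        self.text = text

    def on_remote_change(self, new_text: str) -> None:
        # Apply a change pushed by the collaboration service.
        self.text = new_text


class CollaborationService:
    """Hypothetical relay that forwards changes to all other terminals."""

    def __init__(self) -> None:
        self.terminals: Dict[str, Terminal] = {}

    def join(self, terminal: Terminal) -> None:
        self.terminals[terminal.terminal_id] = terminal

    def publish_change(self, sender_id: str, new_text: str) -> None:
        for terminal_id, terminal in self.terminals.items():
            if terminal_id != sender_id:  # the sender already has the change
                terminal.on_remote_change(new_text)


def edit_locally(terminal: Terminal, service: CollaborationService, new_text: str) -> None:
    terminal.text = new_text                                # local edit
    service.publish_change(terminal.terminal_id, new_text)  # real-time sync


service = CollaborationService()
app = Terminal("app", "helo everyone")
web = Terminal("web", "helo everyone")
service.join(app)
service.join(web)
edit_locally(web, service, "hello everyone")
```

After the Web end corrects the typo, the APP end holds the same corrected text without any explicit save step.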
With this method, on the one hand, the transcribed text data synchronized by the multi-terminal collaboration service is inserted at the end of the historical text data; on the other hand, text data edited by the user is synchronized to other user terminals through the multi-terminal collaboration service. Two focuses are provided. One focus always sits at the end of the historical text data and is used to insert the conference recording data; it may be set to be invisible. The other focus is an editing cursor that the user may place anywhere in the historical text data to edit the data at that position.
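The two focuses described above can be sketched as a small document model. `TranscriptDocument` and its method names are hypothetical and serve only to show that appending transcribed text at the tail does not disturb the user's editing cursor.

```python
class TranscriptDocument:
    """Hypothetical document with a tail insertion focus and an editing cursor."""

    def __init__(self) -> None:
        self.text = ""
        self.edit_cursor = 0  # visible focus controlled by the user

    @property
    def insertion_focus(self) -> int:
        # The transcription focus always sits at the end of the text.
        return len(self.text)

    def append_transcript(self, segment: str) -> None:
        # New transcribed text goes to the tail; the edit cursor is untouched.
        self.text += segment

    def move_cursor(self, position: int) -> None:
        # Clamp the cursor to the current bounds of the text.
        self.edit_cursor = max(0, min(position, len(self.text)))

    def type_at_cursor(self, inserted: str) -> None:
        pos = self.edit_cursor
        self.text = self.text[:pos] + inserted + self.text[pos:]
        self.edit_cursor = pos + len(inserted)


doc = TranscriptDocument()
doc.append_transcript("helo world. ")
doc.move_cursor(3)
doc.type_at_cursor("l")                 # the user fixes the typo mid-document
doc.append_transcript("next sentence.")  # transcription keeps arriving
```

The correction lands where the user placed the cursor, while the later segment is still appended at the end of the text.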
Edited content may be synchronized to other user terminals in real time. Because editing and transcription use two different focuses, they do not affect each other; for example, one does not overwrite the other's content. Of course, because editing is more convenient on a computer than on a mobile phone, it is also possible to let only one terminal use the editing function during a conference, or to support editing on only one device at a time, with the other devices in a non-editable state while that device is editing. The present application is not limited in this respect.
Further referring to Fig. 3, another conference recording method for a user terminal according to an embodiment of the present application is shown. This flowchart supplements the flowchart of Fig. 2 with additional technical features and mainly shows the steps by which the user terminal itself collects audio.
As shown in Fig. 3, in step 301, audio data of a user is collected in real time in response to a recording instruction of the user;
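The single-editor variant mentioned above could be enforced with a simple lock, sketched below. `EditLock` is a hypothetical helper; the patent does not describe how the non-editable state is implemented.

```python
from typing import Optional


class EditLock:
    """Hypothetical lock allowing at most one terminal to edit at a time."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None

    def acquire(self, terminal_id: str) -> bool:
        # Succeeds if nobody is editing, or the caller already holds the lock.
        if self.holder is None or self.holder == terminal_id:
            self.holder = terminal_id
            return True
        return False

    def release(self, terminal_id: str) -> None:
        # Only the current holder can release the lock.
        if self.holder == terminal_id:
            self.holder = None

    def can_edit(self, terminal_id: str) -> bool:
        return self.holder == terminal_id


lock = EditLock()
web_got_lock = lock.acquire("web")        # Web end starts editing
app_got_lock = lock.acquire("app")        # APP end is read-only meanwhile
lock.release("web")                       # Web end finishes editing
app_got_lock_later = lock.acquire("app")  # APP end may now edit
```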
in step 302, sending the audio data to a cloud transcription service and acquiring second text data returned by the cloud transcription service, wherein the cloud transcription service is used for performing voice-to-text processing on the audio data;
in step 303, the second text data is synchronized to other user terminals via the multi-terminal collaboration service in real time.
In this embodiment, for step 301, the user terminal collects the user's audio data in real time in response to a recording instruction of the user, for example when the user speaks during a conference. For step 302, the audio data collected in real time is sent to the cloud transcription service for voice-to-text processing, and the second text data returned by the cloud transcription service is obtained; the second text data is also transcribed text data. Finally, for step 303, the second text data is synchronized to other user terminals in real time via the multi-terminal collaboration service, for example from the Web end to the APP end or from the APP end to the Web end, and the present application is not limited in this respect.
With this method, the user terminal records audio data, which is then converted into text and synchronized to other user terminals, so that the user of any user terminal can speak during a conference. This better matches how conferences actually take place and improves the user experience.
Further referring to Fig. 4, still another conference recording method for a user terminal according to an embodiment of the present application is shown. This flowchart further defines additional steps that follow step 202 of the flowchart of Fig. 2.
As shown in Fig. 4, in step 401, an audio acquisition request is sent to a big data center in response to an audio acquisition instruction of a user;
in step 402, audio data returned by the big data center is received.
In this embodiment, for step 401, when the user terminal receives an audio data acquisition instruction from a user, it sends an audio acquisition request to the big data center. Thereafter, for step 402, audio data returned by the big data center is received.
The method of the embodiment provides a way for the user to acquire the audio data, so that the user can conveniently record the conference content by using the conference recording scheme and can also acquire the original audio data to perform auxiliary confirmation or modification on the text data of the conference recording.
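Steps 401 and 402 amount to a request and a response against the stored audio. The sketch below uses an in-memory stand-in for the big data center; `BigDataCenter`, `store_chunk`, `fetch_audio`, and the meeting identifier are all hypothetical names chosen for illustration.

```python
from typing import Dict, Optional


class BigDataCenter:
    """Hypothetical in-memory stand-in for the big data center."""

    def __init__(self) -> None:
        self._recordings: Dict[str, bytearray] = {}

    def store_chunk(self, meeting_id: str, chunk: bytes) -> None:
        # Audio synchronized during the meeting is appended per meeting.
        self._recordings.setdefault(meeting_id, bytearray()).extend(chunk)

    def fetch_audio(self, meeting_id: str) -> Optional[bytes]:
        # Steps 401/402: answer an audio acquisition request.
        recording = self._recordings.get(meeting_id)
        return bytes(recording) if recording is not None else None


center = BigDataCenter()
center.store_chunk("meeting-1", b"\x01\x02")
center.store_chunk("meeting-1", b"\x03")

audio = center.fetch_audio("meeting-1")  # original audio for review
missing = center.fetch_audio("unknown")  # no recording under this identifier
```

A terminal could then play back the returned audio while checking the transcript against it.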
In some optional embodiments, after the user terminal establishes the connection with the audio acquisition device, the method further comprises sending account information of the user terminal to the audio acquisition device through the connection, so that data is transmitted among the user terminal, other user terminals, and the audio acquisition device based on the account information. The user terminal and the audio acquisition device can thus establish a connection and transmit data under the same account, which makes it easier to consolidate the data and to keep the audio data secure.
Further optionally, the connection modes include a bluetooth connection and a WiFi connection.
It should be noted that the above method steps do not limit the order of execution. In practice, some steps may be executed simultaneously or in the reverse order, and the present application is not limited in this respect.
To help those skilled in the art better understand the present scheme, the following describes some problems the inventors encountered while implementing the invention and one specific embodiment of the final solution.
The inventors found that some existing products address the above defects of the prior art in the following ways:
Some existing solutions add a summary page to the APP end. Editing a summary during recording compensates for the inability to edit during audio transcription, but the user must switch pages to edit, which makes this approach inconvenient in practice.
In other existing schemes, supporting hardware audio acquisition raises the complexity from two-end interaction to multi-end interaction, and the exception-handling flows that must be designed become more numerous and more complex. In addition, real-time editing during recording transcription creates a conflict between the editing focus and the transcribed-text insertion focus, a problem that similar products have not addressed in depth in their designs.
The design idea of the present scheme is as follows:
1. Dedicated audio acquisition hardware is used to solve the problem that the mobile phone's sound pickup is not clear enough.
2. A multi-terminal real-time collaboration service is established. The audio acquisition hardware acquires audio and uploads it to a cloud recognition service in real time; after the recognition service returns a result, the transcribed text is pushed to the APP end and the Web end in real time through the collaboration service, achieving real-time synchronization of the transcribed content.
3. Editing on a terminal during audio transcription uses two focuses, an editing focus and a transcribed-text insertion focus, which ensures that the editing focus does not affect where transcribed text is inserted.
Referring to Fig. 5, a specific embodiment of the multi-terminal collaboration mode for conference recording according to the scheme of the present application is shown. It should be noted that although some specific examples are mentioned in the following embodiment, they are not intended to limit the scheme of the present application.
Step one: The audio acquisition hardware and the mobile phone establish a binding relationship over Bluetooth, and the device is connected to the network. After the binding relationship between the mobile phone and the device is established, the APP account is synchronized to the device end, so that the three ends (audio acquisition hardware, APP, and Web) can be accurately connected through the account.
Step two: Recording is started on the device. Recording can be started in several ways: because all three ends are now connected, the user can start recording from any of the audio acquisition hardware, the APP, or the Web end, and once recording starts the other ends synchronize the recording state in real time.
Step three: The device end uploads the audio to the recognition service in real time for recognition.
Step four: The recognition service performs recognition and returns the recognition result to the audio acquisition hardware.
Step five: The audio acquisition hardware synchronizes the recognition result and the audio to the real-time collaboration service in real time, and the real-time collaboration service pushes the content to the APP end and the Web end. Real-time delivery of the transcription provides the basis for viewing and modifying it in real time during the meeting.
Step six: During transcription, the user can edit in real time on the APP end and the Web end, which notify the real-time collaboration service of the edited content in real time.
Step seven: The real-time collaboration service pushes the changed content to each end in real time.
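The behaviour in step two, where recording may be started from any connected end and the other ends follow, can be sketched as a shared state that is broadcast to all ends. `RecordingSession` and the end names are hypothetical and chosen only for this illustration.

```python
from typing import Dict, Iterable


class RecordingSession:
    """Hypothetical shared recording state across the connected ends."""

    def __init__(self, ends: Iterable[str]) -> None:
        # Every connected end starts in the "not recording" state.
        self.state: Dict[str, bool] = {end: False for end in ends}

    def start_from(self, end: str) -> None:
        if end not in self.state:
            raise ValueError(f"unknown end: {end}")
        # Whichever end starts recording, all ends show the recording state.
        for name in self.state:
            self.state[name] = True


session = RecordingSession(["hardware", "app", "web"])
before = dict(session.state)
session.start_from("web")  # the user presses "record" on the Web end
after = dict(session.state)
```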
It should be noted that, although Fig. 5 only shows recording with a dedicated audio acquisition device, those skilled in the art will understand that a terminal such as a mobile phone, a tablet, or a computer may also be used directly as the recording device. A schematic diagram of the case where the recording device is a user terminal is not provided here.
While implementing the scheme of the present application, the inventors also tried other schemes. For example, at the beginning of product design, a timed automatic saving scheme was considered. However, timed automatic saving synchronizes content only at intervals, so in multi-terminal collaboration edited content can still be improperly overwritten because of the differences in save times. That scheme was therefore abandoned.
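The overwrite problem with timed saving can be shown with a toy comparison. In the sketch below, which is an illustrative simplification and not the inventors' implementation, two terminals each make one edit. With timed whole-document saves, each terminal edits a copy taken before the other's save, so the later save discards the earlier change; when each change is applied to the latest shared text, both survive.

```python
from typing import Callable, List

Edit = Callable[[str], str]


def fix_typo(text: str) -> str:
    return text.replace("helo", "hello")


def add_title(text: str) -> str:
    return "Minutes: " + text


def timed_autosave(base: str, edits: List[Edit]) -> str:
    """Each terminal edits a stale copy; each save overwrites the shared text."""
    shared = base
    for edit in edits:
        stale_copy = base          # copy taken before the other terminal saved
        shared = edit(stale_copy)  # whole-document save, last writer wins
    return shared


def realtime_sync(base: str, edits: List[Edit]) -> str:
    """Each change is applied to the latest shared text as soon as it is made."""
    shared = base
    for edit in edits:
        shared = edit(shared)
    return shared


autosaved = timed_autosave("helo team", [fix_typo, add_title])
synced = realtime_sync("helo team", [fix_typo, add_title])
```

With timed saving the typo fix is lost when the second terminal saves its stale copy, whereas with immediate synchronization both edits are kept.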
The multi-terminal real-time collaboration solution provided by the embodiments of the application solves the problem of synchronizing content edited on multiple terminals. It also benefits product design: multi-terminal collaboration supports real-time editing during audio transcription, so that in meetings, interviews, and similar scenarios a user can revise the transcribed text while listening and quickly produce the target document.
Referring to fig. 6, a block diagram of a conference recording apparatus for an audio capture device according to an embodiment of the present invention is shown.
As shown in fig. 6, the conference recording apparatus 600 for audio capturing device includes a first connection module 610, a capturing module 620, a transcription module 630, and a real-time synchronization module 640.
The first connection module 610 is configured to establish a connection between the audio acquisition device and the user terminal; the acquisition module 620 is configured to acquire audio data in real time by an audio acquisition device; the transcription module 630 is configured to send the audio data to a cloud transcription service and obtain text data returned by the cloud transcription service, where the cloud transcription service is used to perform voice-to-text processing on the audio data; and a real-time synchronization module 640 configured to synchronize the text data to the user terminal in real time via a multi-terminal collaboration service.
Referring to fig. 7, a block diagram of a conference recording apparatus for a user terminal according to an embodiment of the present invention is shown.
As shown in fig. 7, the conference recording apparatus 700 for a user terminal includes a second connection module 710, a reception insertion module 720, and/or a change synchronization module 730.
The second connection module 710 is configured to establish a connection between the user terminal and the audio acquisition device; a receiving insertion module 720 configured to receive first text data synchronized via a multi-terminal collaboration service and insert the first text data to an end of historical text data; and/or the change synchronization module 730 is configured to respond to the editing of the historical text data by the user, and transmit the changed historical text data to the multi-terminal cooperation service in real time so as to synchronize the changed historical text data to other user terminals in real time.
It should be understood that the modules depicted in fig. 6 and 7 correspond to various steps in the method described with reference to fig. 1. Thus, the operations and features described above for the method and the corresponding technical effects are also applicable to the modules in fig. 6 and 7, and are not described again here.
It should be noted that the names of the modules in the embodiments of the present disclosure do not limit the modules themselves; for example, the acquisition module may also be described as a module by which the audio acquisition device acquires audio data in real time. In addition, the related functional modules may be implemented by a hardware processor; for example, the acquisition module may be implemented by a processor, which is not described again here.
In other embodiments, the present invention further provides a non-volatile computer storage medium storing computer-executable instructions that can execute the conference recording method for an audio acquisition device and a user terminal in any of the above method embodiments.
As one embodiment, the non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
establish a connection between the audio acquisition device and the user terminal;
cause the audio acquisition device to acquire audio data in real time;
send the audio data to a cloud transcription service and obtain the text data returned by the cloud transcription service, wherein the cloud transcription service performs speech-to-text processing on the audio data; and
synchronize the text data to the user terminal in real time via a multi-terminal collaboration service.
As another embodiment, the non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
establish a connection between the user terminal and the audio acquisition device;
receive first text data synchronized via a multi-terminal collaboration service and insert the first text data at the end of the historical text data; and/or
in response to the user editing the historical text data, transmit the changed historical text data to the multi-terminal collaboration service in real time, so that the changed historical text data is synchronized to the other user terminals in real time.
The non-volatile computer-readable storage medium may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created from use of the conference recording apparatus for the audio capture device and the user terminal, and the like. Further, the non-volatile computer-readable storage medium may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the non-volatile computer readable storage medium optionally includes memory located remotely from the processor, which may be connected over a network to the conference recording apparatus for the audio capture device and the user terminal. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Embodiments of the present invention also provide a computer program product, which includes a computer program stored on a non-volatile computer-readable storage medium, where the computer program includes program instructions, and when the program instructions are executed by a computer, the computer executes any one of the above-mentioned conference recording methods for an audio acquisition device and a user terminal.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 8, the electronic device includes one or more processors 810 and a memory 820; fig. 8 takes one processor 810 as an example. The device that carries out the conference recording method for the audio acquisition device and the user terminal may further include an input device 830 and an output device 840. The processor 810, the memory 820, the input device 830, and the output device 840 may be connected by a bus or by other means; fig. 8 takes a bus connection as an example. The memory 820 is the non-volatile computer-readable storage medium described above. By running the non-volatile software programs, instructions, and modules stored in the memory 820, the processor 810 executes the various functional applications and data processing of the server, that is, it implements the conference recording method for the audio acquisition device and the user terminal of the above method embodiments. The input device 830 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the conference recording apparatus. The output device 840 may include a display device such as a display screen.
The product can execute the method provided by the embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the method provided by the embodiment of the present invention.
As an embodiment, the electronic device is applied to a conference recording apparatus for an audio acquisition device, and includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to:
establish a connection between the audio acquisition device and the user terminal;
cause the audio acquisition device to acquire audio data in real time;
send the audio data to a cloud transcription service and obtain the text data returned by the cloud transcription service, wherein the cloud transcription service performs speech-to-text processing on the audio data; and
synchronize the text data to the user terminal in real time via a multi-terminal collaboration service.
As another embodiment, the electronic device is applied to a conference recording apparatus for a user terminal, and includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to:
establish a connection between the user terminal and the audio acquisition device;
receive first text data synchronized via a multi-terminal collaboration service and insert the first text data at the end of the historical text data; and/or
in response to the user editing the historical text data, transmit the changed historical text data to the multi-terminal collaboration service in real time, so that the changed historical text data is synchronized to the other user terminals in real time.
The electronic device of the embodiments of the present application exists in various forms, including but not limited to:
(1) Mobile communication devices: these are characterized by mobile communication capability, with voice and data communication as their primary purpose. Such terminals include smart phones (e.g., the iPhone), multimedia phones, feature phones, and low-end phones.
(2) Ultra-mobile personal computer devices: these belong to the category of personal computers, have computing and processing functions, and generally also support mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as the iPad.
(3) Portable entertainment devices: these can display and play multimedia content. They include audio and video players (e.g., the iPod), handheld game consoles, e-book readers, smart toys, and portable car navigation devices.
(4) Servers: a server's architecture is similar to that of a general-purpose computer, but because it must provide highly reliable services, it has higher requirements for processing capability, stability, reliability, security, scalability, manageability, and the like.
(5) Other electronic devices with data interaction functions.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods of the various embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A conference recording method for an audio capture device, comprising:
establishing, by the audio acquisition device, a connection with a user terminal so as to synchronize an account of the user terminal to the audio acquisition device, such that the audio acquisition device, the user terminal and a Web end are connected through the account;
acquiring, by the audio acquisition device, audio data in real time;
sending the audio data to a cloud transcription service and acquiring text data returned by the cloud transcription service, wherein the cloud transcription service is used for performing speech-to-text processing on the audio data;
synchronizing the text data in real time, via a multi-terminal collaboration service, to the user terminal connected through the account;
wherein, when editing is performed during audio transcription, the user terminal creates an edit focus and a transcribed-text insertion focus, so as to ensure that the edit focus does not affect the position at which the text data is inserted at the transcribed-text insertion focus.
2. The method of claim 1, wherein after the audio acquisition device acquires audio data in real time, the method further comprises:
synchronizing the audio data to a big data center via the cloud transcription service.
3. The method of claim 1, wherein after the audio acquisition device establishes a connection with the user terminal, the method further comprises:
acquiring account information of the user terminal through the connection, so as to perform data transmission between the audio acquisition device and the user terminal based on the account information.
4. The method of any of claims 1-3, wherein the audio acquisition device comprises a user terminal, and the audio acquisition device establishing a connection with the user terminal comprises the user terminal establishing a connection with another user terminal.
5. The method of claim 4, wherein the connection comprises a Bluetooth connection and a WiFi connection.
6. A conference recording method for a user terminal, comprising:
establishing, by the user terminal, a connection with an audio acquisition device so as to synchronize an account of the user terminal to the audio acquisition device, such that the audio acquisition device, the user terminal and a Web end are connected through the account;
receiving first text data synchronized via a multi-terminal collaboration service and inserting the first text data at the end of historical text data, wherein the first text data is audio transcription text data;
in response to the user editing the historical text data, transmitting the changed historical text data to the multi-terminal collaboration service in real time, so as to synchronize the changed historical text data to other user terminals in real time;
wherein, when editing is performed during audio transcription, the user terminal creates an edit focus and a transcribed-text insertion focus, so as to ensure that the edit focus does not affect the position at which the first text data is inserted at the transcribed-text insertion focus.
7. The method of claim 6, wherein the method further comprises:
responding to a recording instruction of a user to acquire audio data of the user in real time;
sending the audio data to a cloud transcription service and acquiring second text data returned by the cloud transcription service, wherein the cloud transcription service is used for performing voice-to-text processing on the audio data;
and synchronizing the second text data to other user terminals in real time through a multi-terminal collaboration service.
8. The method of claim 6, wherein after the receiving first text data synchronized via a multi-terminal collaboration service and inserting the first text data at the end of historical text data, the method further comprises:
responding to an audio acquisition instruction of a user and sending an audio acquisition request to a big data center;
and receiving the audio data returned by the big data center.
9. The method of any of claims 6-8, wherein after the user terminal establishes a connection with the audio acquisition device, the method further comprises:
sending account information of the user terminal to the audio acquisition device through the connection, so as to perform data transmission among the user terminal, the other user terminals and the audio acquisition device based on the account information.
10. The method of claim 9, wherein the connection comprises a Bluetooth connection and a WiFi connection.
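
The separation between an edit focus and a transcribed-text insertion focus recited in claims 1 and 6 can be illustrated with a short Python sketch. The claims do not specify how the two foci are maintained, so the offset-based bookkeeping below, and the names `TranscriptBuffer`, `type_text` and `insert_transcript`, are assumptions that show only one possible interpretation.

```python
class TranscriptBuffer:
    """Hypothetical text buffer that keeps two independent positions."""

    def __init__(self) -> None:
        self.text = ""
        self.edit_focus = 0     # where the user's own typing lands
        self.insert_focus = 0   # where the next transcribed text lands

    def move_edit_focus(self, pos: int) -> None:
        # Clamp the requested position to the bounds of the current text.
        self.edit_focus = max(0, min(pos, len(self.text)))

    def type_text(self, s: str) -> None:
        """User edit: insert at the edit focus and keep the insertion focus valid."""
        pos = self.edit_focus
        self.text = self.text[:pos] + s + self.text[pos:]
        self.edit_focus = pos + len(s)
        if pos <= self.insert_focus:
            # Text was added before the insertion point, so shift that point
            # by the same amount; it still marks the same logical location.
            self.insert_focus += len(s)

    def insert_transcript(self, s: str) -> None:
        """Transcribed text: insert at the insertion focus, not at the cursor."""
        pos = self.insert_focus
        self.text = self.text[:pos] + s + self.text[pos:]
        self.insert_focus = pos + len(s)
        if pos < self.edit_focus:
            # Keep the user's cursor on the same characters as before.
            self.edit_focus += len(s)
```

With this bookkeeping, a user who moves the cursor to the start of the text and types a correction does not cause the next transcribed sentence to appear at the cursor; the sentence is still added after the previously transcribed text.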
CN201811585400.8A 2018-12-24 2018-12-24 Conference recording method and device for audio acquisition equipment and user terminal Active CN109599115B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811585400.8A CN109599115B (en) 2018-12-24 2018-12-24 Conference recording method and device for audio acquisition equipment and user terminal

Publications (2)

Publication Number Publication Date
CN109599115A CN109599115A (en) 2019-04-09
CN109599115B true CN109599115B (en) 2022-03-22

Family

ID=65964430

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811585400.8A Active CN109599115B (en) 2018-12-24 2018-12-24 Conference recording method and device for audio acquisition equipment and user terminal

Country Status (1)

Country Link
CN (1) CN109599115B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110246501B (en) * 2019-07-02 2022-02-01 思必驰科技股份有限公司 Voice recognition method and system for conference recording
CN111177353B (en) * 2019-12-27 2023-06-09 赣州得辉达科技有限公司 Text record generation method, device, computer equipment and storage medium
CN113571061A (en) * 2020-04-28 2021-10-29 阿里巴巴集团控股有限公司 System, method, device and equipment for editing voice transcription text
CN114664306A (en) * 2020-12-22 2022-06-24 华为技术有限公司 Method, electronic equipment and system for editing text
CN112637147B (en) * 2020-12-13 2022-08-05 青岛希望鸟科技有限公司 Method, terminal and server for establishing and connecting communication service through audio

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140081635A1 (en) * 2008-02-22 2014-03-20 Apple Inc. Providing Text Input Using Speech Data and Non-Speech Data
CN105245355A (en) * 2015-10-14 2016-01-13 安徽声讯信息技术有限公司 Intelligent voice shorthand conference system
CN108074570A (en) * 2017-12-26 2018-05-25 安徽声讯信息技术有限公司 Surface trimming, transmission, the audio recognition method preserved
CN108133710A (en) * 2017-12-26 2018-06-08 安徽声讯信息技术有限公司 Long-range record refreshes and the high in the clouds data processing system of multiport synchronous vacations
CN108597518A (en) * 2018-03-21 2018-09-28 安徽咪鼠科技有限公司 A kind of minutes intelligence microphone system based on speech recognition

Also Published As

Publication number Publication date
CN109599115A (en) 2019-04-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant after: Sipic Technology Co.,Ltd.

Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant before: AI SPEECH Ltd.

GR01 Patent grant