CN113055529B - Recording control method and recording control device - Google Patents

Recording control method and recording control device

Info

Publication number
CN113055529B
CN113055529B (application CN202110333296.9A)
Authority
CN
China
Prior art keywords
information
audio
sub
speaker
receiving time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110333296.9A
Other languages
Chinese (zh)
Other versions
CN113055529A (en)
Inventor
刘朝辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Ioco Communication Software Co ltd
Original Assignee
Shenzhen Ioco Communication Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Ioco Communication Software Co ltd
Priority to CN202110333296.9A
Publication of CN113055529A
Application granted
Publication of CN113055529B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16 Sound input; Sound output
    • G06F 3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Abstract

The application discloses a recording control method and a recording control device, and belongs to the technical field of communication. The recording control method comprises the following steps: recording audio information, and acquiring speaker information and receiving time information of the audio information; dividing the audio information into a plurality of sub-audios according to the speaker information and the receiving time information; and displaying the identifiers of the plurality of sub-audios and the speaker information corresponding to each sub-audio in the order of receiving time.

Description

Recording control method and recording control device
Technical Field
The application belongs to the technical field of communication, and particularly relates to a recording control method and a recording control device.
Background
In related-art recording methods, the entire recording is stored as one continuous file. If a user needs to listen to a particular section of audio after recording, the user may have to repeatedly adjust the playback progress to locate the target audio, so the search efficiency is very low.
Disclosure of Invention
The embodiment of the application aims to provide a recording control method and a recording control device, which can solve the problem of low efficiency of searching recorded audio in the related art.
In a first aspect, an embodiment of the present application provides a recording control method, where the method includes:
recording audio information, and acquiring speaker information and receiving time information of the audio information;
dividing the audio information into a plurality of sub-audios according to the speaker information and the receiving time information;
and displaying the identifications of the plurality of sub-audios and the speaker information corresponding to each sub-audio according to the sequence of the receiving time.
In a second aspect, an embodiment of the present application provides a recording control apparatus, including:
the receiving module is used for receiving audio information;
the acquisition module is used for acquiring speaker information and receiving time information of the audio information;
the dividing module is used for dividing the audio information into a plurality of sub-audios according to the speaker information and the receiving time information;
and the display module is used for displaying the identifications of the plurality of sub-audios and the speaker information corresponding to each sub-audio according to the sequence of the receiving time.
In a third aspect, embodiments of the present application provide an electronic device, which includes a processor, a memory, and a program or instructions stored on the memory and executable on the processor, where the program or instructions, when executed by the processor, implement the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium on which a program or instructions are stored, which when executed by a processor, implement a method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the method according to the first aspect.
In the embodiments of the application, while the audio information is being recorded, the audio information is divided into a plurality of sub-audios according to the speaker information of the audio and the receiving time information of the audio; the sub-audios are displayed one by one in the order of receiving time, and the speaker information of each sub-audio is displayed alongside it. The embodiments of the application treat the recording scene as a live multi-person chat conversation: each utterance of each speaker is displayed in chronological order. This achieves segmented recording of the audio information, improves the flexibility of the recording mode, and makes it convenient for the user to later locate each utterance independently.
Drawings
Fig. 1 is a first schematic flowchart of a recording control method according to an embodiment of the present application;
Fig. 2 is a first schematic view of a recording display according to an embodiment of the present application;
Fig. 3 is a second schematic view of a recording display according to an embodiment of the present application;
Fig. 4 is a third schematic view of a recording display according to an embodiment of the present application;
Fig. 5 is a fourth schematic view of a recording display according to an embodiment of the present application;
Fig. 6 is a fifth schematic view of a recording display according to an embodiment of the present application;
Fig. 7 is a sixth schematic view of a recording display according to an embodiment of the present application;
Fig. 8 is a second schematic flowchart of a recording control method according to an embodiment of the present application;
Fig. 9 is a schematic block diagram of a recording control apparatus according to an embodiment of the present application;
Fig. 10 is a first schematic block diagram of an electronic device according to an embodiment of the present application;
Fig. 11 is a second schematic block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below clearly with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived from the embodiments in the present application by a person skilled in the art, are within the scope of protection of the present application.
The terms first, second and the like in the description and in the claims of the present application are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application are capable of operation in sequences other than those illustrated or described herein. In addition, "and/or" in the specification and claims means at least one of the connected objects, and the character "/" generally indicates that the objects before and after it are in an "or" relationship.
The recording control method, the recording control apparatus, the electronic device, and the readable storage medium provided in the embodiments of the present application are described in detail below with reference to the accompanying drawings through specific embodiments and application scenarios thereof.
An embodiment of the present application provides a recording control method. As shown in fig. 1, the method includes:
Step 102: the electronic device records audio information, and acquires speaker information and receiving time information of the audio information;
Step 104: the electronic device divides the audio information into a plurality of sub-audios according to the speaker information and the receiving time information;
Step 106: the electronic device displays the identifiers of the plurality of sub-audios and the speaker information corresponding to each sub-audio in the order of receiving time.
The above electronic device has a recording function, and includes a mobile phone, a voice recorder, a tablet computer, an in-vehicle electronic device, a wearable device, and the like.
In this embodiment, while the electronic device is recording the audio information, the audio information is divided into a plurality of sub-audios according to the speaker information of the audio and the receiving time information of the audio; the sub-audios are displayed one by one in the order of receiving time, and the speaker information of each sub-audio is displayed alongside it. The embodiment of the application treats the recording scene as a live multi-person chat conversation: each utterance of each speaker is displayed in chronological order. This achieves segmented recording of the audio information, improves the flexibility of the recording mode, and makes it convenient for the user to later locate each utterance independently.
In one embodiment of the application, a recording function, for example a conference recording function, is added to the electronic device. Under the conference recording function, speaker information is added; the speaker information includes, but is not limited to, the speaker's name, the speaker's avatar, and the speaker's voiceprint information, and a unique speaker can be associated through this information. After a conference starts, the recording function of the electronic device is started through a specific touch gesture or shortcut key of the user. Once recording starts, the speaker to which each segment of speech belongs is distinguished during recording, and each segment of speech spoken by each speaker is then displayed as an independent record. As shown in fig. 2, the recording of the entire conference resembles a multi-person conversation: each sub-audio segment (including the sub-audios of speaker A, speaker B, speaker C, speaker D, and the user) is presented to the user in the form of a chat record.
It should be noted that the scenarios of the embodiments of the present application include, but are not limited to, a conference recording scenario, a classroom recording scenario, a multi-person chat scenario, and the like. In addition, during recording, the electronic device enables functions such as speech enhancement and noise reduction, so that the clarity of the recorded speech can be improved.
Further, after the entire recording process ends, each sub-audio is stored separately, and all the sub-audios are also stored together as one complete recording file.
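As a concrete illustration of how such a segmented recording could be represented in code, the following is a minimal data-model sketch. All class and field names (SubAudio, RecordingFile, start_time, and so on) are illustrative assumptions introduced here, not terminology from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class SubAudio:
    """One segment of the recording, attributed to a single speaker."""
    speaker: str                  # speaker name or other unique speaker identifier
    start_time: float             # receiving time of the first sample, in seconds from recording start
    end_time: float               # receiving time of the last sample
    samples: bytes = b""          # raw PCM data of this segment
    text: Optional[str] = None    # optional speech-to-text result
    keywords: List[str] = field(default_factory=list)
    important: int = 0            # importance level, e.g. number of stars
    useless: bool = False         # marked as useless audio / to be deleted
    notes: List[str] = field(default_factory=list)   # user-inserted note information

@dataclass
class RecordingFile:
    """The whole recording: all sub-audios, kept in receiving-time order."""
    sub_audios: List[SubAudio] = field(default_factory=list)

    def display_order(self) -> List[SubAudio]:
        # sort by receiving time so the UI can show the segments like a chat log
        return sorted(self.sub_audios, key=lambda s: s.start_time)
```

The later sketches in this description reuse this SubAudio structure.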
Further, in an optional embodiment of the present application, dividing the audio information into a plurality of sub-audios according to the speaker information and the receiving time information includes: in the case that the speaker information does not change within a first receiving time period, dividing the audio information within the first receiving time period into one sub-audio.
In this embodiment, a way of dividing the audio information is defined. Specifically, if the speaker information remains unchanged throughout a continuous time period, this indicates that one speaker has been speaking the whole time, so the audio information in that time period is divided into one sub-audio. If the speaker information changes, the time point of the change is taken as the division point between two sub-audios. For example, as shown in fig. 2, if only the audio of speaker B is detected during a period of 1 minute and 3 seconds, a 1-minute-3-second sub-audio of speaker B is recorded; if the user's own audio is detected during the following 2 minutes and 3 seconds, a 2-minute-3-second sub-audio of the user is recorded.
In this way, the recorded audio information can be divided accurately according to whether the speaker information has switched, which improves the accuracy of obtaining the sub-audios. Moreover, because sub-audios are recorded per speaker, the user can later find the audio of different speakers without repeatedly adjusting the progress of the recording file, which makes operation more convenient.
In addition, it should be noted that, when the speaker information is the same, the audio information is still divided according to different receiving times; that is, if the same speaker speaks at different times, different sub-audios are recorded for the different times. For example, speaker A may say something at time T1, which is recorded as a separate sub-audio, and speaker A may say something else at time T2, which is recorded as a new, separate sub-audio.
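The splitting rule described above can be sketched as follows, reusing the SubAudio structure from the earlier sketch. It assumes a hypothetical upstream diarizer that labels each short audio frame with its speaker (or None during silence); this input format is an assumption for illustration, not something the patent specifies.

```python
from typing import Iterable, List, Optional, Tuple

Frame = Tuple[float, Optional[str], bytes]  # (receiving time in seconds, speaker or None, PCM chunk)

def split_by_speaker(frames: Iterable[Frame]) -> List[SubAudio]:
    """Group consecutive frames of one speaker into a single sub-audio.
    A new sub-audio starts whenever the speaker changes, or when the same
    speaker resumes speaking after silence, so later utterances of the same
    speaker become new segments."""
    segments: List[SubAudio] = []
    current: Optional[SubAudio] = None
    for t, speaker, pcm in frames:
        if speaker is None:           # silence closes the currently open segment
            current = None
            continue
        if current is None or current.speaker != speaker:
            current = SubAudio(speaker=speaker, start_time=t, end_time=t)
            segments.append(current)
        current.samples += pcm        # append this frame's audio to the open segment
        current.end_time = t
    return segments
```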
Further, in an optional embodiment of the present application, dividing the audio information into a plurality of sub-audios according to the speaker information and the receiving time information includes: in the case that the receiving time information is the same, dividing the audio information according to the different speaker information.
In this embodiment, a way of dividing the audio information is defined. Specifically, if the receiving time is the same, the audio information is divided according to the different speaker information. As shown in fig. 2, when speaker C and speaker D start speaking at the same time point, one sub-audio is recorded for speaker C and another sub-audio is recorded for speaker D; for example, the sound is separated according to the voiceprint information of the different speakers, and each separated stream is then recorded.
In this way, even if several speakers speak at the same time, each speaker can be recorded accurately, which improves the recording effect. Moreover, it becomes more convenient for the user to later search for the audio of different speakers.
Sub-audios recorded at the same time may be displayed in an order set by the user for the speakers, or arranged according to the first letters of the speakers' names; this is not limited here.
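A minimal sketch of this simultaneous-speech case is shown below, again building on the SubAudio sketch. It assumes a hypothetical separator/diarizer that reports, for each frame, the set of currently active speakers; in a real system the PCM appended to each speaker's segment would be that speaker's separated signal (e.g. obtained via voiceprint matching) rather than the mixed frame.

```python
from typing import Dict, Iterable, List, Set, Tuple

OverlapFrame = Tuple[float, Set[str], bytes]  # (receiving time, active speakers, PCM chunk)

def split_overlapping(frames: Iterable[OverlapFrame]) -> List[SubAudio]:
    """Keep one open segment per active speaker, so that when several people
    talk at the same time each voice is still recorded as its own sub-audio."""
    open_segments: Dict[str, SubAudio] = {}
    finished: List[SubAudio] = []
    for t, speakers, pcm in frames:
        # close segments of speakers who have stopped talking
        for name in list(open_segments):
            if name not in speakers:
                finished.append(open_segments.pop(name))
        # extend or open a segment for every currently active speaker
        for name in speakers:
            seg = open_segments.get(name)
            if seg is None:
                seg = SubAudio(speaker=name, start_time=t, end_time=t)
                open_segments[name] = seg
            seg.samples += pcm   # placeholder: should be the speaker-separated signal
            seg.end_time = t
    finished.extend(open_segments.values())
    return sorted(finished, key=lambda s: s.start_time)
```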
Further, in an optional embodiment of the present application, the method further comprises: a first identifier is added to a plurality of sub-audios which have the same receiving time information.
In this embodiment, a mark is added to the sub-audios that start being recorded at the same time point. For example, as shown in fig. 2, if speaker C and speaker D start speaking at the same time point, the two are distinguished by speaker: a 30-second sub-audio of speaker C and a 25-second sub-audio of speaker D are displayed, and an auxiliary mark (for example, a frame drawn around both entries) is added to the two sub-audios, i.e., the first identifier 20 is set. Through the first identifier 20, the user can be reminded that the corresponding sub-audios were spoken at the same time, which makes the recording information more detailed.
Further, in an optional embodiment of the present application, the method further comprises: identifying a sub-audio; displaying a second identifier in the case that the sub-audio is useless audio; and displaying a third identifier in the case that the sub-audio is useful audio.
In this embodiment, a sub-audio is identified to determine whether it is useless audio. When the sub-audio is useless audio, a second identifier (for example, a deletion identifier) is displayed on the sub-audio; further, when an input operation on the second identifier by the user is received, the sub-audio is deleted. For example, as shown in fig. 3, when the sub-audio is useless audio, a second identifier 30 is placed beside the sub-audio; on one hand the second identifier 30 prompts the user that the sub-audio is useless audio, and on the other hand the user can operate the second identifier 30 to delete the useless audio.
It will be appreciated that if the speech content of multiple speakers is useless audio, each useless sub-audio is marked with the second identifier. In the scenario where multiple speakers speak simultaneously and the speech content of all of them is useless, the whole group of sub-audios may be marked as useless audio; for example, as shown in fig. 3, if speaker C and speaker D start speaking at the same time point and the recorded sub-audios are useless, the second identifier 30 is applied to the group as a whole.
When the sub-audio is useful audio, a third identifier is displayed for the sub-audio. For example, as shown in fig. 3, when the sub-audio is useful audio, a third identifier 32 is placed beside the sub-audio, prompting the user that the sub-audio is useful audio.
It should be noted that useless audio refers to recorded information that contains speech content, but speech content with no meaning. Whether a sub-audio is useless can be identified according to the audio preceding or following it, according to keywords in the sub-audio, or according to whether its speech content consists only of filler words.
In addition, the shape or color of the second mark and the third mark is not limited in the application, and can be set by a user according to needs.
In this way, useless audio and useful audio can be identified and marked distinctly, so that the user can intuitively understand each sub-audio. Moreover, useless audio can be removed, which reduces the storage pressure on the system.
In some embodiments, when a sub-audio is determined to be useful audio, it may simply be left unmarked, which still distinguishes it from the useless audio.
After the entire recording process ends, only the sub-audios that are not marked for deletion are retained, and the sub-audios marked for deletion are not saved. Specifically, the audio corresponding to each retained sub-audio is cut out of the recording and saved according to the receiving timestamp of that sub-audio.
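The following sketch illustrates one way this filtering and timestamp-based cutting could look. The filler-word heuristic and the 16-bit mono PCM assumption are purely illustrative; the patent does not specify how useless audio is detected or how the samples are stored.

```python
import re
from typing import List

# illustrative filler-word list; not taken from the patent
FILLERS = {"um", "uh", "er", "hmm", "okay", "yeah"}

def is_useless(text: str) -> bool:
    """Treat a transcript as useless if it is empty or contains only filler words."""
    words = re.findall(r"[a-zA-Z']+", text.lower())
    return len(words) == 0 or all(w in FILLERS for w in words)

def keep_useful(segments: List[SubAudio], full_pcm: bytes,
                sample_rate: int, sample_width: int = 2) -> List[SubAudio]:
    """Drop segments marked useless and cut the kept ones out of the full
    recording by their receiving timestamps (16-bit mono PCM assumed)."""
    kept: List[SubAudio] = []
    for seg in segments:
        if seg.useless or (seg.text is not None and is_useless(seg.text)):
            continue
        start = int(seg.start_time * sample_rate) * sample_width
        end = int(seg.end_time * sample_rate) * sample_width
        seg.samples = full_pcm[start:end]
        kept.append(seg)
    return kept
```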
Further, in an optional embodiment of the present application, the method further comprises: and determining the display form of the third identifier according to the importance degree of the sub-audio.
In this embodiment, for a sub-audio identified as useful audio, the display form of the third identifier is determined according to the degree of importance of the sub-audio. For example, if the third identifier is a star, the more important the sub-audio, the greater the number of stars; as shown in fig. 3, the sub-audio of speaker B is marked with one star and the sub-audio of the user is marked with two stars, indicating that the user's sub-audio is more important than speaker B's sub-audio.
In this way, the user can intuitively see the importance level of each sub-audio, and the flexibility of marking sub-audios is improved.
Further, in an optional embodiment of the present application, the method further comprises: converting the sub-audio into text information; displaying the text information under the condition that the number of characters of the text information is less than or equal to a first number threshold; and displaying the keywords of the text information under the condition that the number of the characters of the text information is greater than a first number threshold value.
In this embodiment, the sub-audio may be converted into corresponding text information. For text information with short content, all of the converted text may be displayed directly; as shown in fig. 4, the sub-audio of speaker A is converted into the text "kayi", which can be displayed directly because the content is short. For text information with long content, keywords may be generated from the sub-audio content, and then the keywords are displayed by default; as shown in fig. 4, because the text converted from the user's sub-audio is long, the keywords "first item group" and "next month" are displayed.
In addition, the text information converted from the sub-audio can be stored in association with the sub-audio.
In some embodiments, while the keywords of the sub-audio are displayed by default, the full converted text is displayed in a folded manner, together with a folded-display identifier 40; the user can operate the identifier 40 to expand the folded text for viewing.
According to the embodiments of the application, the sub-audio can be converted into text for the user to view. Moreover, when the user needs a written record, the text can be copied directly instead of listening to the recording again, which is convenient for the user and improves efficiency.
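A minimal sketch of the short-text/keyword display rule follows. The speech-to-text step itself is abstracted away, and the character threshold, stop-word list, and frequency-based keyword picker are all illustrative assumptions; the patent fixes neither the threshold value nor the keyword-extraction method.

```python
import re
from collections import Counter
from typing import List

FIRST_NUMBER_THRESHOLD = 20    # illustrative character threshold
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "we", "will", "in", "is"}

def extract_keywords(text: str, top_n: int = 3) -> List[str]:
    """Pick the most frequent non-stop-words as keywords (toy heuristic)."""
    words = [w for w in re.findall(r"[a-zA-Z']+", text.lower()) if w not in STOPWORDS]
    return [w for w, _ in Counter(words).most_common(top_n)]

def display_text(text: str) -> str:
    """Show the full text when it is short; otherwise show only its keywords,
    keeping the full text available behind a fold/expand control."""
    if len(text) <= FIRST_NUMBER_THRESHOLD:
        return text
    return " / ".join(extract_keywords(text))
```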
Further, in an optional embodiment of the present application, the method further comprises: and updating the character information or the keywords when the correction information input by the user is received.
In this embodiment, if the text converted from the audio contains content that needs correction, the user can input correction information to correct it, so that the converted text is more accurate.
Further, in an optional embodiment of the present application, the method further comprises: and under the condition that the text information or the keywords contain the information to be interpreted, generating a hyperlink for the information to be interpreted, wherein the hyperlink is used for pointing to the interpretation information of the information to be interpreted.
In this embodiment, after the audio is converted into text, if it contains information to be interpreted (for example, a technical term), a hyperlink is automatically generated on that information, and after clicking the hyperlink the user can view its specific interpretation. In this way, when the user encounters unfamiliar information, the interpretation can be viewed conveniently, meeting the user's needs.
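The hyperlink step could be as simple as the sketch below, which wraps known terms in HTML anchors. The glossary entries and URLs are hypothetical placeholders, not content from the patent.

```python
import html

# hypothetical glossary mapping terms to explanation pages
GLOSSARY = {
    "voiceprint": "https://example.com/glossary/voiceprint",
    "noise reduction": "https://example.com/glossary/noise-reduction",
}

def link_terms(text: str) -> str:
    """Wrap each known term in a hyperlink pointing to its interpretation."""
    out = html.escape(text)
    for term, url in GLOSSARY.items():
        out = out.replace(term, f'<a href="{url}">{term}</a>')
    return out

# e.g. link_terms("enable noise reduction before matching the voiceprint")
```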
Further, in an optional embodiment of the present application, the method further comprises: in the case that a first input of a user to the sub audio is received, tag information is added to the sub audio.
In this embodiment, if the user's thoughts about a certain sub-audio need to be recorded during the recording process, the user's tag information (i.e., note information) may be inserted directly into that sub-audio; the tag information may be text, a picture, or even speech. For example, as shown in fig. 5, an add mark "+" is displayed behind each sub-audio, and when the user needs to add tag information to a certain sub-audio, the user can click the "+" corresponding to that sub-audio to add the tag information.
In addition, if the sub-audio to which the tag information is to be added has already been converted into text information, the tag information may be added to the text information accordingly. Also, information identifying the user who added the tag information is stored in association with the tag information.
It should be noted that the first input includes, but is not limited to, a click input, a slide input, a double click input, a long press input, etc. to the sub audio or the added identifier of the sub audio. Specifically, the input mode in the embodiments of the present application is not particularly limited, and may be any realizable mode.
In this way, the user's notes can be added in a targeted manner during the recording process, which is convenient for the user.
Further, in an optional embodiment of the present application, the method further comprises: displaying the speaker information; and playing the sub-audio corresponding to the speaker information when receiving a second input of the speaker information from the user.
In this embodiment, a way of opening the sub-audio is defined. Specifically, speaker information is displayed in the interface of the electronic device, and when the user wants to listen to the sub-audio of a certain speaker, the user can select that speaker's information so that the speaker's sub-audio is played. For example, as shown in fig. 6, the speaker information involved in the recording process (e.g., speaker A, speaker B, speaker C, speaker D, and the user) is presented in the top area 60 of the interface. Viewing all speaker information is supported; if the interface cannot display all of it, the user can click the identifier "… …" to view the complete speaker information. If the user clicks the information of speaker D, the sub-audio of speaker D is played.
The second input includes, but is not limited to, a click input, a slide input, a double click input, a long press input, and the like for the speaker information. Specifically, the input mode in the embodiments of the present application is not particularly limited, and may be any realizable mode.
In this way, the flexibility of the sub-audio playback mode is improved, and it is convenient for the user to play the sub-audio.
It is understood that the speaker information corresponding to each sub-audio, for example the speaker's name, avatar, and voiceprint information, is stored at the time of recording. If a speaker does not speak during the recording process, or everything the speaker says is identified as useless audio, that speaker's information is not saved, which saves system storage resources.
Further, in an optional embodiment of the present application, the method further comprises: and in the case of receiving search information input by a user, searching for the sub-audio corresponding to the search information.
In this embodiment, because each sub-audio is presented separately and the text information converted from the audio is saved, a search function over the sub-audios is supported. Specifically, if the user only wants to listen to the speech content of a certain speaker, or to speech related to certain key content, the user inputs search information to find the corresponding sub-audio. For example, as shown in fig. 6, a search identifier 62 is displayed; when the user clicks the search identifier 62, a search box is displayed, and the user can enter search information in the search box to search the sub-audios.
In some embodiments, specific search modes include, but are not limited to: searching according to speaker information, performing accurate search according to user input voice, performing fuzzy search according to user input voice, performing search according to keywords input by a user, performing search according to a time range input by the user, and performing search according to an audio importance level.
In this way, the sub-audios can be searched according to the search information input by the user, which helps the user quickly locate the required audio.
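The sketch below shows what a combined filter over the stored segments could look like, covering search by speaker, keyword in the converted text, receiving-time range, and importance level; the fuzzy and voice-input search modes mentioned above are not shown, and the parameter names are assumptions rather than the patent's terminology.

```python
from typing import List, Optional, Tuple

def search(segments: List[SubAudio],
           speaker: Optional[str] = None,
           keyword: Optional[str] = None,
           time_range: Optional[Tuple[float, float]] = None,
           min_importance: int = 0) -> List[SubAudio]:
    """Filter sub-audios by any combination of speaker, keyword,
    receiving-time range, and importance level."""
    results: List[SubAudio] = []
    for seg in segments:
        if speaker is not None and seg.speaker != speaker:
            continue
        if keyword is not None and keyword.lower() not in (seg.text or "").lower():
            continue
        if time_range is not None and not (time_range[0] <= seg.start_time <= time_range[1]):
            continue
        if seg.important < min_importance:
            continue
        results.append(seg)
    return results
```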
Further, in an optional embodiment of the present application, the method further comprises: and under the condition that a user selects at least one sub audio from the plurality of sub audio is received, saving the at least one sub audio.
In this embodiment, the user may select one or more sub-audios to be saved, and the selected sub-audios may be displayed contiguously or non-contiguously. For example, as shown in fig. 7, after the audio clipping and saving editing function is entered, a check box 70 appears behind each sub-audio; when the user clicks the check boxes 70 corresponding to speaker B and speaker C, thereby selecting their sub-audios, the sub-audio of speaker B and the sub-audio of speaker C are saved as a new recording file.
In this way, one or more sub-audios can be saved as a new recording file according to the user's needs, which improves the flexibility of saving audio.
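As a closing sketch, the selected sub-audios could be written out as a new file along the following lines, using Python's standard wave module and assuming 16-bit mono PCM; the sample rate and file name are placeholders.

```python
import wave
from typing import List

def save_selection(segments: List[SubAudio], path: str, sample_rate: int = 16000) -> None:
    """Concatenate the user-selected sub-audios (16-bit mono PCM assumed)
    in receiving-time order and save them as one new recording file."""
    ordered = sorted(segments, key=lambda s: s.start_time)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)        # 16-bit samples
        wav.setframerate(sample_rate)
        for seg in ordered:
            wav.writeframes(seg.samples)

# e.g. save the ticked sub-audios of speaker B and speaker C:
# save_selection([segment_b, segment_c], "new_recording.wav")
```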
In one embodiment, as shown in fig. 8, the recording control method includes:
and step 802, setting a conference recording function of the electronic equipment. In the step, a conference recording function is added to the electronic equipment, and speaker information is added.
And step 804, starting a conference recording function. In the step, a conference recording function of the electronic equipment is started through a specific touch gesture and a shortcut key of a user.
Step 806, recording interactive display. In this step, the audio information is divided into a plurality of sub-audios according to the speaker information of the audio and the receiving time of the audio, and the plurality of sub-audios are displayed one by one according to the receiving time, and simultaneously, the speaker information of each sub-audio is correspondingly displayed.
And 808, presetting the recording. In this step, it specifically includes: marking useless audio or important audio, deleting useless audio, converting sub audio into character information, and the like.
Step 810, inserting note information into the sound record. In this step, if it is necessary to record the user's mind for a certain sub-audio, note information of the user may be directly inserted into the sub-audio.
Step 812: save the recording file. In this step, the useful sub-audios, the speaker information, the converted text information, and the note information are stored.
Step 814, the recording file is opened. In this step, when the user wants to view the sub-audio of a speaker, the user can select the speaker information, so as to play the sub-audio of the speaker.
Step 816, search for the content of the audio file. In this step, if the user only wants to listen to the voice content of a certain speaker or the voice content related to a certain key content, the search information is input, so as to search for the corresponding sub audio.
Step 818, the recording file is edited for the second time. In this step, one or more sub-audio files are saved as new recording files according to the needs of the user.
The embodiment of the application provides a method for processing and storing a recording file, which takes a recording scene as a multi-person chat conversation, displays the speaking content of each speaker by taking time as an axis, removes useless information, stores the text content after voice conversion, and facilitates subsequent searching and listening of target voice content and secondary file editing of a user.
It should be noted that, in the recording control method provided in the embodiments of the present application, the execution subject may be a recording control device, or a control module in the recording control device for executing the recording control method. In the embodiments of the present application, a recording control device executing the recording control method is taken as an example to describe the recording control device provided in the embodiments of the present application.
An embodiment of the present application provides a recording control apparatus, as shown in fig. 9, the recording control apparatus 900 includes:
a receiving module 902, configured to receive audio information;
an obtaining module 904, configured to obtain speaker information and receiving time information of the audio information;
a dividing module 906 configured to divide the audio information into a plurality of sub-audios according to the speaker information and the reception time information;
the display module 908 is configured to display the identifiers of the multiple sub-audio frequencies and the speaker information corresponding to each sub-audio frequency according to the receiving time sequence.
In this embodiment, while the electronic device is recording the audio information, the audio information is divided into a plurality of sub-audios according to the speaker information of the audio and the receiving time information of the audio; the sub-audios are displayed one by one in the order of receiving time, and the speaker information of each sub-audio is displayed alongside it. The embodiment of the application treats the recording scene as a live multi-person chat conversation: each utterance of each speaker is displayed in chronological order. This achieves segmented recording of the audio information, improves the flexibility of the recording mode, and makes it convenient for the user to later locate each utterance independently.
Further, in an optional embodiment of the present application, the dividing module 906 is specifically configured to divide the audio information in the first receiving time period into one sub-audio under the condition that the speaker information does not change in the first receiving time period.
Further, in an optional embodiment of the present application, the dividing module 906 is specifically configured to divide the audio information according to different speaker information under the condition that the receiving time information is the same.
Further, in an optional embodiment of the present application, the recording control apparatus 900 further includes: an adding module, configured to add a first identifier to a plurality of sub-audios with the same receiving time information.
Further, in an optional embodiment of the present application, the recording control apparatus 900 further includes: an identification module, configured to identify the sub-audio; the display module 908 is further configured to display a second identifier if the sub-audio is useless audio, and to display a third identifier if the sub-audio is useful audio.
Further, in an optional embodiment of the present application, the recording control apparatus 900 further includes: a conversion module, configured to convert the sub-audio into text information; the display module 908 is further configured to display the text information if the number of characters of the text information is less than or equal to a first number threshold, and to display the keywords of the text information if the number of characters of the text information is greater than the first number threshold.
Further, in an optional embodiment of the present application, the recording control apparatus 900 further includes a processing module, configured to generate a hyperlink for information to be interpreted when the text information or the keywords contain such information, where the hyperlink is used to point to the interpretation information of the information to be interpreted.
Further, in an optional embodiment of the present application, the adding module is further configured to add the mark information to the sub-audio in a case that the receiving module 902 receives a first input of the user to the sub-audio.
Further, in an optional embodiment of the present application, the display module 908 is further configured to display the speaker information; and the recording control apparatus 900 further includes a playing module, configured to play the sub-audio corresponding to the speaker information when the receiving module 902 receives the second input of the speaker information from the user.
The recording control apparatus 900 in the embodiments of the present application may be an apparatus, or may be a component, an integrated circuit, or a chip in a terminal. The recording control apparatus 900 may be a mobile electronic device or a non-mobile electronic device. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), and the non-mobile electronic device may be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, a self-service machine, and the like; the embodiments of the present application are not particularly limited.
The recording control apparatus 900 in the embodiments of the present application may be an apparatus having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, and the embodiments of the present application are not specifically limited.
The recording control device 900 provided in this embodiment of the application can implement each process implemented in the recording control method embodiments of fig. 1 to 8, and is not described here again to avoid repetition.
Optionally, as shown in fig. 10, an electronic device 1000 is further provided in this embodiment of the present application, and includes a processor 1002, a memory 1004, and a program or an instruction stored in the memory 1004 and executable on the processor 1002, where the program or the instruction is executed by the processor 1002 to implement each process of the foregoing recording control method embodiment, and can achieve the same technical effect, and no further description is provided here to avoid repetition.
It should be noted that the electronic devices in the embodiments of the present application include the mobile electronic devices and the non-mobile electronic devices described above.
Fig. 11 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 1100 includes, but is not limited to: radio frequency unit 1102, network module 1104, audio output unit 1106, input unit 1108, sensors 1110, display unit 1112, user input unit 1114, interface unit 1116, memory 1118, and processor 1120, among other components.
Those skilled in the art will appreciate that the electronic device 1100 may further include a power supply (e.g., a battery) for supplying power to the various components, and the power supply may be logically connected to the processor 1120 via a power management system, so that functions such as managing charging, discharging, and power consumption are implemented via the power management system. The electronic device structure shown in fig. 11 does not constitute a limitation of the electronic device; the electronic device may include more or fewer components than those shown, combine some components, or arrange the components differently, which is not repeated here.
The microphone 11084 of the input unit 1108 is configured to receive audio information; a processor 1120 for acquiring speaker information and reception time information of the audio information; a microphone 11084 for dividing the audio information into a plurality of sub-audios according to the speaker information and the reception time information; display unit 1112 is configured to display the identifiers of the multiple sub-audios and the speaker information corresponding to each sub-audio in the receiving time sequence.
In this embodiment, while the electronic device is recording the audio information, the audio information is divided into a plurality of sub-audios according to the speaker information of the audio and the receiving time information of the audio; the sub-audios are displayed one by one in the order of receiving time, and the speaker information of each sub-audio is displayed alongside it. The embodiment of the application treats the recording scene as a live multi-person chat conversation: each utterance of each speaker is displayed in chronological order. This achieves segmented recording of the audio information, improves the flexibility of the recording mode, and makes it convenient for the user to later locate each utterance independently.
Further, in an optional embodiment of the present application, the microphone 11084 is specifically configured to divide the audio information in the first receiving time period into one sub-audio when the speaker information does not change in the first receiving time period.
Further, in an optional embodiment of the present application, the microphone 11084 is specifically configured to divide the audio information according to different speaker information under the condition that the receiving time information is the same.
Further, in an alternative embodiment of the present application, the processor 1120 is further configured to add the first identifier to a plurality of sub-audios with the same receiving time information.
Further, in an optional embodiment of the present application, the processor 1120 is further configured to identify a sub-audio; the display unit 1112 is further configured to display the second identifier if the sub audio is the useless audio, and display the third identifier if the sub audio is the useful audio.
Further, in an optional embodiment of the present application, the processor 1120 is further configured to convert the sub audio into text information; the display unit 1112 is further configured to display the text information if the number of characters of the text information is less than or equal to a first number threshold, and to display a keyword of the text information if the number of characters of the text information is greater than the first number threshold.
Further, in an optional embodiment of the present application, the processor 1120 is further configured to generate a hyperlink for the information to be interpreted in a case that the text information or the keyword includes the information to be interpreted, where the hyperlink is used to point to the interpretation information of the information to be interpreted.
Further, in an alternative embodiment of the present application, the processor 1120 is further configured to add the marking information to the sub-audio if the user input unit 1114 receives a first input of the sub-audio by the user.
Further, in an optional embodiment of the present application, the display unit 1112 is further configured to display speaker information; microphone 11084 is also configured to play the sub-audio corresponding to the speaker information when user input section 1114 receives a second input of the speaker information from the user.
It should be understood that, in the embodiment of the present application, the radio frequency unit 1102 may be configured to send and receive information or send and receive signals during a call, and in particular, receive downlink data of a base station or send uplink data to the base station. The radio frequency unit 1102 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like.
The network module 1104 provides wireless broadband internet access to the user, such as helping the user send and receive e-mails, browse web pages, and access streaming media.
The audio output unit 1106 may convert audio data received by the radio frequency unit 1102 or the network module 1104 or stored in the memory 1118 into an audio signal and output as sound. Also, the audio output unit 1106 may also provide audio output related to a specific function performed by the electronic device 1100 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 1106 includes a speaker, a buzzer, a receiver, and the like.
The input unit 1108 is used to receive audio or video signals. The input unit 1108 may include a graphics processing unit (GPU) 11082 and a microphone 11084; the graphics processor 11082 processes image data of still pictures or video obtained by an image capturing apparatus (e.g., a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 1112, stored in the memory 1118 (or other storage medium), or transmitted via the radio frequency unit 1102 or the network module 1104. The microphone 11084 can receive sound and process it into audio data, and in a phone call mode the processed audio data can be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 1102.
The electronic device 1100 also includes at least one sensor 1110, such as a fingerprint sensor, pressure sensor, iris sensor, molecular sensor, gyroscope, barometer, hygrometer, thermometer, infrared sensor, light sensor, motion sensor, and others.
The display unit 1112 is used to display information input by the user or information provided to the user. The display unit 1112 may include a display panel 11122, and the display panel 11122 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like.
The user input unit 1114 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device. Specifically, the user input unit 1114 includes a touch panel 11142 and other input devices 11144. Touch panel 11142, also referred to as a touch screen, can collect touch operations by a user on or near it. The touch panel 11142 may include two parts of a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 1120, receives a command from the processor 1120, and executes the command. Other input devices 11144 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not described in detail herein.
Further, touch panel 11142 can be overlaid on display panel 11122, and after touch panel 11142 detects a touch operation on or near touch panel 11142, the touch event can be transmitted to processor 1120 to determine the type of touch event, and then processor 1120 can provide corresponding visual output on display panel 11122 according to the type of touch event. The touch panel 11142 and the display panel 11122 may be provided as two separate components or may be integrated into one component.
The interface unit 1116 is an interface for connecting an external device to the electronic apparatus 1100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 1116 may be used to receive input (e.g., data information, power, etc.) from an external device and transmit the received input to one or more elements within the electronic apparatus 1100 or may be used to transmit data between the electronic apparatus 1100 and an external device.
The memory 1118 may be used to store software programs as well as various data. The memory 1118 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data created according to the use of the mobile terminal (such as audio data and a phonebook), and the like. Additionally, the memory 1118 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The processor 1120 performs various functions of the electronic device 1100 and processes data by running or executing software programs and/or modules stored in the memory 1118 and invoking data stored in the memory 1118 to thereby perform overall monitoring of the electronic device 1100. Processor 1120 may include one or more processing units; preferably, the processor 1120 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications.
The embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or the instruction is executed by a processor, the program or the instruction implements each process of the recording control method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.
The processor is the processor in the electronic device in the above embodiment. The readable storage medium includes a computer-readable storage medium, such as a computer Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to execute a program or an instruction to implement each process of the above-mentioned recording control method embodiment, and can achieve the same technical effect, and in order to avoid repetition, the description is omitted here.
It should be understood that the chips mentioned in the embodiments of the present application may also be referred to as a system-on-chip, or a system-on-chip.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element. Further, it should be noted that the scope of the methods and apparatus of the embodiments of the present application is not limited to performing the functions in the order illustrated or discussed, but may include performing the functions in a substantially simultaneous manner or in a reverse order based on the functions involved, e.g., the methods described may be performed in an order different than that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application or portions thereof that contribute to the prior art may be embodied in the form of a computer software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to execute the method according to the embodiments of the present application.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the present embodiments are not limited to those precise embodiments, which are intended to be illustrative rather than restrictive, and that various changes and modifications may be effected therein by one skilled in the art without departing from the scope of the appended claims.

Claims (10)

1. A recording control method, comprising:
recording audio information, and acquiring speaker information and receiving time information of the audio information, wherein the speaker information is used for determining whether the speaker has switched;
in the process of recording the audio information, dividing the audio information into a plurality of sub-audios according to the speaker information and the receiving time information so as to record the words spoken by each speaker as the sub-audios;
and respectively displaying the identifiers of the plurality of sub-audios and the speaker information corresponding to each sub-audio in a chat record display mode according to the sequence of the receiving time.
2. The recording control method of claim 1, wherein the dividing the audio information into a plurality of sub-audios according to the speaker information and the receiving time information comprises:
and under the condition that the speaker information is not changed in a first receiving time period, dividing the audio information in the first receiving time period into one sub audio.
3. The recording control method of claim 1, wherein the dividing the audio information into a plurality of sub-audios according to the speaker information and the receiving time information comprises:
and under the condition that the receiving time information is the same, dividing the audio information according to different speaker information.
4. The recording control method of claim 3, further comprising:
and adding a first identifier to a plurality of sub-audios with the same receiving time information.
5. The recording control method according to any one of claims 1 to 4, characterized by further comprising:
identifying the sub-audio;
displaying a second identifier under the condition that the sub-audio is useless audio;
and displaying a third identifier under the condition that the sub-audio is useful audio.
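Claim 5 does not specify how useless audio is detected, so the sketch below assumes a simple duration-and-energy heuristic purely for illustration; the text markers stand in for the second and third identifiers and are not taken from the patent.

```python
USELESS_MARK = "[x] filler"   # stands in for the second identifier
USEFUL_MARK = "[v] content"   # stands in for the third identifier

def identify_sub_audio(duration_s: float, mean_energy: float,
                       min_duration_s: float = 1.0, min_energy: float = 0.01) -> str:
    """Treat very short or near-silent sub-audio as useless, everything else as useful."""
    if duration_s < min_duration_s or mean_energy < min_energy:
        return USELESS_MARK
    return USEFUL_MARK
```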
6. The recording control method according to any one of claims 1 to 4, characterized by further comprising:
converting the sub-audio into text information;
displaying the text information under the condition that the number of characters of the text information is less than or equal to a first number threshold;
and displaying keywords of the text information under the condition that the number of characters of the text information is greater than the first number threshold.
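Claim 6 switches between showing the full transcription and showing keywords once a character threshold is crossed. In the sketch below, the threshold value, the naive longest-word keyword picker, and the assumption that the sub-audio has already been transcribed are all illustrative stand-ins rather than details fixed by the claims.

```python
FIRST_NUMBER_THRESHOLD = 20  # illustrative character limit; the claims do not fix a value

def naive_keywords(text: str, top_n: int = 3) -> list:
    """Stand-in keyword extractor: pick the longest distinct words."""
    return sorted(set(text.split()), key=len, reverse=True)[:top_n]

def text_to_display(text: str, threshold: int = FIRST_NUMBER_THRESHOLD) -> str:
    """Show short transcriptions in full and collapse long ones to keywords (claim 6)."""
    if len(text) <= threshold:
        return text
    return " / ".join(naive_keywords(text))
```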
7. The recording control method of claim 6, further comprising:
and under the condition that the text information or the keywords contain information to be explained, generating a hyperlink for the information to be explained, wherein the hyperlink is used for pointing to the explanation information of the information to be explained.
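Claim 7 can be pictured as a lookup from terms that need explanation to their explanation information. In the sketch below, the glossary entry, the example URL, and the HTML-style rendering of the hyperlink are assumptions made only for illustration.

```python
GLOSSARY = {
    "codec": "https://example.com/explain/codec",  # illustrative entry only
}

def add_hyperlinks(text: str, glossary: dict = GLOSSARY) -> str:
    """Wrap each piece of information to be explained in a hyperlink pointing
    to its explanation information (claim 7)."""
    for term, target in glossary.items():
        if term in text:
            text = text.replace(term, f'<a href="{target}">{term}</a>')
    return text
```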
8. The recording control method according to any one of claims 1 to 4, further comprising:
and adding tagging information to the sub-audio under the condition that a first input of a user to the sub-audio is received.
9. The recording control method according to any one of claims 1 to 4, further comprising:
displaying the speaker information;
and playing the sub-audio corresponding to the speaker information under the condition that a second input of the user to the speaker information is received.
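Claims 8 and 9 describe two user inputs: one that tags a sub-audio and one that selects a speaker's sub-audio for playback. The callback-style sketch below is illustrative only; the tag store, the dictionary shape of a sub-audio, and the omission of actual audio playback are all assumptions.

```python
tags = {}  # sub_audio_id -> tagging information

def on_first_input(sub_audio_id: str, tag: str) -> None:
    """First input: add tagging information to the selected sub-audio (claim 8)."""
    tags[sub_audio_id] = tag

def on_second_input(speaker_id: str, sub_audios: list) -> list:
    """Second input: collect the sub-audios of the chosen speaker for playback (claim 9)."""
    return [sub for sub in sub_audios if sub["speaker_id"] == speaker_id]
```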
10. A recording control apparatus, comprising:
the receiving module is used for receiving audio information;
the acquisition module is used for acquiring speaker information and receiving time information of the audio information, wherein the speaker information is used for determining whether the speaker has switched;
the dividing module is used for dividing the audio information into a plurality of sub-audios according to the speaker information and the receiving time information in the process of recording the audio information, so that the speech of each speaker is recorded as a sub-audio;
and the display module is used for displaying, in the manner of a chat record and in the order of the receiving time, the identifier of each sub-audio together with the speaker information corresponding to that sub-audio.
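The apparatus of claim 10 maps onto four small components. The class below is a structural sketch only, with print standing in for the chat-record display and dictionaries standing in for audio frames; none of the names are taken from the patent.

```python
class RecordingControlApparatus:
    """Structural sketch of claim 10: receiving, acquisition, dividing, and display modules."""

    def __init__(self):
        self.frames = []  # buffer filled by the receiving module

    def receive(self, frame: dict) -> None:
        """Receiving module: store one frame of audio information."""
        self.frames.append(frame)

    def acquire_info(self, frame: dict):
        """Acquisition module: read speaker information and receiving time from a frame."""
        return frame["speaker_id"], frame["receive_time"]

    def divide(self) -> list:
        """Dividing module: one sub-audio per uninterrupted stretch of the same speaker."""
        sub_audios, current = [], None
        for frame in sorted(self.frames, key=lambda f: f["receive_time"]):
            speaker_id, _ = self.acquire_info(frame)
            if current is None or current["speaker_id"] != speaker_id:
                current = {"speaker_id": speaker_id, "frames": []}
                sub_audios.append(current)
            current["frames"].append(frame)
        return sub_audios

    def display(self, sub_audios: list) -> None:
        """Display module: list sub-audio identifiers and speakers in receiving-time order."""
        for index, sub in enumerate(sub_audios, 1):
            print(f"[{index}] speaker {sub['speaker_id']}: {len(sub['frames'])} frame(s)")
```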
Application CN202110333296.9A (priority date 2021-03-29, filing date 2021-03-29): Recording control method and recording control device. Status: Active. Granted publication: CN113055529B (en).

Priority Applications (1)

Application Number: CN202110333296.9A (published as CN113055529B); Priority Date: 2021-03-29; Filing Date: 2021-03-29; Title: Recording control method and recording control device

Applications Claiming Priority (1)

Application Number: CN202110333296.9A (published as CN113055529B); Priority Date: 2021-03-29; Filing Date: 2021-03-29; Title: Recording control method and recording control device

Publications (2)

Publication Number Publication Date
CN113055529A (en) 2021-06-29
CN113055529B (en) 2022-12-13

Family

ID=76515980

Family Applications (1)

Application Number: CN202110333296.9A (Active; published as CN113055529B); Title: Recording control method and recording control device; Priority Date: 2021-03-29; Filing Date: 2021-03-29

Country Status (1)

Country Link
CN (1) CN113055529B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113655985A (en) * 2021-08-09 2021-11-16 维沃移动通信有限公司 Audio recording method and device, electronic equipment and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109741754A (en) * 2018-12-10 2019-05-10 上海思创华信信息技术有限公司 A kind of conference voice recognition methods and system, storage medium and terminal
CN110322869A (en) * 2019-05-21 2019-10-11 平安科技(深圳)有限公司 Meeting subangle color phoneme synthesizing method, device, computer equipment and storage medium
CN110335612A (en) * 2019-07-11 2019-10-15 招商局金融科技有限公司 Minutes generation method, device and storage medium based on speech recognition
CN110648665A (en) * 2019-09-09 2020-01-03 北京左医科技有限公司 Session process recording system and method
CN111913627A (en) * 2020-06-22 2020-11-10 维沃移动通信有限公司 Recording file display method and device and electronic equipment

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104952451B (en) * 2015-06-08 2019-05-14 Oppo广东移动通信有限公司 A kind of recording processing method and processing unit of recording
CN107025913A (en) * 2016-02-02 2017-08-08 西安中兴新软件有限责任公司 A kind of way of recording and terminal
CN106024009B (en) * 2016-04-29 2021-03-30 北京小米移动软件有限公司 Audio processing method and device
CN108074574A (en) * 2017-11-29 2018-05-25 维沃移动通信有限公司 Audio-frequency processing method, device and mobile terminal
CN110661923A (en) * 2018-06-28 2020-01-07 视联动力信息技术股份有限公司 Method and device for recording speech information in conference
CN112468761A (en) * 2020-10-31 2021-03-09 浙江云优家智能科技有限公司 Intelligent conference recording system

Also Published As

Publication number Publication date
CN113055529A (en) 2021-06-29

Similar Documents

Publication Publication Date Title
CN110868639B (en) Video synthesis method and device
CN109309751B (en) Voice recording method, electronic device and storage medium
CN111010610B (en) Video screenshot method and electronic equipment
CN109561211B (en) Information display method and mobile terminal
US20170373994A1 (en) Method and terminal for displaying instant messaging message
CN110830362B (en) Content generation method and mobile terminal
CN109165292A (en) Data processing method, device and mobile terminal
CN111445927B (en) Audio processing method and electronic equipment
CN110909524A (en) Editing method and electronic equipment
CN110750368A (en) Copying and pasting method and terminal
CN112287162A (en) Message searching method and device and electronic equipment
CN111752448A (en) Information display method and device and electronic equipment
CN108595107B (en) Interface content processing method and mobile terminal
CN108710521B (en) Note generation method and terminal equipment
CN108763475B (en) Recording method, recording device and terminal equipment
CN109669710B (en) Note processing method and terminal
CN110989847A (en) Information recommendation method and device, terminal equipment and storage medium
CN113055529B (en) Recording control method and recording control device
CN113241097A (en) Recording method, recording device, electronic equipment and readable storage medium
CN111400552B (en) Note creation method and electronic equipment
CN110750198A (en) Expression sending method and mobile terminal
CN111445929A (en) Voice information processing method and electronic equipment
CN110880330A (en) Audio conversion method and terminal equipment
CN113593614B (en) Image processing method and device
CN112383666B (en) Content sending method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant