WO2016197708A1 - Recording method and terminal - Google Patents

Recording method and terminal

Info

Publication number
WO2016197708A1
WO2016197708A1 (application PCT/CN2016/079919; CN2016079919W)
Authority
WO
WIPO (PCT)
Prior art keywords
time point
user information
audio
ith
information
Prior art date
Application number
PCT/CN2016/079919
Other languages
English (en)
French (fr)
Inventor
奚黎明
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2016197708A1

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B 20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B 20/10 Digital recording or reproducing
    • G11B 20/10527 Audio or video recording; Data buffering arrangements
    • G11B 2020/10537 Audio or video recording
    • G11B 2020/10546 Audio or video recording specifically adapted for audio data
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0487 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F 3/0488 Interaction techniques based on graphical user interfaces [GUI] using a touch-screen or digitiser, e.g. input of commands through traced gestures

Definitions

  • the present application relates to, but is not limited to, information processing technology in the field of electronic applications, and in particular, to a recording method and a terminal.
  • With the popularity of terminals, the terminal has become an indispensable portable electronic device in daily life, allowing users to record their surroundings at any time. On some occasions, the terminal also needs to capture information by recording audio, for example, for meeting minutes.
  • The terminal starts recording and collects the voice information in the scene through a microphone to obtain audio data, so that the user can reproduce the voice information of the scene by playing the audio data on the terminal at any time after the recording.
  • For example, in a conference, the user can record the conference content through the microphone; then, when the audio data is played through the terminal, the conference content can be reproduced, which facilitates keeping minutes.
  • When the user needs to find predetermined content in the recorded audio data, for example, when more than one person spoke in the conference and the user wants to find one person's speech, the user has to adjust the playing progress of the audio data and audition it. If the audio data is large, this search process is time-consuming and labor-intensive. Therefore, in the related art, the operation of finding predetermined audio content in audio data is cumbersome and unfriendly to the user, and the user experience is reduced.
  • This document proposes a recording method and terminal that can record different audio for different users, which reflects a user-friendly design and improves the intelligence of the terminal.
  • a recording method that includes:
  • the i-th mark information includes: an i-th mark time point and an i-th mark identifier, where N ≥ i ≥ 1, N ≥ 2, N is the total number of marks, and N is a positive integer;
  • the audio data between the i-th mark time point and the (i+1)-th mark time point is saved as the i-th audio file matched with the user information corresponding to the i-th mark identifier.
  • the method further includes:
  • the at least two audio files are combined into one audio file, the one audio file matching the same user information as the at least two audio files.
  • the method further includes:
  • the N pieces of audio information on the first track and the audio data on the second track are combined into one synthesized recording file.
  • the method further includes:
  • the acquiring the i-th tag information includes:
  • a terminal comprising:
  • An acquiring unit configured to acquire, in the process of recording audio data, the i-th mark information, where the i-th mark information includes: an i-th mark time point and an i-th mark identifier, where N ≥ i ≥ 1, N ≥ 2, N is the total number of marks, and N is a positive integer;
  • a determining unit configured to determine the user information corresponding to the i-th mark identifier according to the i-th mark identifier acquired by the acquiring unit and the correspondence between preset mark identifiers and visited user information;
  • a saving unit configured to save, when i ≠ N, the audio data between the i-th mark time point and the (i+1)-th mark time point acquired by the acquiring unit as the i-th audio file matched with the user information corresponding to the i-th mark identifier determined by the determining unit.
  • the terminal further includes: a detecting unit and a synthesizing unit;
  • the saving unit is further configured to save the audio data between the i-th mark time point acquired by the acquiring unit and the recording end time point as the i-th audio file matched with the user information corresponding to the i-th mark identifier determined by the determining unit;
  • the detecting unit is configured to detect whether, among the N audio files saved by the saving unit, there are at least two audio files that match the same user information;
  • the synthesizing unit is configured to synthesize the at least two audio files saved by the saving unit into one audio file if the detecting unit detects that there are at least two audio files that match the same user information, The one audio file matches the same user information as the at least two audio files.
  • the terminal further includes: a converting unit and an inserting unit;
  • the converting unit is configured to convert the user information corresponding to the ith flag identifier determined by the determining unit into the ith audio information;
  • the inserting unit is configured to insert the i-th audio information converted by the converting unit into the first track at the i-th mark time point acquired by the acquiring unit;
  • the inserting unit is further configured to insert the audio data between the i-th mark time point and the (i+1)-th mark time point onto the second track when i ≠ N;
  • the synthesizing unit is further configured to synthesize the N pieces of audio information on the first track and the audio data on the second track into a synthesized recording file.
  • the acquiring unit is further configured to acquire a preset visited user information database before acquiring the i-th tag information;
  • the determining unit is further configured to determine a correspondence between the preset tag identifier and the accessed user information according to the preset visited user information database acquired by the acquiring unit.
  • The acquiring unit acquiring the i-th mark information includes: acquiring an i-th first operation, where the first operation is used to determine the i-th mark information; acquiring an i-th mark identifier according to the i-th first operation; and acquiring the occurrence time of the i-th first operation, where the occurrence time of the i-th first operation is the i-th mark time point.
  • A computer-readable storage medium storing computer-executable instructions, where the recording method is implemented when the computer-executable instructions are executed by a processor.
  • An embodiment of the present invention provides a recording method and a terminal.
  • The method includes: acquiring, in a process of recording audio data, i-th mark information, where the i-th mark information includes: an i-th mark time point and an i-th mark identifier, where N ≥ i ≥ 1, N ≥ 2, N is the total number of marks, and N is a positive integer; determining the user information corresponding to the i-th mark identifier according to the i-th mark identifier and a correspondence between preset mark identifiers and visited user information; and, when i ≠ N, saving the audio data between the i-th mark time point and the (i+1)-th mark time point as the i-th audio file matched with the user information corresponding to the i-th mark identifier.
  • Because the information of the speaker or interviewee (the user information) is marked and saved in correspondence with that speaker's speech content during recording, the terminal can record different audio for different users, which reflects a user-friendly design and improves the intelligence of the terminal.
  • FIG. 1 is a flowchart 1 of a recording method according to an embodiment of the present invention.
  • FIG. 2 is a flowchart 2 of a recording method according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of an interface for setting a recording mark according to an embodiment of the present invention.
  • FIG. 4 is a flowchart 3 of a recording method according to an embodiment of the present invention.
  • FIG. 5 is a flowchart 4 of a recording method according to an embodiment of the present invention.
  • FIG. 6 is a flowchart 5 of a recording method according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram 1 of a terminal according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram 2 of a terminal according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram 3 of a terminal according to an embodiment of the present invention.
  • An embodiment of the present invention provides a recording method. As shown in FIG. 1 , the method may include steps S101 to S103:
  • S101. In the process of recording audio data, acquire i-th mark information, where the i-th mark information includes: an i-th mark time point and an i-th mark identifier, where N ≥ i ≥ 1, N ≥ 2, N is the total number of marks, and N is a positive integer.
  • The recording method provided by the embodiment of the present invention is suitable for scenarios in which a plurality of interviewees are recorded in one recording session, or in which a plurality of people speak in a conference, that is, scenarios such as conference recording or multi-person interviews.
  • steps S101 to S103 of this embodiment may be, but are not limited to, being performed by a terminal.
  • the terminal in the embodiment of the present invention is an electronic device having a recording function, such as a voice recorder, a smart phone, a tablet computer, or the like.
  • The terminal in the embodiment of the present invention can receive the mark information through a touch screen, through a mark setting interface, or through a sensor that can sense a touch operation, which is not limited in the embodiment of the present invention.
  • The index i in the embodiment of the present invention increases sequentially: the mark information acquired the first time is the first mark information, the mark information acquired the second time is the second mark information, and so on.
  • The N in the embodiment of the present invention is at least two, and the specific value of N is determined by how many times users actually speak.
  • the process for the terminal to obtain the i-th identification information in the embodiment of the present invention includes steps S1011 to S1013:
  • S1011. Acquire an i-th first operation, where the first operation is used to determine the i-th mark information.
  • the user can input the mark identifier on the touch screen, the sensing area or the mark setting interface of the terminal, that is, the terminal acquires the ith first operation for determining the ith mark information.
  • The first operation may be a gesture or an input operation; the embodiment of the present invention does not limit the form of the first operation.
  • the user can slide the ith first gesture through the touch screen or the sensing area of the terminal, that is, the terminal acquires the ith first operation (sliding gesture operation).
  • the user can directly perform the ith input operation through the terminal having the setting interface or the setting button, so that the terminal acquires the ith first operation (input operation).
  • Because the embodiment of the present invention presupposes that audio data of a plurality of users is recorded in one recording session, multiple pieces of mark information may need to be acquired, so a plurality of first operations may occur.
  • For the case where the same user speaks in different time periods, the first operation set for each speaking period may be the same or different.
  • Each of a user's speaking periods corresponds to one first operation, so that when that user's speech is extracted, all of the content corresponding to the user can be extracted at once with a single operation.
  • Speeches by different users need to be set with different first operations.
  • After the i-th first operation is performed, the terminal obtains the operation data corresponding to the i-th first operation; this operation data is the i-th mark identifier.
  • the identifiers in the embodiments of the present invention may be in the form of a figure, a symbol, a number, or a text, and the like.
  • For example, user A slides a gesture on the touch screen of the mobile phone, and the gesture is sensed as a "Z"; this "Z" serves as the third mark identifier. User A can also enter "ZM" through an input operation on the mark setting interface of the mobile phone, and "ZM" serves as the mark identifier.
  • The first operation in the embodiment of the present invention is a specific gesture, action, or input used for marking, such as a specific letter gesture. The terminal recognizes a mark identifier only when it acquires such a specific gesture; if the terminal obtains an operation other than the set first-operation type during recording, the terminal does not create a mark. In this way, mis-marking caused by accidental operations or unintended touches during recording is avoided.
  • The terminal acquires the first operation within a preset time period starting from when the operation begins; the preset time may be, for example, 30 seconds, and the value may be customized according to the actual application scenario, which is not limited in the embodiment of the present invention.
  • Because the first operation in the embodiment of the present invention may consist of two touch actions or multiple inputs, the terminal needs to collect the whole first operation within the preset time window starting from when the operation begins. This prevents the terminal from mistaking, for example, the second letter of a two-letter identifier for the start of a new first operation.
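  • As a rough sketch of this time-window rule, the following Python groups raw touch strokes into first operations (the event representation, function name, and the 30-second window are illustrative assumptions, not part of the embodiment):

```python
PRESET_WINDOW = 30.0  # seconds; customizable per the embodiment

def collect_first_operation(events):
    """Group raw (timestamp, stroke) events into first operations.

    All strokes that begin within PRESET_WINDOW of the first stroke
    belong to the same first operation; later strokes start a new one.
    """
    operations = []
    current, window_start = [], None
    for ts, stroke in events:
        if window_start is None or ts - window_start > PRESET_WINDOW:
            if current:
                operations.append(current)
            current, window_start = [], ts
        current.append(stroke)
    if current:
        operations.append(current)
    return operations

# A two-letter identifier "Z" + "M" drawn 5 s apart forms one operation;
# a stroke 45 s after the window opened starts a new operation.
ops = collect_first_operation([(0.0, "Z"), (5.0, "M"), (45.0, "L")])
```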
  • When acquiring the i-th first operation, the terminal simultaneously acquires the occurrence time of the i-th first operation, that is, the i-th mark time point.
  • The occurrence time of the i-th first operation in the embodiment of the present invention is measured from the start of the current recording; that is, the i-th mark time point is the difference between the absolute occurrence time of the i-th first operation and the recording start time.
  • For example, user A starts recording on the mobile phone at 10:00; the phone acquires the third gesture operation at 10:30 and records the occurrence time of the third gesture as 30 minutes.
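  • The relative mark time point described above is a simple offset from the recording start; this small sketch (function name and dates are illustrative) mirrors the 10:00 / 10:30 example:

```python
from datetime import datetime

def mark_time_point(recording_start, operation_time):
    """Return the mark time point: the offset of the first operation
    from the start of the current recording, in seconds."""
    return (operation_time - recording_start).total_seconds()

# Recording starts at 10:00; the third gesture occurs at 10:30,
# so the third mark time point is 30 minutes into the recording.
t = mark_time_point(datetime(2016, 5, 1, 10, 0), datetime(2016, 5, 1, 10, 30))
```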
  • The terminal may obtain the correspondence between preset mark identifiers and visited user information before recording, so the terminal can determine the user information corresponding to the i-th mark identifier according to the i-th mark identifier and this correspondence.
  • the user information in the embodiment of the present invention may be the name of the speaker, the avatar of the user, and the like, which can characterize the identity of the speaker.
  • The correspondence between preset mark identifiers and respondent user information may be a list mapping mark identifiers to user identity information, for example, a list relationship between a mark identifier and a user's name or avatar.
  • For example, the initials of a user's name may serve as the mark identifier, with the identifier corresponding to that user's name; the mark identifier "ZM" then corresponds to Zhang Ming.
  • If the i-th mark identifier is "ZM", the mobile phone finds the entry "ZM-Zhang Ming" according to the correspondence between preset mark identifiers and respondent user information; therefore, the mobile phone determines that the user information corresponding to the i-th mark identifier is Zhang Ming.
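  • The lookup described above amounts to a simple mapping from mark identifiers to user information; a minimal sketch (the table contents are illustrative examples):

```python
# Hypothetical correspondence between preset mark identifiers and
# visited user information; the names are examples only.
TAG_TO_USER = {
    "ZM": "Zhang Ming",
    "LL": "Li Lei",
}

def user_for_tag(tag_id):
    """Determine the user information for the i-th mark identifier,
    falling back to the raw identifier when no entry exists."""
    return TAG_TO_USER.get(tag_id, tag_id)
```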
  • After determining the user information corresponding to the i-th mark identifier, when i ≠ N and the terminal acquires the (i+1)-th mark information, the terminal may save the audio data recorded between the i-th mark time point and the (i+1)-th mark time point as the i-th audio file matched with the user information corresponding to the i-th mark identifier.
  • the terminal may segment the recorded audio data between the ith mark time point and the i+1th mark time point, and save the ith audio file named by the user information corresponding to the ith mark identifier.
  • When i ≠ N, that is, when the terminal acquires the (i+1)-th mark information, the terminal first saves the audio data recorded between the i-th mark time point and the (i+1)-th mark time point, and meanwhile continues the normal recording for the user corresponding to the (i+1)-th mark identifier.
  • For example, when the second mark is obtained, the mobile phone saves the audio between the first and second mark time points as the first audio file, and then uses the second mark time point as the starting point of the next segment. The recording file is automatically named with the speaker's user information (for example, the name "Zhang Ming") according to the correspondence between preset mark identifiers and visited user information; when the user information in the embodiment of the present invention also includes the user's avatar, the first audio file is additionally displayed with the speaker's avatar.
  • Subsequent segmentation follows the same principle. If the same person (same mark identifier) speaks in different time periods, the resulting audio data can also be saved in segments in this way, automatically named, for example, Zhang Ming-1, Zhang Ming-2, and so on.
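  • The segmentation-and-naming scheme above can be sketched as follows (a hypothetical helper, assuming marks arrive as (time point, identifier) pairs; actual audio I/O is omitted):

```python
def segment_recording(marks, end_time, tag_to_user):
    """Split one recording into per-speaker segments.

    marks: list of (mark_time_point, mark_identifier), in order.
    Returns (file_name, start, stop) triples; speakers with several
    segments get -1, -2 ... suffixes, as in "Zhang Ming-1".
    """
    counts, segments = {}, []
    for i, (start, tag) in enumerate(marks):
        # Each segment runs to the next mark, or to the recording end.
        stop = marks[i + 1][0] if i + 1 < len(marks) else end_time
        user = tag_to_user.get(tag, tag)
        counts[user] = counts.get(user, 0) + 1
        segments.append((user, start, stop))
    named, seen = [], {}
    for user, start, stop in segments:
        if counts[user] > 1:  # suffix only when the speaker repeats
            seen[user] = seen.get(user, 0) + 1
            named.append((f"{user}-{seen[user]}", start, stop))
        else:
            named.append((user, start, stop))
    return named
```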
  • a recording method provided by an embodiment of the present invention further includes: S201-S204.
  • When i = N, the i-th mark information acquired by the terminal is the last mark. After the terminal acquires the recording end time point, the terminal can save the audio data between the i-th mark time point and the recording end time point (that is, the audio data of the last speaker) as the i-th audio file matched with the user information corresponding to the i-th mark identifier.
  • After the terminal finishes saving the N audio files, because the same speaker may speak several times at different time points, there may be multiple audio files corresponding to the same speaker; the terminal can therefore detect whether the user information corresponding to any of the N audio files is the same.
  • the at least two audio files are combined into one audio file, and the one audio file matches the same user information as the at least two audio files.
  • The terminal detects whether there are at least two audio files matching the same user information among the N audio files; when the terminal detects such files, it may splice and combine those audio files into one audio file.
  • For example, the terminal splices the audio files corresponding to Zhang Ming-1 and Zhang Ming-2 into one audio file, saves it, and names the combined audio file Zhang Ming. In this way, the speeches of the same speaker are gathered together, which is convenient for the user to query and organize later.
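  • The merging step can be sketched as follows (audio segments are represented as raw bytes for illustration; a real implementation would splice decoded audio streams):

```python
def merge_same_user(files):
    """Merge audio files that match the same user information.

    files: list of (user_info, audio_bytes) in saved order. Files
    sharing user_info (e.g. the segments named Zhang Ming-1 and
    Zhang Ming-2, both matching Zhang Ming) are spliced in order
    into one file keyed by the user information alone.
    """
    merged = {}
    for user, audio in files:
        merged.setdefault(user, bytearray()).extend(audio)
    return {user: bytes(data) for user, data in merged.items()}
```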
  • a recording method provided by an embodiment of the present invention further includes: S301-S302.
  • S301. Acquire a preset visited user information database.
  • Before recording, the list of speakers to be recorded (the preset visited user information database) can be obtained first, and then a mark identifier is set on the terminal for each speaker.
  • The preset visited user information database can be manually entered by the user.
  • The terminal may determine the correspondence between the preset mark identifiers and the visited user information by setting a mark identifier and then associating it with the user information of each respondent (speaker); the association may adopt any information-association scheme in the related art, which is not limited in the embodiment of the present invention.
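  • One possible association scheme, following the initials example given earlier, is sketched below (the numeric-suffix rule for colliding initials is an assumption, not stated in the embodiment):

```python
def build_tag_map(users):
    """Derive a default mark identifier for each visited user from the
    initials of the (romanized) name, e.g. "Zhang Ming" -> "ZM".
    Colliding initials get a numeric suffix so identifiers stay unique.
    """
    tag_map = {}
    for name in users:
        tag = "".join(word[0].upper() for word in name.split())
        candidate, n = tag, 1
        while candidate in tag_map:
            n += 1
            candidate = f"{tag}{n}"
        tag_map[candidate] = name
    return tag_map
```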
  • The usage scenario of the recording method provided by the embodiment of the present invention may be that the terminal records in the background with the screen off, records in the background with the screen lit, or that a terminal without a screen performs the recording.
  • The user can select different manners of inputting the mark information according to the usage scenario, so that marking is possible whether the screen of the terminal is dark or lit.
  • When the terminal has no screen or the screen is locked, the user can perform a gesture action directly on a preset sensing area; when the terminal has a screen that is lit and recording runs in the background, the user can likewise input the mark on the screen.
  • In the recording method, in the process of recording audio data, i-th mark information is acquired, where the i-th mark information includes: an i-th mark time point and an i-th mark identifier, where N ≥ i ≥ 1, N ≥ 2, N is the total number of marks, and N is a positive integer; the user information corresponding to the i-th mark identifier is determined according to the i-th mark identifier and the correspondence between preset mark identifiers and visited user information; and when i ≠ N, the audio data between the i-th mark time point and the (i+1)-th mark time point is saved as the i-th audio file matched with the user information corresponding to the i-th mark identifier.
  • Because the terminal, during recording, marks and saves the information of the speaker or interviewee (the user information) in correspondence with that speaker's speech content, the terminal can record different audio for different users, which reflects a user-friendly design and improves the intelligence of the terminal.
  • An embodiment of the present invention provides a recording method. As shown in FIG. 6, the method may include: steps S401 to S410.
  • S402. Determine, according to the preset visited user information database, a correspondence between the preset tag identifier and the accessed user information.
  • In this embodiment, the applicable scenarios, the types of terminals, the manners of receiving the mark information, the manner of obtaining the preset visited user information database, and the manner of determining the correspondence between the preset mark identifiers and the visited user information are the same as those described in the foregoing embodiment, and are not repeated here.
  • The i-th mark information includes: an i-th mark time point and an i-th mark identifier, where N ≥ i ≥ 1, N ≥ 2, N is the total number of marks, and N is a positive integer.
  • the ith in the embodiment of the present invention is sequentially implemented in sequence.
  • the identification information acquired for the first time is the first identification information
  • the identification information acquired for the second time is the second identification information. And so on.
  • the N in the embodiment of the present invention is at least 2, and the value of N can be determined according to the situation in which the actual user speaks.
  • the process for the terminal to obtain the i-th identification information in the embodiment of the present invention includes steps S1011 to S1013:
  • S1011 Acquire an ith first operation, where the first operation is used to determine an ith flag information.
  • the user can input the mark identifier on the touch screen, the sensing area or the mark setting interface of the terminal, that is, the terminal acquires the ith first operation for determining the ith mark information.
  • the first operation may be a gesture, or may be a form in which the input operation does not limit the first operation.
  • the user can slide the ith first gesture through the touch screen or the sensing area of the terminal, that is, the terminal acquires the ith first operation (sliding gesture operation).
  • the user can pass The terminal having the setting interface or the setting button directly performs the i-th input operation, so that the terminal acquires the i-th first operation (input operation).
  • the embodiment of the present invention presupposes that audio data of a plurality of users is recorded in one recording, there may be a case where a plurality of marking information needs to be acquired, so that a process of acquiring a plurality of first operations occurs.
  • the first operation set for each speaking period may be the same or different for the case where the same user speaks at different time periods.
  • each user's different speaking period corresponds to a first operation, so that when the user's speech is extracted, an entire operation content corresponding to the user can be extracted at one time by one operation.
  • different users' speeches need to set different first operations.
  • the terminal After the ith first operation is performed, the terminal obtains the i-th operation data corresponding to the ith first operation, and the operation data acquired by the terminal is the ith. Tag ID.
  • the identifiers in the embodiments of the present invention may be in the form of a figure, a symbol, a number, or a text, and the like.
  • the user A slides a line gesture on the touch screen of the mobile phone, and the line gesture is induced to be a “Z” on the touch screen of the mobile phone.
  • "Z” is identified as the third mark.
  • the user A can also obtain the "ZM" of the input operation input as the mark identifier through the input operation in the mark setting interface in the mobile phone.
  • The first operation in the embodiments of the present invention is a specific gesture, action, or input used for marking, such as a specific letter gesture; the terminal recognizes a tag identifier only from such a specific gesture. If the terminal receives an operation outside the configured first-operation types during recording, it does not treat that operation as a mark. This avoids mis-marks caused by accidental operations or unintended touches during recording.
  • The terminal may collect the first operation within a preset time period starting from when the operation begins. The preset time may be, for example, 30 seconds; the value can be customized for the actual application scenario and is not limited here.
  • The first operation in the embodiments of the present invention may consist of two touch actions or several inputs; therefore, the terminal needs to collect the first operation within the preset time.
  • Collecting input within a preset time window starting from when the first operation begins prevents the terminal from mistaking a single first operation made up of two letter gestures for two separate first operations.
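As a minimal illustration of this grouping window, the following Python sketch collects letter gestures that arrive within one preset window into a single tag identifier. The class name, the list-of-letters representation, and the 30-second default are assumptions for illustration, not part of the patent.

```python
from dataclasses import dataclass, field

PRESET_WINDOW_S = 30.0  # assumed window; the text says the value can be customized

@dataclass
class GestureGrouper:
    """Groups letter gestures arriving within one preset window into one tag identifier."""
    window_s: float = PRESET_WINDOW_S
    _letters: list = field(default_factory=list)
    _start: float = None

    def feed(self, letter, t):
        """Record a gesture letter observed at time t (seconds since recording start).
        Returns a completed tag identifier when the window has elapsed, else None."""
        done = None
        if self._start is not None and t - self._start > self.window_s:
            done = "".join(self._letters)       # previous group is complete
            self._letters, self._start = [], None
        if self._start is None:
            self._start = t                      # window opens at the first gesture
        self._letters.append(letter)
        return done

    def flush(self):
        """Finalize whatever group is still open (e.g., at end of recording)."""
        done = "".join(self._letters) if self._letters else None
        self._letters, self._start = [], None
        return done
```

With this sketch, "Z" followed 10 seconds later by "M" yields the single identifier "ZM" rather than two separate marks.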
  • When acquiring the i-th first operation, the terminal simultaneously acquires its occurrence time, that is, the i-th mark time point.
  • The occurrence time of the i-th first operation in the embodiments of the present invention is measured from the start of the current recording; that is, it is the difference between the moment the i-th first operation occurs and the recording start time.
  • For example, user A starts recording on a mobile phone at 10 o'clock.
  • The phone receives the third gesture operation at 10:30 and records the occurrence time of the third gesture as 30 minutes.
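The offset computation in this example can be sketched in a few lines of Python; the datetime values are the ones from the example, and the function name is an assumption for illustration.

```python
from datetime import datetime

def mark_time_point(recording_start: datetime, operation_time: datetime):
    """The i-th mark time point is the difference between the occurrence time
    of the i-th first operation and the start time of the current recording."""
    return operation_time - recording_start

start = datetime(2016, 1, 1, 10, 0)      # recording starts at 10:00
gesture = datetime(2016, 1, 1, 10, 30)   # third gesture received at 10:30
offset_minutes = mark_time_point(start, gesture).total_seconds() / 60
```

Here `offset_minutes` is 30.0, matching the "30 minutes" recorded by the phone in the example.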
  • Because the terminal can obtain the correspondence between preset tag identifiers and visited-user information before recording, the terminal can determine the user information corresponding to the i-th tag identifier from the i-th tag identifier and that correspondence.
  • The user information in the embodiments of the present invention may be any information that characterizes the speaker's identity, such as the speaker's name or the user's avatar.
  • The correspondence between preset tag identifiers and interviewee user information may be a list mapping tag identifiers to user identity information, for example a list mapping tag identifiers to users' names or users' avatars.
  • For example, the initials of a user's name may serve as the tag identifier, with the identifier mapped to that user's name.
  • The tag identifier "ZM" thus corresponds to Zhang Ming.
  • For example, when user A records a meeting with a mobile phone and the i-th tag identifier is "ZM", the phone looks up "ZM-Zhang Ming" in the preset correspondence between tag identifiers and interviewee user information, and so determines that the user information corresponding to the i-th tag identifier is Zhang Ming.
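A minimal sketch of such a lookup, assuming a hypothetical correspondence table; the names and avatar file names are invented for illustration, not taken from the patent.

```python
# Hypothetical correspondence between preset tag identifiers and
# visited-user information, built before recording starts.
tag_to_user = {
    "ZM": {"name": "Zhang Ming", "avatar": "zhang_ming.png"},
    "LS": {"name": "Li Si", "avatar": "li_si.png"},
}

def resolve_user(tag_id: str):
    """Return the user information for a tag identifier, or None when the
    identifier is unknown (such an operation would not be treated as a mark)."""
    return tag_to_user.get(tag_id)
```

For the example in the text, `resolve_user("ZM")` yields Zhang Ming's user information, while an unrecognized gesture resolves to nothing and is ignored.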
  • Because the audio data can be played on two channels, left and right, the terminal, when recording, may first convert the user information corresponding to the i-th tag identifier into the i-th piece of audio information (speech).
  • The conversion from user information to audio information can be implemented by any method currently in use and is not limited here.
  • For example, the terminal converts the user information "Zhang Ming", corresponding to the i-th tag identifier "ZM", into the i-th piece of audio information.
  • Because the audio data can be divided into two channels, during recording the terminal can insert the i-th piece of audio information onto the first track at the i-th mark time point.
  • For example, the terminal converts the user information "Zhang Ming", corresponding to the i-th tag identifier "ZM", into the i-th piece of audio information and inserts the speech "Zhang Ming" onto the left-channel track.
  • After the terminal determines the user information corresponding to the i-th tag identifier from the i-th tag identifier and the correspondence between preset tag identifiers and visited-user information, when i ≠ N, upon acquiring the (i+1)-th marking information the terminal may insert the audio data recorded between the i-th mark time point and the (i+1)-th mark time point onto the second track.
  • That is, the terminal may segment out the audio data recorded between the i-th mark time point and the (i+1)-th mark time point and insert it onto the second track.
  • When i ≠ N (that is, the terminal acquires the (i+1)-th tag information), recording has not yet ended; the terminal therefore saves the audio data recorded between the i-th mark time point and the next, (i+1)-th, mark time point and inserts it onto the second track.
  • Meanwhile, the terminal can continue the normal recording of the user corresponding to the (i+1)-th tag identifier.
  • For example, while user A records a meeting with a mobile phone, after the second tag identifier is acquired the phone inserts the first audio segment onto the right-channel track.
  • When i = N, after the terminal acquires the recording end time point, the terminal can insert the audio data between the i-th mark time point and the recording end time point onto the second track.
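The segmentation by mark time points described above can be sketched as follows. Representing the recording as a flat list of samples and using a `rate` parameter are simplifying assumptions; real audio I/O is omitted.

```python
def segment_by_marks(samples, marks, end_s, rate=16000):
    """Split recorded samples into per-speaker segments.
    marks: time-ordered list of (mark_time_point_s, user_name), N >= 2 entries.
    Segment i runs from mark i to mark i+1; the last segment (i = N)
    runs from the N-th mark time point to the recording end time point."""
    segments = []
    for i, (t, user) in enumerate(marks):
        t_next = marks[i + 1][0] if i + 1 < len(marks) else end_s
        segments.append((user, samples[int(t * rate):int(t_next * rate)]))
    return segments
```

Each returned `(user, samples)` pair corresponds to one audio file saved under the user information matched to that mark's tag identifier.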
  • After the terminal has inserted the N pieces of audio information onto the first track, inserted the audio data between the i-th mark time point and the (i+1)-th mark time point onto the second track, and inserted the audio data between the N-th mark time point and the recording end time point onto the second track, the terminal combines the N pieces of audio information on the first track and the audio data on the second track into one composite recording file.
  • In other words, during recording the terminal inserts the user information corresponding to each tag identifier onto the first track at each mark time point, and inserts the speakers' recorded content onto the second track.
  • The composite recording file obtained by the terminal is therefore a recording file whose two channels carry different audio data, and it is saved.
  • When playing the recording file, the terminal first determines whether a multi-channel device is to play it. If so, the left channel plays the audio information of track 1, and the right channel plays the speakers' recording normally.
  • Track 1 and track 2 are separated so that track 1 corresponds to the left channel and track 2 corresponds to the right channel.
  • For example, when the user plays the recording through earphones, at a marked time point the left channel of the earphones plays the speech corresponding to the user information "Zhang" and "Ming", while the right channel plays speaker Zhang Ming's recorded content.
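The two-track-to-stereo combination can be sketched as below, with each track modeled as a list of PCM samples; this representation and the zero-padding choice are simplifying assumptions rather than the patent's implementation.

```python
def merge_tracks(track1, track2):
    """Combine two mono tracks into one stereo stream: track 1 (the spoken
    user-information announcements) maps to the left channel, track 2 (the
    speakers' recording) to the right. The shorter track is zero-padded."""
    n = max(len(track1), len(track2))
    left = track1 + [0] * (n - len(track1))
    right = track2 + [0] * (n - len(track2))
    # Interleave into [L, R, L, R, ...] frames, the usual stereo PCM layout.
    return [s for frame in zip(left, right) for s in frame]
```

Splitting the interleaved frames back apart at playback time is what lets the left earphone carry the name announcements while the right carries the speech.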
  • In the recording method, during the recording of audio data, the i-th marking information is acquired, which includes the i-th mark time point and the i-th tag identifier, where N ≥ i ≥ 1, N ≥ 2, and N is a positive integer; the user information corresponding to the i-th tag identifier is determined according to the i-th tag identifier and the correspondence between preset tag identifiers and visited-user information; and, when i ≠ N, the audio data between the i-th mark time point and the (i+1)-th mark time point is saved as the i-th audio file, matched to the user information corresponding to the i-th tag identifier.
  • Because the terminal, during recording, saves the information of the speaker or interviewee (user information) marked in correspondence with that speaker's speech content, the terminal can record different audio for different users, which reflects a user-friendly design and improves the intelligence of the terminal.
  • An embodiment of the present invention further provides a terminal 1, which may include:
  • the obtaining unit 10, configured to acquire the i-th marking information during the recording of audio data, where the i-th marking information includes the i-th mark time point and the i-th tag identifier, where N ≥ i ≥ 1, N ≥ 2, and N is a positive integer;
  • the determining unit 11, configured to determine the user information corresponding to the i-th tag identifier according to the i-th tag identifier acquired by the obtaining unit 10 and the correspondence between preset tag identifiers and visited-user information;
  • the saving unit 12, configured to, when i ≠ N, save the audio data between the i-th mark time point and the (i+1)-th mark time point acquired by the obtaining unit 10 as the i-th audio file matched to the user information corresponding to the i-th tag identifier determined by the determining unit 11.
  • Optionally, the terminal 1 further includes a detecting unit 13 and a synthesizing unit 14.
  • The detecting unit 13 is configured to detect whether, among the N audio files saved by the saving unit 12, there are at least two audio files matched to the same user information.
  • The synthesizing unit 14 is configured to, if the detecting unit 13 detects that there are at least two audio files matched to the same user information, synthesize those at least two audio files saved by the saving unit 12 into one audio file, the one audio file being matched to the same user information as the at least two audio files.
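A sketch of this detect-and-merge step, with audio files modeled as (user_name, samples) pairs; the representation is an assumption for illustration.

```python
from collections import defaultdict

def merge_by_user(audio_files):
    """audio_files: list of (user_name, samples) in recording order.
    Files matched to the same user information are concatenated into a
    single file per user, preserving the order of the segments."""
    merged = defaultdict(list)
    for user, samples in audio_files:
        merged[user].extend(samples)
    return dict(merged)
```

This mirrors the description's example of combining Zhang Ming's separate speaking periods (e.g., "Zhang Ming-1", "Zhang Ming-2") into one audio file named after the user.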
  • Optionally, the terminal 1 further includes a converting unit 15 and an inserting unit 16.
  • The converting unit 15 is configured to, after the determining unit 11 determines the user information corresponding to the i-th tag identifier according to the i-th tag identifier and the correspondence between preset tag identifiers and user information, convert the user information corresponding to the i-th tag identifier determined by the determining unit 11 into the i-th piece of audio information.
  • The inserting unit 16 is configured to insert the i-th piece of audio information converted by the converting unit 15 onto the first track at the i-th mark time point acquired by the obtaining unit 10, and, when i ≠ N, to insert the audio data between the i-th mark time point and the (i+1)-th mark time point onto the second track.
  • The inserting unit 16 is further configured to insert the audio data between the i-th mark time point acquired by the obtaining unit 10 and the recording end time point onto the second track.
  • The synthesizing unit 14 is further configured to synthesize the N pieces of audio information on the first track and the audio data on the second track, assembled by the inserting unit 16, into one composite recording file.
  • The obtaining unit 10 is further configured to acquire a preset visited-user information database before acquiring the i-th marking information.
  • The determining unit 11 is further configured to determine the correspondence between preset tag identifiers and visited-user information according to the preset visited-user information database acquired by the obtaining unit 10.
  • The obtaining unit 10 acquiring the i-th marking information includes: acquiring the i-th first operation, where the first operation is used to determine the i-th marking information; obtaining the i-th tag identifier according to the i-th first operation; and acquiring the occurrence time of the i-th first operation, where the occurrence time of the i-th first operation is the i-th mark time point.
  • The terminal in the embodiments of the present invention is an electronic device having a recording function, such as a voice recorder, a smartphone, or a tablet computer.
  • The terminal in the embodiments of the present invention can receive marking information through a touch screen, can perform the corresponding mark settings on a setting interface, and can also receive marking information through a sensor configured to sense touch operations; this is not limited in the embodiments of the present invention.
  • In practical applications, the above obtaining unit 10, determining unit 11, detecting unit 13, synthesizing unit 14, converting unit 15, and inserting unit 16 may be implemented by a processor on the terminal 1, for example a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA).
  • The saving unit 12 may be implemented by a memory, and the memory may be connected to the processor through a system bus, where the memory is used to store executable program code; the program code includes computer operating instructions, and the memory may include high-speed RAM and may also include non-volatile memory, such as at least one disk memory.
  • During the recording of audio data, a terminal provided by an embodiment of the present invention acquires the i-th marking information, which includes the i-th mark time point and the i-th tag identifier, where N ≥ i ≥ 1, N ≥ 2, and N is a positive integer; determines the user information corresponding to the i-th tag identifier according to the i-th tag identifier and the correspondence between preset tag identifiers and visited-user information; and, when i ≠ N, saves the audio data between the i-th mark time point and the (i+1)-th mark time point as the i-th audio file matched to the user information corresponding to the i-th tag identifier.
  • Because the information of the speaker or interviewee (user information) is saved in correspondence with the content of that speaker's speech during recording, the terminal can record different audio for different users, which reflects a user-friendly design and improves the intelligence of the terminal.
  • An embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the recording method.
  • Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware. Moreover, the invention may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction device, the instruction device implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • With the solution of the embodiments of the present invention, the information of the speaker or interviewee (user information) is saved in correspondence with the content of that speaker's speech during recording, so that the terminal can record different audio for different users, which reflects a user-friendly design and improves the intelligence of the terminal.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Telephone Function (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A recording method and terminal. The method includes: during the recording of audio data, acquiring the i-th marking information, which includes the i-th mark time point and the i-th tag identifier, where N ≥ i ≥ 1, N ≥ 2, N is the total number of marks, and N is a positive integer; determining the user information corresponding to the i-th tag identifier according to the i-th tag identifier and the correspondence between preset tag identifiers and visited-user information; and, when i ≠ N, saving the audio data between the i-th mark time point and the (i+1)-th mark time point as the i-th audio file matched to the user information corresponding to the i-th tag identifier.

Description

一种录音方法及终端 技术领域
本申请涉及但不限于电子应用领域中的信息处理技术,尤其涉及一种录音方法及终端。
背景技术
随着终端的普及,终端已经成为生活中必不可少且随身携带的电子设备,可以随时对身边的事情进行记录。对于一些场合,也需要终端采用录音的方式进行信息记录,例如,会议记录等。
相关技术中,在录音过程中,终端启动录音设置,通过麦克风对场景中的语音信息进行采集,得到音频数据,以使得用户在录音之后的任意时刻,在终端上播放该音频数据时可以再现场景中的语音信息。比如,用户可以通过麦克风记录会议的会议内容,然后通过终端播放该音频数据时,可以再现会议内容,以便于记录整理。
然而,若用户需要在录音得到的音频数据中查询预定内容,比如,会议中有多个人员进行了发言,则需要调整音频数据的播放进度,对音频数据进行试听,以查找到某个人的发言内容,如果音频数据很大,这一查找过程会耗时耗力。因此,相关技术中的查找音频数据中预定的音频内容的操作比较繁琐、不够人性化,降低了用户体验感。
发明内容
以下是对本文详细描述的主题的概述。本概述并非是为了限制权利要求的保护范围。
本文提出了一种录音方法及终端,能够根据不同的用户来记录不同的音频,体现了人性化设计,提高了终端的智能化。
一种录音方法,包括:
在记录音频数据的过程中,获取第i个标记信息,所述第i个标记信息包括:第i个标记时间点和第i个标记标识,其中,N≥i≥1,N≥2,N为标记的 总个数,N为正整数;
根据所述第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息;
当i≠N时,将所述第i个标记时间点和第i+1个标记时间点之间的音频数据,保存为与所述第i个标记标识对应的用户信息匹配的第i个音频文件。
可选地,所述方法还包括:
在确定第i个标记标识对应的用户信息之后,当i=N时,获取录音结束时间点;
将所述第i个标记时间点和所述录音结束时间点之间的音频数据,保存为与所述第i个标记标识对应的用户信息匹配的第i个音频文件;
检测N个音频文件中,是否存在匹配了相同的用户信息的至少两个音频文件;
如果存在匹配了相同的用户信息的至少两个音频文件,则将所述至少两个音频文件合成为一个音频文件,所述一个音频文件匹配与所述至少两个音频文件相同的用户信息。
可选地,所述方法还包括:
将所述第i个标记标识对应的用户信息转化为第i个音频信息;
在所述第i个标记时间点时,将所述第i个音频信息插入到第一音轨上;
当i≠N时,将所述第i个标记时间点和第i+1个标记时间点之间的音频数据插入到第二音轨上;
当i=N时,将所述第i个标记时间点和所述录音结束时间点之间的音频数据插入到第二音轨上;
将所述第一音轨上的N个音频信息和所述第二音轨上的音频数据合成为一个合成录音文件。
可选地,所述方法还包括:
在获取所述第i个标记信息之前,获取预设被访用户信息库;
根据所述预设被访用户信息库,确定预设的标记标识与被访用户信息的对应关系。
可选地,所述获取第i个标记信息,包括:
获取第i个第一操作,所述第一操作用于确定所述第i个标记信息;
根据所述第i个第一操作,获取第i个标记标识;
获取所述第i个第一操作的发生时间,所述第i个第一操作的发生时间为所述第i个标记时间点。
一种终端,包括:
获取单元,设置为在记录音频数据的过程中,获取第i个标记信息,所述第i个标记信息包括:第i个标记时间点和第i个标记标识,其中,N≥i≥1,N≥2,N为标记的总个数,N为正整数;
确定单元,设置为根据所述获取单元获取的所述第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息;
保存单元,设置为当i≠N时,将所述获取单元获取的所述第i个标记时间点和第i+1个标记时间点之间的音频数据,保存为与所述确定单元确定的所述第i个标记标识对应的用户信息匹配的第i个音频文件。
可选地,所述终端还包括:检测单元和合成单元;
所述获取单元,还设置为在确定第i个标记标识对应的用户信息之后,当i=N时,获取录音结束时间点;
所述保存单元,还设置为将所述获取单元获取的所述第i个标记时间点和所述录音结束时间点之间的音频数据,保存为与所述确定单元确定的所述第i个标记标识对应的用户信息匹配的第i个音频文件;
所述检测单元,设置为检测所述保存单元保存的N个音频文件中,是否 存在匹配了相同的用户信息的至少两个音频文件;
所述合成单元,设置为如果所述检测单元检测出存在匹配了相同的用户信息的至少两个音频文件,则将所述保存单元保存的所述至少两个音频文件合成为一个音频文件,所述一个音频文件匹配与所述至少两个音频文件相同的用户信息。
可选地,所述终端还包括:转化单元和插入单元;
所述转化单元,设置为将所述确定单元确定的所述第i个标记标识对应的用户信息转化为第i个音频信息;
所述插入单元,设置为在所述获取单元获取的所述第i个标记时间点时,将所述转化单元转化的所述第i个音频信息插入到第一音轨上;
所述插入单元,还用于当i≠N时,将所述第i个标记时间点和第i+1个标记时间点之间的音频数据插入到第二音轨上;
当i=N时,将所述获取单元获取的所述第i个标记时间点和所述录音结束时间点之间的音频数据插入到第二音轨上;
所述合成单元,还设置为将所述插入单元合好的所述第一音轨上的N个音频信息和所述第二音轨上的音频数据合成为一个合成录音文件。
可选地,所述获取单元,还设置为在获取第i个标记信息之前,获取预设被访用户信息库;
所述确定单元,还设置为根据所述获取单元获取的所述预设被访用户信息库,确定预设的标记标识与被访用户信息的对应关系。
可选地,所述获取单元获取第i个标记信息包括:
获取第i个第一操作,所述第一操作用于确定所述第i个标记信息;及根据所述第i个第一操作,获取第i个标记标识;以及获取所述第i个第一操作的发生时间,所述第i个第一操作的发生时间为所述第i个标记时间点。
一种计算机可读存储介质,存储有计算机可执行指令,所述计算机可执 行指令被处理器执行时实现所述的录音方法。
本发明实施例提供了一种录音方法及终端,该方法包括:在记录音频数据的过程中,获取第i个标记信息,该第i个标记信息包括:第i个标记时间点和第i个标记标识,其中,N≥i≥1,N≥2,N为正整数;根据第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息;当i≠N时,将第i个标记时间点和第i+1个标记时间点之间的音频数据,保存为与第i个标记标识对应的用户信息匹配的第i个音频文件。采用本发明实施例的方案,通过在录音的过程中,将被发言人或被访问者的信息(用户信息)与相应的发言人的发言内容相对应标记保存起来,使得终端能够根据不同的用户来记录不同的音频,体现了人性化设计,提高了终端的智能化。
附图概述
图1为本发明实施例提供了一种录音方法的流程图一;
图2为本发明实施例提供了一种录音方法的流程图二;
图3为本发明实施例提供了一种录音标记设置的界面示意图;
图4为本发明实施例提供了一种录音方法的流程图三;
图5为本发明实施例提供了一种录音方法的流程图四;
图6为本发明实施例提供了一种录音方法的流程图五;
图7为本发明实施例提供了一种终端的结构示意图一;
图8为本发明实施例提供了一种终端的结构示意图二;
图9为本发明实施例提供了一种终端的结构示意图三。
本发明的实施方式
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述。
实施例一
本发明实施例提供了一种录音方法,如图1所示,该方法可以包括步骤S101~S103:
S101、在记录音频数据的过程中,获取第i个标记信息,该第i个标记信息包括:第i个标记时间点和第i个标记标识,其中,N≥i≥1,N≥2,N为标记的总个数,N为正整数。
需要说明的是,本发明实施例所提供的录音方法适用于在一次录音的过程中要记录多个访问者的情况,或者记录多个人在一场会议中要发言的情况,即在进行会议录音或多人访问要录音等的情况。
可选地,本实施例的步骤S101~S103可以但不限于由终端执行。本发明实施例中的终端为具有录音功能的电子设备,例如,录音笔、智能手机、平板电脑等。本发明实施例中的终端可以通过触摸屏接收标记信息,也可以在设置界面进行相应的标记设置,还可以通过设置有可以感应触摸操作的传感器或感应器接收标记信息,本发明实施例不作限制。
需要说明的是,本发明实施例中的第i个就是依次按照顺序实现的,例如第1次获取的标识信息就是第1个标识信息,第2次获取的标识信息就是第2个标识信息,依次类推。
由于本发明实施例中的录音针对两个以上的发言,因此,本发明实施例中的N至少为2,具体的N的数值是可以根据实际用户发言的情况决定的。
如图2所示,本发明实施例中终端获取第i个标识信息的过程包括步骤S1011~S1013:
S1011、获取第i个第一操作,该第一操作用于确定第i个标记信息。
终端在开始记录音频数据时,用户可以在终端的触摸屏、感应区域或者标记设置界面进行标记标识的输入,即终端获取用于确定第i个标记信息的第i个第一操作。
可选地,本发明实施例中,第一操作可以为手势,也可以为输入操作不 限制第一操作的形态。
例如,用户可以通过终端的触摸屏或感应区域滑动第i个第一手势,即终端就获取到了第i个第一操作(滑动手势操作)。或者,用户可以通过具有设置界面或设置按钮的终端直接进行第i个输入操作,使得该终端获取到第i个第一操作(输入操作)。
需要说明的是,由于本发明实施例是以在一次录音中记录多个用户的音频数据为前提,因此,会存在需要获取多个标记信息的情况,从而出现获取多个第一操作的过程,又由于一个用户可以在不同的时段发表言论,从而被终端记录,因此,针对同一个用户在不同时段发言的情况,对于每一个发言时段设置的第一操作可以都相同,也可以不同。为了避免产生混淆,并且提高操作效率,最好每个用户的不同发言时段都对应一个第一操作,以便提取该用户的发言时,通过一个操作可以一次提取与该用户对应的全部发言内容。另外,为了避免混淆,不同的用户的发言需要设置不同的第一操作。
S1012、根据第i个第一操作,获取第i个标记标识。
终端获取第i个第一操作之后,由于第i个第一操作使得终端可以获取到该第i个第一操作对应的第i个操作数据,因此,终端获取到的这些操作数据就是第i个标记标识。
可选地,本发明实施例中的标记标识可以为图形、符号、数字或文字等,本发明实施例不作限制。
例如,用户A使用手机进行录音的过程中,当第2个用户发言时,用户A在手机的触摸屏上滑动了一个折线手势,该折线手势在手机的触摸屏感应出来是一个“Z”,这里,将“Z”作为第3个标记标识。同理,如图3所示,用户A也可以在手机里的标记设置界面,通过输入操作,获取输入操作输入的“ZM”作为标记标识。
需要说明的是,本发明实施例中的第一操作是用于标记的特定的手势或 动作或输入等,例如特定的字母手势等,终端只在获取特定的手势时,才能根据该手势识别出标记标识。若是终端在录音的过程中获取到设定第一操作类型外的操作,则终端对该操作是不作标记处理的。这样,就避免了在录音过程中由于误操作或无触摸等原因造成的误标记情况的发生。
可选地,终端获取第一操作是可以在操作发生开始的预设时间内获取的,该预设时间可以为30秒等,时间数值可以根据实际应用场景进行自定义设置,本发明实施例不作限制。
需要说明的是,本发明实施例中的第一操作可以是两个触摸动作或输入次数等,因此,终端需要在一个预设时间内来获取第一操作。例如,从第一操作发生的时间开始的预设时间内,获取第一操作,这样,就避免了终端获取的第i个第一操作是两个字母手势的情况时会被终端误认为是两个第一操作的情况发生。
例如,用户A使用手机进行录音时,用户A在手机的触摸屏上进行字母手势图像操作,该手机的触摸屏从字母手势图像的边缘信息中提取关键点对手势进行识别,显示屏上则会显示出相应的字母手势图像,比如“张明”发言时,用户可以30秒内在屏幕上分别输入“Z”和“M”字母手势以表示;这里的“Z”和“M”是依次输入。可以理解的是,由于30秒的预设时间较短,即手机获取“Z”和“M”中间间隔时间较短,因此,手机可以根据字母手势图像操作,判定“ZM”为一个完整的标记标识。
S1013、获取第i个第一操作的发生时间,该第i个第一操作的发生时间为第i个标记时间点。
终端在获取第i个第一操作时,同时获取该第i个第一操作对应的发生时间,也就是第i个标记时间点。
需要说明的是,本发明实施例中的第i个第一操作的发生时间是以从这次的录音开始为起始时刻。即本发明实施例中的第i个第一操作的发生时间 为该第i个第一操作的发生时刻与起始时刻之间的差值。
例如,用户A使用手机在10点开始录音的,这个手机在10点30分的时候获取到了第3个手势操作,这时,手机就记录下该第3个手势发生的时间为30分钟。
S102、根据第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息。
终端在获取第i个标记信息之后,由于在录音之前该终端已经获取到了预设的标记标识与被访用户信息的对应关系,因此,该终端可以根据第i个标记标识和预设的标记标识与被访用户信息的对应关系来确定出第i个标识对应的用户信息。
需要说明的是,本发明实施例中的用户信息可以为发言人的姓名、用户的头像等可以表征发言人身份的信息。
需要说明的是,获取预设的标记标识与被访用户信息的对应关系的过程将在后续的实施例中进行详细地说明。
可选地,预设的标记标识与被访者用户信息的对应关系可以为标记标识与用户身份信息之间的对应列表,例如,可以为标记标识与用户的姓名或用户的头像之间的对应列表关系。
例如,预设的标记标识与被访者用户信息的对应关系可以是用户姓名的首字母为标记标识,该标记标识对应用户的姓名。例如,“ZM”样的标记标识对应的为张明。
例如,用户A使用手机记录会议录音时,获取到了第i个标记标识为“ZM”,这时手机根据预设的标记标识与被访者用户信息的对应关系,找到了“ZM-张明”,于是,该手机就确定了第i个标记标识对应的用户信息为张明。
S103、当i≠N时,将第i个标记时间点和第i+1个标记时间点之间的音 频数据,保存为与第i个标记标识对应的用户信息匹配的第i个音频文件。
终端根据第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息之后,当i≠N时,在该终端获取到第i+1个标记信息时,该终端可以将第i个标记时间点和第i+1个标记时间点之间的已经记录的音频数据,另存为与第i个标记标识对应的用户信息匹配的第i个音频文件。
终端可以将第i个标记时间点和第i+1个标记时间点之间的已经记录的音频数据分段出来,另存为采用第i个标记标识对应的用户信息命名的第i个音频文件。
需要说明的是,当i≠N时(即终端获取第i+1个标记信息),表明录音还没有结束,于是,终端将第i个标识时间点与下一次获取的第i+1个标记时间点之间记录的音频数据先另保存起来。同时,终端还在同步进行正常第i+1个标记标识对应的用户的录音工作。
例如,用户A使用手机进行会议记录的过程中,手机按照前后两次标记时间点顺序,在第2个标记标识获取完毕后,手机首先分段保存为第1个音频文件,这时第2个标记时刻点则作为下一组分段保存的起始标记点;其中,录音文件的名称根据预设的标记标识与被访用户信息的对应关系自动保存为该发言人的用户信息(姓名,例如“张明”),当本发明实施例中的用户信息还同时包括用户的头像时,该第1个音频文件也会同时以发言者的头像信息显示出来。若接下来还有多个的标记信息,分段保存原理与上述过程相同。若同一人(相同的标记信息)在不同时段均有发言的音频数据也可以依据此方式分段保存,或者自动保存并命名为张明-1、张明-2等样式。
如图4所示,在S102之后,本发明实施例提供的一种录音方法还包括:S201-S204。
S201、当i=N时,获取录音结束时间点。
终端根据第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息之后,由于此时终端获取的第i个标记信息可能是最后一个标记信息,该终端在继续进行下面的录音时不会再接收到第i+1个标记信息,因此,i=N时的情况下,终端可以获取到录音结束的时间点。
S202、将第i个标记时间点和录音结束时间点之间的音频数据,保存为与第i个标记标识对应的用户信息匹配的第i个音频文件。
终端获取录音结束时间点之后,该终端就可以将第i个标记时间点和录音结束时间点之间的音频数据(也就是最后一个发言人的音频数据),另存为与第i个标记标识对应的用户信息匹配的第i个音频文件。
S203、检测N个音频文件中,是否存在匹配了相同的用户信息的至少两个音频文件。
在终端结束了N个音频文件的保存之后,由于同一个发言人可以在不同的时间点发言了几次,因此,可能存在同一个发言人对应的多个音频文件,于是,该终端可以检测该N个或N段音频文件对应的用户信息是不是有相同的。
S204、如果存在匹配了相同的用户信息的至少两个音频文件,则将至少两个音频文件合成为一个音频文件,该一个音频文件匹配与至少两个音频文件相同的用户信息。
终端检测N个音频文件中,是否存在匹配了相同的用户信息的至少两个音频文件之后,当终端检测到存在与同一个用户信息匹配的至少两个音频文件(多个)时,该终端可以将同一个用户信息匹配的该至少两个音频文件合在一起,合成为一个音频文件。
例如,若终端另存的音频文件有张明-1、李四、张三、张明-2时,该终端就将张明-1和张明-2对应的音频文件拼接,合成为一段音频文件保存起来, 并以张明来命名该段音频文件。这样,就可以将同一个发言人的发言集中在一起了,便于用户以后的查询和整理。
如图5所示,在S101之前,本发明实施例提供的一种录音方法还包括:S301-S302。S301、获取预设被访用户信息库。
S302、根据预设被访用户信息库,确定预设的标记标识与被访用户信息的对应关系。
需要说明的是,在用户进行会议记录或录音工作开始之前,可以先获取到这次录音的发言人(预设被访用户信息库)有哪些,然后就在终端上设置每个发言人对应的标记标识。
预设被访用户信息库可以由用户手动统计记录。终端根据预设被访用户信息库,确定预设的标记标识与被访用户信息的对应关系的过程可以是终端设置标记标识的形式,然后与每个被访者(发言人)的用户信息相关联即可。实现方式可以为相关技术中的信息关联方案,本发明实施例不作限制。
需要说明的是,本发明实施例提供的一种录音方法的使用场景可以是终端黑屏进行后台录音,也可以是屏幕点亮进行后台录音,还可以是没有屏幕的终端进行录音等等。用户可以根据不同的使用场景选择不同的标记信息的输入方式,以使得用户在终端上设置的屏幕暗或亮的情况下都可以进行录音。例如,当终端没有屏幕或屏幕锁屏的情况下,终端还在录音时,用户可以在屏幕上的预设感应区直接手势操作完成标记动作,或者在终端有屏幕且屏幕亮着且在后台录音时,直接在感应区或设置界面输入标记信息。
本发明实施例所提供的一种录音方法,通过在记录音频数据的过程中,获取第i个标记信息,该第i个标记信息包括:第i个标记时间点和第i个标记标识,其中,N≥i≥1,N≥2,N为标记的总个数,N为正整数;根据第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息;当i≠N时,将第i个标记时间点和第i+1个标记时间 点之间的音频数据,保存为与第i个标记标识对应的用户信息匹配的第i个音频文件。采用上述技术实现方案,由于终端在录音的过程中,将被发言人或被访问者的信息(用户信息)与相应的发言人的发言内容相对应标记保存了,因此,终端能够根据不同的用户来记录不同的音频,体现了人性化设计,提高了终端的智能化。
实施例二
本发明实施例提供一种录音方法,如图6所示,该方法可以包括:步骤S401~S410。
S401、获取预设被访用户信息库。
S402、根据预设被访用户信息库,确定预设的标记标识与被访用户信息的对应关系。
需要说明的是,本发明实施例所提供的录音方法适用于在一次录音的过程中要记录多个访问者的情况,或者记录多个人在一场会议中要发言的情况,即在进行会议录音或多人访问要录音等的情况。
可选地,本发明实施例中的终端为具有录音功能的电子设备,例如,录音笔、智能手机、平板电脑等。本发明实施例中的终端可以通过触摸屏接收标记信息,也可以在设置界面进行相应的标记设置,还可以通过设置有可以感应触摸操作的传感器或感应器接收标记信息,本发明实施例不作限制。
需要说明的是,在用户进行会议记录或录音工作开始之前,可以先获取到这次录音的发言人(预设被访用户信息库)有哪些,然后就在终端上设置每个发言人对应的标记标识。
预设被访用户信息库可以由用户手动统计记录。终端根据预设被访用户信息库,确定预设的标记标识与被访用户信息的对应关系的过程可以是终端设置标记标识的形式,然后与每个被访者(发言人)的用户信息相关联即可。实现方式可以为相关技术中的信息关联方案,本发明实施例不作限制。
需要说明的是,本发明实施例提供的一种录音方法的使用场景可以是终端黑屏进行后台录音,也可以是屏幕点亮进行后台录音,还可以是没有屏幕的终端进行录音等等。用户可以根据不同的使用场景选择不同的标记信息的输入方式,以使得用户在终端上设置的屏幕暗或亮的情况下都可以进行录音。例如,当终端没有屏幕或屏幕锁屏的情况下,终端还在录音时,用户可以在屏幕上的预设感应区直接手势操作完成标记动作,或者在终端有屏幕且屏幕亮着且在后台录音时,直接在感应区或设置界面输入标记信息。
S403、在记录音频数据的过程中,获取第i个标记信息,该第i个标记信息包括:第i个标记时间点和第i个标记标识,其中,N≥i≥1,N≥2,N为正整数。
需要说明的是,本发明实施例中的第i个就是依次按照顺序实现的,例如第1次获取的标识信息就是第1个标识信息,第2次获取的标识信息就是第2个标识信息,依次类推。
由于本发明实施例中的录音针对两个以上的发言,因此,本发明实施例中的N至少为2,N的数值是可以根据实际用户发言的情况决定的。
如图2所示,本发明实施例中终端获取第i个标识信息的过程包括步骤S1011~S1013:
S1011、获取第i个第一操作,该第一操作用于确定第i个标记信息。
终端在开始记录音频数据时,用户可以在终端的触摸屏、感应区域或者标记设置界面进行标记标识的输入,即终端获取用于确定第i个标记信息的第i个第一操作。
可选地,本发明实施例中,第一操作可以为手势,也可以为输入操作不限制第一操作的形态。
例如,用户可以通过终端的触摸屏或感应区域滑动第i个第一手势,即终端就获取到了第i个第一操作(滑动手势操作)。或者,用户可以通过具 有设置界面或设置按钮的终端直接进行第i个输入操作,使得该终端获取到第i个第一操作(输入操作)。
需要说明的是,由于本发明实施例是以在一次录音中记录多个用户的音频数据为前提,因此,会存在需要获取多个标记信息的情况,从而出现获取多个第一操作的过程,又由于一个用户可以在不同的时段发表言论,从而被终端记录,因此,针对同一个用户在不同时段发言的情况,对于每一个发言时段设置的第一操作可以都相同,也可以不同。为了避免产生混淆,并且提高操作效率,最好每个用户的不同发言时段都对应一个第一操作,以便提取该用户的发言时,通过一个操作可以一次提取与该用户对应的全部发言内容。另外,为了避免混淆,不同的用户的发言需要设置不同的第一操作。
S1012、根据第i个第一操作,获取第i个标记标识。
终端获取第i个第一操作之后,由于第i个第一操作使得终端可以获取到该第i个第一操作对应的第i个操作数据,因此,终端获取到的这些操作数据就是第i个标记标识。
可选地,本发明实施例中的标记标识可以为图形、符号、数字或文字等,本发明实施例不作限制。
例如,用户A使用手机进行录音的过程中,当第2个用户发言时,用户A在手机的触摸屏上滑动了一个折线手势,该折线手势在手机的触摸屏感应出来是一个“Z”,这里,将“Z”作为第3个标记标识。同理,如图3所示,用户A也可以在手机里的标记设置界面,通过输入操作,获取输入操作输入的“ZM”作为标记标识。
需要说明的是,本发明实施例中的第一操作是用于标记的特定的手势或动作或输入等,例如特定的字母手势等,终端只在获取特定的手势时,才能根据该手势识别出标记标识。若是终端在录音的过程中获取到设定第一操作类型外的操作,则终端对该操作是不作标记处理的。这样,就避免了在录音 过程中由于误操作或无触摸等原因造成的误标记情况的发生。
可选地,终端获取第一操作是可以在操作发生开始的预设时间内获取的,该预设时间可以为30秒等,时间数值可以根据实际应用场景进行自定义设置,本发明实施例不作限制。
需要说明的是,本发明实施例中的第一操作可以是两个触摸动作或输入次数等,因此,终端需要在一个预设时间内来获取第一操作。例如,从第一操作发生的时间开始的预设时间内,获取第一操作,这样,就避免了终端获取的第i个第一操作是两个字母手势的情况时会被终端误认为是两个第一操作的情况发生。
例如,用户A使用手机进行录音时,用户A在手机的触摸屏上进行字母手势图像操作,该手机的触摸屏从字母手势图像的边缘信息中提取关键点对手势进行识别,显示屏上则会显示出相应的字母手势图像,比如“张明”发言时,用户可以30秒内在屏幕上分别输入“Z”和“M”字母手势以表示;这里的“Z”和“M”是依次输入。可以理解的是,由于30秒的预设时间较短,即手机获取“Z”和“M”中间间隔时间较短,因此,手机可以根据字母手势图像操作,判定“ZM”为一个完整的标记标识。
S1013、获取第i个第一操作的发生时间,该第i个第一操作的发生时间为第i个标记时间点。
终端在获取第i个第一操作时,同时获取该第i个第一操作对应的发生时间,也就是第i个标记时间点。
需要说明的是,本发明实施例中的第i个第一操作的发生时间是以从这次的录音开始为起始时刻。即本发明实施例中的第i个第一操作的发生时间为该第i个第一操作的发生时刻与起始时刻之间的差值。
例如,用户A使用手机在10点开始录音的,这个手机在10点30分的时候获取到了第3个手势操作,这时,手机就记录下该第3个手势发生的时间 为30分钟。
S404、根据第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息。
终端在获取第i个标记信息之后,由于在录音之前该终端已经获取到了预设的标记标识与被访用户信息的对应关系,因此,该终端可以根据第i个标记标识和预设的标记标识与被访用户信息的对应关系来确定出第i个标识对应的用户信息。
需要说明的是,本发明实施例中的用户信息可以为发言人的姓名、用户的头像等可以表征发言人身份的信息。
需要说明的是,获取预设的标记标识与被访用户信息的对应关系的过程将在后续的实施例中进行详细地说明。
可选地,预设的标记标识与被访者用户信息的对应关系可以为标记标识与用户身份信息之间的对应列表,例如,可以为标记标识与用户的姓名或用户的头像之间的对应列表关系。
例如,预设的标记标识与被访者用户信息的对应关系可以是用户姓名的首字母为标记标识,该标记标识对应用户的姓名。例如,“ZM”样的标记标识对应的为张明。
例如,用户A使用手机记录会议录音时,获取到了第i个标记标识为“ZM”,这时手机根据预设的标记标识与被访者用户信息的对应关系,找到了“ZM-张明”,于是,该手机就确定了第i个标记标识对应的用户信息为张明。
S405、将第i个标记标识对应的用户信息转化为第i个音频信息。
终端在根据第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息之后,由于音频数据可以分为左右两个声道播放,因此,终端在录音的时候,首先可以将第i个标记标识对应 的用户信息转化为第i个音频信息(语音)。用户信息到音频信息的转化方法可以采用目前惯用的任意一种方法来实现,在此不做限制。
例如,终端将第i个标记标识“ZM”对应的用户信息“张明”转化了第i个音频信息。
S406、在第i个标记时间点时,将第i个音频信息插入到第一音轨上。
终端将第i个标记标识对应的用户信息转化为第i个音频信息之后,由于音频数据可以分为左右两个声道播放,因此,终端在录音的时候,在第i个标记时间点时,该终端可以将第i个音频信息插入到第一音轨上。
例如,终端将第i个标记标识“ZM”对应的用户信息“张明”转化了第i个音频信息,并将语音“张明”插入到左声道的音轨上。
S407、当i≠N时,将第i个标记时间点和第i+1个标记时间点之间的音频数据插入到第二音轨上。
终端在根据第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息之后,当i≠N时,在该终端获取到第i+1个标记信息时,该终端可以将第i个标记时间点和第i+1个标记时间点之间的已经记录的音频数据插入到第二音轨上。
终端可以将第i个标记时间点和第i+1个标记时间点之间的已经记录的音频数据分段出来插入到第二音轨上。
需要说明的是,当i≠N时(即终端获取第i+1个标记信息),表明录音还没有结束,于是,终端将第i个标识时间点与下一次获取的第i+1个标记时间点之间记录的音频数据保存插入到第二音轨上。同时,终端还可以在继续进行正常第i+1个标记标识对应的用户的录音工作。
例如,用户A使用手机进行会议记录的过程中,手机按照前后两次标记时间点顺序,在第2个标记标识获取完毕后,手机将第1个音频文件插入到右声道所在的音轨上。
S408、当i=N时,获取录音结束时间点。
终端在根据第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息之后,由于此时终端获取的第i个标记信息可能是最后一个标记信息,该终端在继续进行下面的录音时不会再接收到第i+1个标记信息,因此,i=N时的情况下,终端可以获取到录音结束的时间点。
S409、将第i个标记时间点和录音结束时间点之间的音频数据插入到第二音轨上。
终端获取录音结束时间点之后,该终端可以将第i个标记时间点和录音结束时间点之间的音频数据插入到第二音轨上。
S410、将第一音轨上的N个音频信息和第二音轨上的音频数据合成为一个合成录音文件。
终端将N个音频信息插入到第一音轨,且该终端将第i个标记时间点和第i+1个标记时间点之间的音频数据插入到第二音轨上和该终端将第i个标记时间点和录音结束时间点之间的音频数据插入到第二音轨上之后,该终端将第一音轨上的N个音频信息和第二音轨上的音频数据合成为一个合成录音文件。
终端在每个标记时间点将每个标记标识对应的用户信息插入到录音的第一音轨,同时,将录音的内容插入到第二音轨。
可以理解的是,这样终端在进行录音的过程中,就将每个标记标识对应的用户信息在每个标记时间点插入在第一音轨,而将发言人的录音内容插入到第二音轨,最后终端得到的合成录音文件就是两个声音具有不一样的音频数据的录音文件并保存。
终端在播放录音文件时,首先需要判断是否需要多声道设备播放录制的录音文件,若是需要,则左声道播放音轨1的音频信息,右声道正常播放发 言人的录音(音频数据)即可。将音轨1和音轨2做分离处理,使得音轨1对应左声道,音轨2对应右声道。例如,当用户插入耳机播放录音时,耳机的左声道在标记的某一时刻点上会播放对应用户信息“张”和“明”的语音信息,右声道则播放发言人张明的录音内容。
本发明实施例所提供的一种录音方法,通过在记录音频数据的过程中,获取第i个标记信息,该第i个标记信息包括:第i个标记时间点和第i个标记标识,其中,N≥i≥1,N≥2,N为正整数;根据第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息;当i≠N时,将第i个标记时间点和第i+1个标记时间点之间的音频数据,保存为与第i个标记标识对应的用户信息匹配的第i个音频文件。采用上述技术实现方案,由于终端在录音的过程中,将被发言人或被访问者的信息(用户信息)与相应的发言人的发言内容相对应标记保存了,因此,终端能够根据不同的用户来记录不同的音频,体现了人性化设计,提高了终端的智能化。
实施例三
如图7所示,本发明实施例提供了一种终端1,该终端1可以包括:
获取单元10,设置为在记录音频数据的过程中,获取第i个标记信息,所述第i个标记信息包括:第i个标记时间点和第i个标记标识,其中,N≥i≥1,N≥2,N为正整数。
确定单元11,设置为根据所述获取单元10获取的所述第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息。
保存单元12,设置为当i≠N时,将所述获取单元10获取的所述第i个标记时间点和第i+1个标记时间点之间的音频数据,保存为与所述确定单元11确定的所述第i个标记标识对应的用户信息匹配的第i个音频文件。
可选地,如图8所示,所述终端1还包括:检测单元13和合成单元14。
所述获取单元10,还设置为所述确定单元11根据所述第i个标记标识和预设的标记标识与用户信息的对应关系,确定第i个标记标识对应的用户信息之后,当i=N时,获取录音结束时间点。
所述保存单元12,还设置为当i=N时,将所述获取单元10获取的所述第i个标记时间点和所述录音结束时间点之间的音频数据,保存为与所述确定单元11确定的所述第i个标记标识对应的用户信息匹配的第i个音频文件。
所述检测单元13,设置为检测所述保存单元12保存的N个音频文件中,是否存在匹配了相同的用户信息的至少两个音频文件。
所述合成单元14,设置为如果所述检测单元13检测出存在匹配了相同的用户信息的至少两个音频文件,则将所述保存单元12保存的所述至少两个音频文件合成为一个音频文件,所述一个音频文件匹配与所述至少两个音频文件相同的用户信息。
可选地,如图9所示,所述终端1还包括:转化单元15和插入单元16。
所述转化单元15,设置为在所述确定单元11根据所述第i个标记标识和预设的标记标识与用户信息的对应关系,确定第i个标记标识对应的用户信息之后,将所述确定单元11确定的所述第i个标记标识对应的用户信息转化为第i个音频信息。
所述插入单元16,设置为在所述获取单元10获取的所述第i个标记时间点时,将所述转化单元15转化的所述第i个音频信息插入到第一音轨上,以及当i≠N时,将所述第i个标记时间点和第i+1个标记时间点之间的音频数据插入到第二音轨上。
所述获取单元10,还设置为当i=N时,获取所述录音结束时间点。
所述插入单元16,还设置为将所述获取单元10获取的所述第i个标记时间点和所述录音结束时间点之间的音频数据插入到第二音轨上。
所述合成单元14,还设置为将所述插入单元16合好的所述第一音轨上的N个音频信息和所述第二音轨上的音频数据合成为一个合成录音文件。
可选地,所述获取单元10,还设置为在获取第i个标记信息之前,获取预设被访用户信息库。
所述确定单元11,还设置为根据所述获取单元10获取的所述预设被访用户信息库,确定预设的标记标识与被访用户信息的对应关系。
可选地,所述获取单元10获取第i个标记信息包括:,获取第i个第一操作,所述第一操作用于确定所述第i个标记信息;及根据所述第i个第一操作,获取第i个标记标识。以及获取所述第i个第一操作的发生时间,所述第i个第一操作的发生时间为所述第i个标记时间点。
可选地,本发明实施例中的终端为具有录音功能的电子设备,例如,录音笔、智能手机、平板电脑等。本发明实施例中的终端可以通过触摸屏接收标记信息,也可以在设置界面进行相应的标记设置,还可以通过设置有可以感应触摸操作的传感器或感应器接收标记信息,本发明实施例不作限制。
在实际应用中,上述获取单元10、确定单元11、检测单元13、合成单元14、转化单元15和插入单元16可由位于终端1上的处理器实现,例如,通过中央处理器(CPU)、微处理器(MPU)、数字信号处理器(DSP)或现场可编程门阵列(FPGA)等实现,保存单元12可由存储器实现,该存储器可以通过系统总线与处理器连接,其中,存储器用于存储可执行程序代码,该程序代码包括计算机操作指令,存储器可能包含高速RAM存储器,也可能还包括非易失性存储器,例如,至少一个磁盘存储器。
本发明实施例所提供的一种终端,通过在记录音频数据的过程中,获取第i个标记信息,该第i个标记信息包括:第i个标记时间点和第i个标记标识,其中,N≥i≥1,N≥2,N为正整数;根据第i个标记标识和预设的标记标识与被访用户信息的对应关系,确定第i个标记标识对应的用户信息;当i≠N 时,将第i个标记时间点和第i+1个标记时间点之间的音频数据,保存为与第i个标记标识对应的用户信息匹配的第i个音频文件。采用本发明实施例的方案,通过在录音的过程中,将被发言人或被访问者的信息(用户信息)与相应的发言人的发言内容相对应标记保存起来,因此,终端能够根据不同的用户来记录不同的音频,体现了人性化设计,提高了终端的智能化。
Embodiment 4
A computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the recording method described above. Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk memory and optical memory) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus which implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above are only preferred embodiments of the present invention and are not intended to limit the protection scope of the present invention.
Industrial Applicability
With the solution of the embodiments of the present invention, during recording, the information of a speaker or interviewee (the user information) is marked and saved in correspondence with that speaker's speech content, so that the terminal can record different audio according to different users, which embodies a user-friendly design and improves the intelligence of the terminal.

Claims (11)

  1. A recording method, comprising:
    during the recording of audio data, acquiring i-th mark information, the i-th mark information comprising: an i-th mark time point and an i-th mark identifier, where N ≥ i ≥ 1, N ≥ 2, N is the total number of marks, and N is a positive integer;
    determining user information corresponding to the i-th mark identifier according to the i-th mark identifier and a preset correspondence between mark identifiers and user information; and
    when i ≠ N, saving the audio data between the i-th mark time point and the (i+1)-th mark time point as an i-th audio file matched with the user information corresponding to the i-th mark identifier.
  2. The method according to claim 1, further comprising:
    after determining the user information corresponding to the i-th mark identifier, when i = N, acquiring a recording end time point;
    saving the audio data between the i-th mark time point and the recording end time point as an i-th audio file matched with the user information corresponding to the i-th mark identifier;
    detecting whether, among the N audio files, there are at least two audio files matched with the same user information; and
    if there are at least two audio files matched with the same user information, synthesizing the at least two audio files into one audio file, the user information matched by the one audio file being the same as that of the at least two audio files.
  3. The method according to claim 2, further comprising:
    converting the user information corresponding to the i-th mark identifier into i-th audio information;
    at the i-th mark time point, inserting the i-th audio information onto a first audio track;
    when i ≠ N, inserting the audio data between the i-th mark time point and the (i+1)-th mark time point onto a second audio track;
    when i = N, inserting the audio data between the i-th mark time point and the recording end time point onto the second audio track; and
    synthesizing the N pieces of audio information on the first audio track and the audio data on the second audio track into one synthesized recording file.
  4. The method according to claim 1, further comprising:
    before acquiring the i-th mark information, acquiring a preset interviewed-user information base; and
    determining a preset correspondence between mark identifiers and interviewed-user information according to the preset interviewed-user information base.
  5. The method according to claim 1 or 4, wherein acquiring the i-th mark information comprises:
    acquiring an i-th first operation, the first operation being used to determine the i-th mark information;
    acquiring the i-th mark identifier according to the i-th first operation; and
    acquiring an occurrence time of the i-th first operation, the occurrence time of the i-th first operation being the i-th mark time point.
  6. A terminal, comprising:
    an acquiring unit, configured to acquire, during the recording of audio data, i-th mark information, the i-th mark information comprising: an i-th mark time point and an i-th mark identifier, where N ≥ i ≥ 1, N ≥ 2, N is the total number of marks, and N is a positive integer;
    a determining unit, configured to determine user information corresponding to the i-th mark identifier according to the i-th mark identifier acquired by the acquiring unit and a preset correspondence between mark identifiers and interviewed-user information; and
    a saving unit, configured to save, when i ≠ N, the audio data between the i-th mark time point and the (i+1)-th mark time point acquired by the acquiring unit as an i-th audio file matched with the user information corresponding to the i-th mark identifier determined by the determining unit.
  7. The terminal according to claim 6, further comprising a detecting unit and a synthesizing unit;
    the acquiring unit is further configured to acquire a recording end time point when i = N, after the user information corresponding to the i-th mark identifier is determined;
    the saving unit is further configured to save the audio data between the i-th mark time point acquired by the acquiring unit and the recording end time point as an i-th audio file matched with the user information corresponding to the i-th mark identifier determined by the determining unit;
    the detecting unit is configured to detect whether, among the N audio files saved by the saving unit, there are at least two audio files matched with the same user information; and
    the synthesizing unit is configured to, if the detecting unit detects at least two audio files matched with the same user information, synthesize the at least two audio files saved by the saving unit into one audio file, the one audio file being matched with the same user information as the at least two audio files.
  8. The terminal according to claim 7, further comprising a converting unit and an inserting unit;
    the converting unit is configured to convert the user information corresponding to the i-th mark identifier determined by the determining unit into i-th audio information;
    the inserting unit is configured to insert, at the i-th mark time point acquired by the acquiring unit, the i-th audio information converted by the converting unit onto a first audio track;
    the inserting unit is further configured to, when i ≠ N, insert the audio data between the i-th mark time point and the (i+1)-th mark time point onto a second audio track, and, when i = N, insert the audio data between the i-th mark time point acquired by the acquiring unit and the recording end time point onto the second audio track; and
    the synthesizing unit is further configured to synthesize the N pieces of audio information on the first audio track and the audio data on the second audio track, as inserted by the inserting unit, into one synthesized recording file.
  9. The terminal according to claim 6, wherein
    the acquiring unit is further configured to acquire a preset interviewed-user information base before acquiring the i-th mark information; and
    the determining unit is further configured to determine the preset correspondence between mark identifiers and interviewed-user information according to the preset interviewed-user information base acquired by the acquiring unit.
  10. The terminal according to claim 6 or 9, wherein
    the acquiring unit acquiring the i-th mark information comprises: acquiring an i-th first operation, the first operation being used to determine the i-th mark information; acquiring the i-th mark identifier according to the i-th first operation; and acquiring an occurrence time of the i-th first operation, the occurrence time of the i-th first operation being the i-th mark time point.
  11. A computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the recording method according to any one of claims 1 to 5.
PCT/CN2016/079919 2016-02-02 2016-04-21 Recording method and terminal WO2016197708A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610073408.0A CN107025913A (zh) 2016-02-02 2016-02-02 Recording method and terminal
CN201610073408.0 2016-02-02

Publications (1)

Publication Number Publication Date
WO2016197708A1 true WO2016197708A1 (zh) 2016-12-15

Family

ID=57503070

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/079919 WO2016197708A1 (zh) 2016-02-02 2016-04-21 一种录音方法及终端

Country Status (2)

Country Link
CN (1) CN107025913A (zh)
WO (1) WO2016197708A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108052578B (zh) * 2017-12-08 2020-07-28 上海星佑网络科技有限公司 Method and apparatus for information processing
CN111145803A (zh) * 2019-12-11 2020-05-12 秒针信息技术有限公司 Voice information collection method and device, storage medium, and electronic device
CN111198646A (zh) * 2019-12-11 2020-05-26 秒针信息技术有限公司 Voice information collection method and device, storage medium, and electronic device
CN111191199B (zh) * 2019-12-11 2021-11-16 秒针信息技术有限公司 Voice information collection method and device, storage medium, and electronic device
CN111224785A (zh) * 2019-12-19 2020-06-02 秒针信息技术有限公司 Voice device binding method and device, storage medium, and electronic device
CN111191754B (zh) * 2019-12-30 2023-10-27 秒针信息技术有限公司 Voice collection method and device, electronic device, and storage medium
CN112017655B (zh) * 2020-07-25 2024-06-14 云开智能(深圳)有限公司 Intelligent voice recording and playback method and system
CN113055529B (zh) * 2021-03-29 2022-12-13 深圳市艾酷通信软件有限公司 Recording control method and recording control device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013176365A1 (en) * 2012-05-21 2013-11-28 Lg Electronics Inc. Method and electronic device for easily searching for voice record
CN104581351A * 2015-01-28 2015-04-29 上海与德通讯技术有限公司 Audio or video recording method, playback method therefor, and electronic device
CN104657074A * 2015-01-27 2015-05-27 中兴通讯股份有限公司 Method, device and mobile terminal for implementing recording
CN105227744A * 2015-09-15 2016-01-06 广州三星通信技术研究有限公司 Method and device for recording call content in a communication terminal


Also Published As

Publication number Publication date
CN107025913A (zh) 2017-08-08


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16806606

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16806606

Country of ref document: EP

Kind code of ref document: A1