WO2022156709A1 - Audio signal processing method, apparatus, electronic device and readable storage medium - Google Patents

Audio signal processing method, apparatus, electronic device and readable storage medium

Info

Publication number
WO2022156709A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio signal
segment
track
input
recording
Prior art date
Application number
PCT/CN2022/072745
Other languages
English (en)
French (fr)
Inventor
张鑫
Original Assignee
维沃移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 维沃移动通信有限公司
Publication of WO2022156709A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/04 Real-time or near real-time messaging, e.g. instant messaging [IM]

Definitions

  • the present application belongs to the field of electronic technology, and in particular relates to an audio signal processing method, apparatus, electronic device and readable storage medium.
  • the inventor found at least the following problem in the prior art: while recording an audio signal, users often express themselves incorrectly or unclearly, so the audio signal contains incorrect or unclear information. In that case, the only option is to discard the recorded audio signal and record a new one, which reduces the efficiency of voice communication.
  • the purpose of the embodiments of the present application is to provide an audio signal processing method, apparatus, electronic device, and readable storage medium, which can solve the problem that a new audio signal needs to be re-recorded when the audio signal contains erroneous or unclear information.
  • an embodiment of the present application provides an audio signal processing method, the method comprising:
  • receiving a first input;
  • in response to the first input, recording an original audio signal and displaying a recording track of the original audio signal, where the recording track indicates the time axis of the original audio signal;
  • adding at least one division mark to the recording track, where the division mark divides the recording track into at least two track segments;
  • dividing the original audio signal into audio segments corresponding to the track segments, based on the time points on the time axis corresponding to the division marks;
  • based on an input on a track segment, processing the audio segment in the original audio signal corresponding to that track segment to obtain a target audio signal.
  • an audio signal processing device comprising:
  • a receiving module, configured to receive a first input;
  • a display module, configured to record an original audio signal in response to the first input and display a recording track of the original audio signal, where the recording track indicates the time axis of the original audio signal;
  • an adding module, configured to add at least one division mark to the recording track, where the division mark divides the recording track into at least two track segments;
  • a segmentation module, configured to divide the original audio signal into audio segments corresponding to the track segments, based on the time points on the time axis corresponding to the division marks;
  • a processing module, configured to process, based on an input on a track segment, the audio segment in the original audio signal corresponding to that track segment to obtain a target audio signal.
  • an embodiment of the present application provides an electronic device, the electronic device including a processor, a memory, and a program or instruction stored on the memory and executable on the processor; when the program or instruction is executed by the processor, the steps of the method according to the first aspect are implemented.
  • an embodiment of the present application provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or instruction is executed by a processor, the steps of the method according to the first aspect are implemented.
  • an embodiment of the present application provides a chip, the chip including a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement the method according to the first aspect.
  • the electronic device receives a first input, records an original audio signal in response to the first input, displays a recording track of the original audio signal, divides the recording track into at least two track segments by using a division mark, divides the original audio signal into audio segments corresponding to the track segments based on the division mark, and, based on an input on a track segment, processes the corresponding audio segment in the original audio signal to obtain the target audio signal.
  • through track segmentation, the user can divide the audio signal into a plurality of corresponding audio segments, which avoids re-recording the audio signal and can improve the efficiency of voice communication.
  • FIG. 1 is a flowchart of steps of an audio signal processing method provided according to an exemplary embodiment
  • FIG. 2 is a schematic diagram of a chat interface provided according to an exemplary embodiment
  • FIG. 3 is a schematic diagram of another chat interface provided according to an exemplary embodiment
  • FIG. 4 is a flowchart of steps of another audio signal processing method provided according to an exemplary embodiment
  • FIG. 5 is a schematic diagram of yet another chat interface provided according to an exemplary embodiment
  • FIG. 6 is a schematic diagram of yet another chat interface provided according to an exemplary embodiment
  • FIG. 7 is a schematic diagram of an audio sending interface provided according to an exemplary embodiment
  • FIG. 8 is a schematic diagram of another audio sending interface provided according to an exemplary embodiment
  • FIG. 9 is a schematic diagram of still another audio sending interface provided according to an exemplary embodiment.
  • FIG. 10 is a schematic diagram of yet another chat interface provided according to an exemplary embodiment
  • FIG. 11 is a schematic diagram of yet another chat interface provided according to an exemplary embodiment
  • FIG. 12 is a schematic structural diagram of an audio signal processing apparatus provided according to an exemplary embodiment
  • FIG. 13 is a schematic structural diagram of an electronic device provided according to an exemplary embodiment
  • FIG. 14 is a schematic diagram of a hardware structure of an electronic device provided according to an exemplary embodiment.
  • the terms “first”, “second” and the like in the description and claims of the present application are used to distinguish similar objects, and are not used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in sequences other than those illustrated or described herein.
  • objects distinguished by “first”, “second”, etc. are usually of one type, and the number of objects is not limited; for example, there may be one first object or more than one.
  • “and/or” in the description and claims indicates at least one of the connected objects, and the character “/” generally indicates that the associated objects are in an “or” relationship.
  • FIG. 1 is a flowchart of steps of an audio signal processing method provided according to an exemplary embodiment. As shown in FIG. 1 , the method includes:
  • Step 101 Receive a first input.
  • Step 102 In response to the first input, record the original audio signal, and display the recording track of the original audio signal.
  • the recording track is used to indicate the time axis of the original audio signal.
  • the audio signal processing method may be performed by an electronic device such as a mobile phone, a notebook computer, a wearable device, etc., which has a display screen, a microphone, and other devices.
  • the first input is used to control the electronic device to start recording the original audio signal, and display the recording track corresponding to the original audio signal on the display screen.
  • the original audio signal is a sound signal that needs to be recorded into the electronic device, which may be a sound signal sent by a user or a sound signal in the environment where the electronic device is located.
  • for example, the first input may be a click on a recording button in the recording interface;
  • in response to the user's click, the electronic device may start recording the original audio signal and display the recording track in the recording interface.
  • FIG. 2 is a schematic diagram of a chat interface provided according to an exemplary embodiment.
  • the user can operate an interface display control on the display screen.
  • the interface display control is, for example, a virtual button in the chat interface. In response to the user's operation of the interface display control, the electronic device can display a recording interface 201 at the bottom of the chat interface, and display a virtual recording button 202 at the bottom of the recording interface 201.
  • the electronic device may start the microphone to collect the sound signal, and start recording the original audio signal.
  • the electronic device displays the track axis 203 in the recording interface 201, and displays the recording track 204 on the track axis 203.
  • the recording track 204 is used to indicate the time axis of the original audio signal, and the time axis corresponds to the time length of the original audio signal, so the length of the recording track 204 can represent the time length of the recorded original audio signal.
  • the time length of the original audio signal increases continuously, and the length of the recording track 204 increases synchronously with the time length of the original audio signal. As shown in FIG.
  • the recording track 204 starts to display from the left end of the track axis 203, and the length of the recording track 204 gradually increases as the time length of the original audio signal increases.
  • for example, when the time length of the original audio signal is 10 seconds, the time length corresponding to the recording track 204 is also 10 seconds;
  • when the recording time reaches 40 seconds, the time length of the original audio signal is 40 seconds, and the time length corresponding to the recording track 204 is also 40 seconds.
  • the recording track can also be directly displayed in the chat interface, and the form of the recording track may include but not limited to the straight line shown in FIG.
  • the first input may be a click on the recording button in the recording interface, a press of a physical button on the electronic device, or a sliding operation along a preset direction on the display screen.
  • this embodiment does not limit the specific form of the first input or the specific form of the recording track.
  • Step 103 Add at least one division mark to the recording track.
  • the division mark is used to divide the recording track into at least two track segments.
  • the electronic device may automatically add a division mark to the recording track, or may add a division mark to the recording track in response to the user's input, and divide the recording track into at least two track segments by the division mark.
  • step 103 may be implemented in the following manner:
  • the eighth input may be a user input of clicking a mark adding button; with the mark adding button, the user can manually add a division mark to the recording track while the original audio signal is being recorded.
  • the recording interface 201 displays a mark adding button 205.
  • the electronic device collects the voice signal sent by the user in real time. If the user notices an error in his or her expression at the 10th second, he or she can click the mark adding button 205. In response to the click, the electronic device may add a division mark 206 at the position of the recording track corresponding to the 10th second, that is, at the end of the recording track 204 at the current moment.
  • the eighth input may also be a user input of directly clicking the recording track.
  • in response to the user's click, the electronic device may add a division mark 206 at the end of the recording track 204 corresponding to the current moment.
  • the eighth input may also be a user input of double-clicking the recording interface.
  • in response to the user's double-click, the electronic device may add a division mark 206 at the end of the recording track 204 corresponding to the current moment.
  • the specific form of the eighth input may include, but is not limited to, the above user operations of clicking the mark adding button, clicking the recording track, or double-clicking the recording interface.
  • the electronic device may divide the recording track into at least two track segments by using the dividing mark.
  • FIG. 3 is a schematic diagram of another chat interface provided according to an exemplary embodiment; it shows the chat interface after the recording of the original audio signal is completed.
  • the user can click the recording button 202 again; in response to the click, the electronic device can stop collecting the sound signal, obtain the original audio signal, and stop increasing the length of the recording track 204, yielding the recording track 204 shown in FIG. 3, which represents the time length of the original audio signal.
  • a division mark 206 manually added by the user is displayed on the recording track 204. The time length corresponding to the recording track 204 is 40 seconds and the time point corresponding to the division mark 206 is the 10th second, so the division mark 206 divides the recording track 204 at the 10th second into a first track segment to the left of the division mark 206 and a second track segment to the right of the division mark 206.
  • if the user finds an error in the original audio signal while recording it, he or she can add a division mark at the position of the recording track corresponding to the current moment. The division marks added during recording make it easy for the user to determine which audio segments need processing and to quickly process the problematic audio segments in the original audio signal.
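  • the manual marking described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the `RecordingTrack` class and its method names are hypothetical, and the track is modeled only as a duration in seconds plus a list of mark time points.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RecordingTrack:
    """Hypothetical model of a recording track: a growing time axis plus marks."""

    duration: float = 0.0  # seconds recorded so far
    marks: List[float] = field(default_factory=list)  # mark time points, in seconds

    def advance(self, seconds: float) -> None:
        """Extend the track as more audio is recorded."""
        self.duration += seconds

    def add_mark_at_end(self) -> float:
        """Add a division mark at the current end of the track (the current moment)."""
        if self.duration not in self.marks:
            self.marks.append(self.duration)
        return self.duration

    def segments(self) -> List[Tuple[float, float]]:
        """Return the (start, end) time ranges of the track segments."""
        inner = sorted(m for m in self.marks if 0.0 < m < self.duration)
        bounds = [0.0] + inner + [self.duration]
        return list(zip(bounds, bounds[1:]))


track = RecordingTrack()
track.advance(10.0)      # the user notices a mistake at the 10th second
track.add_mark_at_end()  # e.g. a click on the mark adding button
track.advance(30.0)      # recording continues until the 40th second
print(track.segments())  # [(0.0, 10.0), (10.0, 40.0)]
```

  • in this sketch a mark that coincides with the start or the current end of the track does not create an empty segment; that is a design choice of the example rather than something the description specifies.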
  • Step 104 Divide the original audio signal into audio segments corresponding to the track segments based on the time points on the time axis corresponding to the segment markers.
  • the electronic device may segment the original audio signal based on the time point corresponding to the division mark.
  • for example, if the time length of the original audio signal is 40 seconds, the original audio signal can be divided at the 10th second, the time point corresponding to the division mark 206.
  • the electronic device can determine that the time point corresponding to the division mark 206 is the 10th second and divide the original audio signal there, obtaining a first audio segment between the 0th and 10th seconds and a second audio segment between the 10th and 40th seconds.
  • the first audio segment corresponds to the first track segment between the 0th second and the 10th second in the recording track 204, and the second audio segment corresponds to the second track segment between the 10th second and the 40th second in the recording track 204.
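  • the division in step 104 can be sketched as follows, assuming the original audio signal is held as a sequence of samples at a known sample rate. The function name `split_by_marks` and the sample representation are assumptions made for illustration; the application does not specify how the signal is stored.

```python
from typing import List, Sequence


def split_by_marks(
    samples: Sequence[float], sample_rate: int, mark_times: Sequence[float]
) -> List[List[float]]:
    """Split samples into audio segments at the given mark times (in seconds)."""
    total = len(samples)
    # Convert each mark time to a sample index, clamp it to the signal,
    # and drop duplicates and marks at the very start or end.
    cuts = sorted({min(max(int(round(t * sample_rate)), 0), total) for t in mark_times})
    cuts = [c for c in cuts if 0 < c < total]
    bounds = [0] + cuts + [total]
    return [list(samples[a:b]) for a, b in zip(bounds, bounds[1:])]


# Hypothetical 40-second signal at 10 samples per second, with a mark at the 10th second.
signal = [0.0] * 400
segments = split_by_marks(signal, sample_rate=10, mark_times=[10.0])
print([len(s) for s in segments])  # [100, 300]
```

  • the two resulting segments correspond to the first track segment (0th to 10th second) and the second track segment (10th to 40th second) in the example above.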
  • Step 105 Based on the input of the track segment, process the audio segment in the original audio signal corresponding to the track segment to obtain the target audio signal.
  • the user may process the audio segments in the original audio signal based on the track segments in the recording track to obtain the target audio signal.
  • for example, the user can process the first audio segment by operating on the first track segment, and process the second audio segment by operating on the second track segment, to obtain the target audio signal.
  • step 105 may be implemented in the following manner:
  • the third input may be a user input of long-pressing a track segment. If the user long-presses the first track segment, then in response to the long-press the electronic device may delete the first track segment and delete, from the original audio signal, the first audio segment corresponding to the first track segment. This yields a recording track that includes only the second track segment and an audio signal that includes only the second audio segment, that is, the target audio signal.
  • the third input may also be a user input of dragging a track segment. If the user long-presses the first track segment shown in FIG. 3 and drags the first track segment to the recording interface 201, the electronic device may, in response to the drag operation, delete the first track segment and delete the first audio segment in the original audio signal.
  • the form of the third input may include, but is not limited to, long-pressing the track segment or dragging the track segment.
  • by deleting a track segment, the user deletes the corresponding audio segment in the audio signal, which makes it easy to remove problematic audio segments and resolves problems in the audio signal without re-recording it.
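  • deleting an audio segment through its track segment can be sketched as follows. The function `delete_segment` and the list-of-samples representation are hypothetical; the sketch only shows that removing one segment and joining the rest gives the target audio signal.

```python
from typing import List


def delete_segment(segments: List[List[float]], index: int) -> List[float]:
    """Drop the audio segment at `index` and join the rest into the target signal."""
    if not 0 <= index < len(segments):
        raise IndexError("no track segment at that index")
    kept = [seg for i, seg in enumerate(segments) if i != index]
    return [sample for seg in kept for sample in seg]


# Hypothetical first (10 s) and second (30 s) audio segments at 10 samples per second.
first, second = [0.1] * 100, [0.2] * 300
target = delete_segment([first, second], 0)  # e.g. a long-press on the first track segment
print(len(target))  # 300
```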
  • the electronic device receives the first input, records the original audio signal in response to the first input, displays the recording track of the original audio signal, divides the recording track into at least two track segments by using a division mark, divides the original audio signal into audio segments corresponding to the track segments based on the division mark, and, based on an input on a track segment, processes the corresponding audio segment in the original audio signal to obtain the target audio signal.
  • through track segmentation, the user can divide the audio signal into a plurality of corresponding audio segments, which avoids re-recording the audio signal and can improve the efficiency of voice communication.
  • FIG. 4 is a flowchart of steps of another audio signal processing method provided according to an exemplary embodiment. As shown in FIG. 4 , the method includes:
  • Step 401 Receive a first input.
  • Step 402 In response to the first input, record the original audio signal, and display the recording track of the original audio signal.
  • Step 403 Add at least one division mark to the recording track.
  • step 403 can also be implemented in the following manner:
  • in response to a ninth input, a division position is determined in the recording track, and a division mark is added at the division position.
  • the ninth input may be a drag of a target division mark among the at least one division mark already added.
  • the user can long-press the division mark 206 shown in FIG. 3 and drag it along the recording track 204; in response to the drag operation, the electronic device can determine the position where the user releases the division mark 206 as a new division position and add a new division mark 207 at that position.
  • the user can drag the division mark 206 to the left along the recording track 204 to add a new division mark 207 to the left of the division mark 206, or drag it to the right along the recording track 204 to add a new division mark to the right of the division mark 206.
  • the ninth input may be a user input of directly clicking the recording track, and the electronic device may, in response to the user's click operation, determine the position clicked by the user as the split position, and add a split mark on the split position.
  • the user can estimate the time length of the original audio signal according to the recording duration, and when manually adding a division mark, can roughly estimate the division position where the division mark needs to be added.
  • the electronic device may play the audio content corresponding to the division position, so that the user can adjust the division position according to the played audio content.
  • the electronic device can play the audio content in the original audio signal starting from the time point corresponding to the division mark 207.
  • from the audio content played, the user can judge whether the division position corresponding to the division mark 207 is the one required.
  • if it is not, the user can continue to drag the division mark 206 and release it at another position on the recording track 204 to re-determine the division position.
  • the electronic device can then play the audio content corresponding to the new division position; these steps are repeated until a division position that meets the user's needs is determined, and a division mark is added at that position.
  • when a division mark, such as division mark 206, has already been added to the recording track, the user can drag that division mark to add a corresponding division mark, such as division mark 207, so that the track segment to be processed (i.e., the track segment between division mark 206 and division mark 207) can be obtained from the recording track and its corresponding audio segment can be processed.
  • the user can manually add division marks to the recording track, which makes it easy to divide the audio signal into several corresponding audio segments and to process the original audio signal segment by segment.
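  • one way to turn the position where the user releases or clicks on the recording track into a time point is a linear mapping along the track. The sketch below assumes the track is drawn as a horizontal line of known pixel length; the function `position_to_time` and its parameters are hypothetical and are not taken from the application.

```python
def position_to_time(
    x: float, track_start_x: float, track_length: float, duration: float
) -> float:
    """Map a horizontal screen position on the track to a time point in seconds."""
    if track_length <= 0:
        raise ValueError("track_length must be positive")
    fraction = (x - track_start_x) / track_length
    fraction = min(max(fraction, 0.0), 1.0)  # clamp releases outside the track
    return fraction * duration


# Hypothetical 240-pixel track starting at x = 20 and representing 40 seconds.
print(position_to_time(x=140.0, track_start_x=20.0, track_length=240.0, duration=40.0))  # 20.0
```

  • a release halfway along the track therefore yields a division position at the 20th second; playback for confirmation could start from that time point.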
  • step 403 can be implemented in the following manner:
  • the electronic device may detect the original audio signal, determine a pause interval in the original audio signal, and add a division mark to the target track segment corresponding to the pause interval. For example, while the electronic device is collecting the user's voice signal, if the intensity of the collected audio signal falls to or below a preset intensity threshold at the 10th second, it can be determined that the user stops talking at the 10th second. If the intensity stays at or below the preset intensity threshold until the 15th second, it can be determined that the user did not speak between the 10th and 15th seconds. Because the time between the 10th and 15th seconds is greater than a preset duration (4 seconds, for example), the period between the 10th second and the 15th second is determined to be a pause interval, whose start time on the time axis is the 10th second and whose end time is the 15th second.
  • the electronic device can determine the track segment between the 10th and 15th seconds as the target track segment in the recording track and add a division mark at any position of the target track segment, that is, anywhere between the 10th second and the 15th second.
  • the electronic device can also detect the original audio signal after the recording of the original audio signal is completed, determine one or more pause intervals in the original audio signal, and add a segmentation mark to the corresponding position of the recording track.
  • the method for determining the pause interval may include, but is not limited to, determining according to the intensity of the audio signal.
  • the specific values of the preset duration and the preset intensity threshold may be set according to requirements, which will not be repeated in this embodiment.
  • the electronic device can add segmentation marks at the corresponding positions of the recording track according to the pauses in the original audio signal, so as to realize the automatic addition of segmentation marks, which can simplify the user's operation of adding segmentation marks and improve the processing efficiency of audio signals.
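  • the intensity-based pause detection can be sketched as follows. The sketch assumes the signal has already been reduced to one intensity value per fixed-length frame; the function name `find_pause_intervals`, the frame representation, and the choice to place the mark at the midpoint of each pause interval are assumptions for illustration (the description allows any position within the interval).

```python
from typing import List, Optional, Sequence, Tuple


def find_pause_intervals(
    intensities: Sequence[float],
    frame_seconds: float,
    intensity_threshold: float,
    min_pause_seconds: float,
) -> List[Tuple[float, float]]:
    """Return (start, end) times of quiet runs longer than min_pause_seconds."""
    pauses: List[Tuple[float, float]] = []
    run_start: Optional[int] = None
    # A sentinel loud frame closes a quiet run that reaches the end of the signal.
    for i, level in enumerate(list(intensities) + [float("inf")]):
        quiet = level <= intensity_threshold
        if quiet and run_start is None:
            run_start = i
        elif not quiet and run_start is not None:
            start, end = run_start * frame_seconds, i * frame_seconds
            if end - start > min_pause_seconds:
                pauses.append((start, end))
            run_start = None
    return pauses


# Hypothetical 40-second recording, one intensity value per second,
# silent between the 10th and 15th seconds; preset duration of 4 seconds.
levels = [1.0] * 10 + [0.0] * 5 + [1.0] * 25
pauses = find_pause_intervals(
    levels, frame_seconds=1.0, intensity_threshold=0.1, min_pause_seconds=4.0
)
marks = [(start + end) / 2 for start, end in pauses]
print(pauses, marks)  # [(10.0, 15.0)] [12.5]
```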
  • Step 404 Divide the original audio signal into audio segments corresponding to track segments based on the time points on the time axis corresponding to the segment markers.
  • Step 405 Based on the input of the track segment, process the audio segment in the original audio signal corresponding to the track segment to obtain the target audio signal.
  • step 405 can be implemented in the following manner:
  • the modified audio signal is used to replace the audio segment to be modified, and the audio segment to be modified is the audio segment corresponding to the track segment to be modified in the original audio signal.
  • the user can determine the audio segment to be modified from the original audio signal, and replace the audio segment to be modified with a new audio signal, and the modified audio signal is a new audio signal.
  • for example, the track segment to be modified may be the track segment between the division mark 206 and the division mark 207, and the second input may be a user input of double-clicking that track segment; in response to the double-click, the electronic device may determine the track segment between the division mark 206 and the division mark 207 as the track segment to be modified.
  • the electronic device can then activate the microphone, collect a new piece of audio signal, use the newly collected audio signal as the modified audio signal, and use the modified audio signal to replace the audio segment in the original audio signal that corresponds to the track segment between the division mark 206 and the division mark 207.
  • the specific implementation can be set according to requirements, which is not limited in this embodiment.
  • the step of acquiring the modified audio signal may be implemented in the following manner:
  • the electronic device may receive text information input by the user and convert it into the modified audio signal. For example, after receiving the second input and determining the track segment to be modified, the electronic device can display a text input box; the user can enter text information in the text input box, and the electronic device can receive the text information and convert it into the modified audio signal.
  • the specific method for converting text information into an audio signal can be set according to requirements, which is not limited in this embodiment.
  • the modified audio signal may be a pre-stored audio signal in the electronic device.
  • the electronic device may display an audio list, where the audio list includes a plurality of pre-stored audio signals, and the user may select one audio signal as the modified audio signal.
  • the ways of acquiring the modified audio signal may include, but are not limited to, re-recording an audio signal, converting text information into an audio signal, or selecting a pre-stored audio signal; any audio signal acquisition method, known or unknown in the art, can be applied to this embodiment.
  • by segmenting the track, the user can replace the problematic audio segment in the original audio signal, which makes it easy to correct that segment, avoids re-recording the audio signal, and improves the efficiency of voice communication.
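  • replacing the audio segment to be modified can be sketched as follows, again assuming a sample-based representation. The function `replace_segment` is hypothetical; the replacement samples stand in for a modified audio signal obtained by any of the ways listed above (re-recording, text conversion, or a pre-stored signal), and the replacement need not have the same length as the segment it replaces.

```python
from typing import List, Sequence


def replace_segment(
    samples: Sequence[float],
    sample_rate: int,
    start_seconds: float,
    end_seconds: float,
    replacement: Sequence[float],
) -> List[float]:
    """Replace the samples between two mark time points with a modified signal."""
    start = int(round(start_seconds * sample_rate))
    end = int(round(end_seconds * sample_rate))
    if not 0 <= start <= end <= len(samples):
        raise ValueError("segment bounds fall outside the signal")
    return list(samples[:start]) + list(replacement) + list(samples[end:])


# Hypothetical 40-second signal at 10 samples per second; the segment between
# marks at the 10th and 15th seconds is replaced by a 3-second modified signal.
original = [0.0] * 400
modified = [0.5] * 30
target = replace_segment(original, 10, 10.0, 15.0, modified)
print(len(target))  # 380
```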
  • the user can choose to directly send the original audio signal, or choose to process the original audio signal to obtain the target audio signal.
  • FIG. 6 is a schematic diagram of another chat interface provided according to an exemplary embodiment.
  • in the process of recording the original audio signal, if the user clicks the recording button 202 again, the electronic device can stop recording the original audio signal in response to the click and display a selection interface 301 in the chat interface.
  • the selection interface 301 includes a sending control 3011 and an editing control 3012. If the user clicks the sending control 3011, the electronic device can directly send the original audio signal in response to the click; if the user clicks the editing control 3012, the electronic device can display the chat interface shown in FIG. 5 in response to the click, and the user can process the track segments through the chat interface shown in FIG. 5 to obtain the target audio signal.
  • the above is only an exemplary example, and the specific process of selecting to directly send the original audio signal or selecting to process the original audio signal can be set according to requirements, which is not limited in this embodiment.
  • Step 406 In response to the seventh input, determine a target track segment from the at least two track segments.
  • Step 407 Determine the target audio segment corresponding to the target track segment from the target audio signal, and send the target audio segment.
  • the user can select one or more of the at least two track segments and send the audio segments corresponding to the selected track segments.
  • FIG. 7 is a schematic diagram of an audio sending interface provided according to an exemplary embodiment.
  • the electronic device may display the interface shown in FIG. 7.
  • a recording track 201 is displayed on the top of the audio sending interface, and a plurality of sending objects are displayed on the bottom.
  • the seventh input can be a drag operation on a track segment. If the user drags the first track segment 2011 in the recording track onto the target sending object 401 among the multiple sending objects and releases it, the electronic device can, in response to the drag operation, send the audio segment corresponding to the first track segment 2011 to the target sending object 401.
  • the electronic device may display the corresponding virtual track segment.
  • FIG. 8 is a schematic diagram of another audio sending interface provided according to an exemplary embodiment.
  • the electronic device may display the first track segment 2011
  • the electronic device can send the audio segment corresponding to the first track segment 2011 to the target sending object 401 .
  • FIG. 9 is a schematic diagram of another audio sending interface provided according to an exemplary embodiment.
  • the seventh input can also be a user input of double-clicking the recording track. If the user double-clicks the recording track, the electronic device can, in response to the double-click, display a virtual recording track 2013 below the recording track 201; the virtual recording track 2013 corresponds to the entire recording track 201. The user can then drag the virtual recording track 2013 onto the target sending object and release it.
  • the electronic device may send the entire target audio signal to the target sending object in response to the user's drag operation.
  • after division marks are added to the recording track, the user can also choose not to process the audio segments and instead go directly to the audio sending interface and choose to send the target audio segment.
  • the user can select audio segments in the target audio signal through track segments, and send different audio segments to different sending objects, which can realize segmented transmission of audio signals and improve the efficiency of voice communication.
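  • before step 405, the method may further include: suspending the recording of the original audio signal when the fourth input is received; and continuing the recording of the original audio signal when the fifth input is received.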
  • during recording of the original audio signal, the user can suspend the recording, which makes it easier to record a longer original audio signal flexibly.
  • the fourth input can be a user input of clicking the pause button 208 in the recording interface 201.
  • the user can click the pause button 208 when something else needs attention.
  • in response to the user clicking the pause button 208, the electronic device may stop recording the original audio signal and stop extending the recording track 204.
  • the electronic device can change the display state of the pause button 208 to the paused state shown in FIG. 10, which is a schematic diagram of another chat interface provided according to an exemplary embodiment. The fifth input can be a user input of clicking the pause button 208 in the paused state.
  • after pausing the recording of the original audio signal, if the electronic device receives another click on the pause button 208, it can resume recording the original audio signal in response and continue to extend the recording track 204.
  • the electronic device can then change the state of the pause button 208 back to the recording state shown in FIG. 2.
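The pause and resume behaviour described above can be illustrated with a minimal sketch. It assumes that a single button acts as both the fourth input (pause) and the fifth input (resume) and that the recording track grows only while audio is being captured. The class name `Recorder` and its methods are hypothetical and do not appear in the application.

```python
class Recorder:
    """Tracks recorded duration; time spent paused does not extend the track."""

    def __init__(self):
        self.duration_s = 0.0   # length represented by the recording track
        self.paused = False

    def toggle_pause(self):
        # Assumption: one button serves as the fourth input (pause)
        # and the fifth input (resume).
        self.paused = not self.paused

    def tick(self, dt_s):
        # Called as time passes; audio is captured only while not paused.
        if not self.paused:
            self.duration_s += dt_s


rec = Recorder()
rec.tick(10.0)       # 10 s recorded
rec.toggle_pause()   # fourth input
rec.tick(30.0)       # user is busy; nothing is recorded
rec.toggle_pause()   # fifth input
rec.tick(5.0)        # 5 s more recorded
print(rec.duration_s)  # 15.0
```

Keeping the track length tied to captured duration, rather than wall-clock time, keeps the track consistent with the time axis of the original audio signal.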
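  • before the recording of the original audio signal is continued in response to the fifth input, the method may further include:
  • displaying a pause mark at the end of the recording track;
  • in response to the sixth input, adding a cut mark corresponding to the pause mark at the target position of the recording track, where the pause mark and the cut mark are used to delimit the track segment to be cut from the recording track;
  • deleting the audio segment corresponding to the track segment to be cut from the original audio signal.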
  • a pause mark 209 may be displayed at the end of the recording track 204 .
  • the sixth input may be a user input of dragging the pause mark 209.
  • the user may drag the pause mark 209 leftward along the recording track 204 and release it at the desired position.
  • the electronic device may, in response to the user's drag operation, determine the release position of the pause mark as the target position and add the cut mark there. As shown in FIG. 11, which is a schematic diagram of another chat interface provided according to an exemplary embodiment, if the user releases the pause mark 209 at the target position, the electronic device may add a cut mark 210 at the target position and determine the track segment between the pause mark 209 and the cut mark 210 as the track segment to be cut. The electronic device can then determine the time point on the time axis corresponding to the cut mark 210 and delete the audio segment of the original audio signal located after that time point, that is, the audio segment corresponding to the track segment between the cut mark 210 and the pause mark 209.
  • the sixth input may also be a user input of double-clicking the target position in the recording track, or clicking the target position in the recording track, and the specific form of the sixth input can be set according to requirements.
  • when an error occurs while the original audio signal is being recorded, the user can suspend the recording in time and correct the audio just recorded, which makes timely correction easier and improves the efficiency of audio recording.
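As an illustration only, the cut operation described above amounts to truncating the recorded samples at the time point of the cut mark while recording is paused. The sketch below assumes the audio is held as a plain list of samples at a known sample rate; the function name `cut_tail` and the toy sample rate are hypothetical.

```python
def cut_tail(samples, sample_rate, cut_time_s):
    """Drop everything recorded after the cut mark, up to the pause mark at the end."""
    cut = int(round(cut_time_s * sample_rate))
    # Clamp so that a cut mark outside the recording leaves a valid result.
    cut = min(max(cut, 0), len(samples))
    return samples[:cut]


recorded = list(range(200))          # paused after 20 s at a toy rate of 10 samples/s
kept = cut_tail(recorded, 10, 15.0)  # cut mark released at 15 s
kept += [0] * 50                     # recording resumes and appends 5 s
print(len(kept) / 10)  # 20.0
```

Because the pause mark sits at the end of the recording, only the cut mark's time point is needed to identify the segment to remove.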
  • the execution body may be an audio signal processing apparatus, or a control module in the audio signal processing apparatus for executing the audio signal processing method.
  • the audio signal processing method provided by the embodiment of the present application is described by taking an audio signal processing apparatus executing the audio signal processing method as an example.
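  • FIG. 12 is a schematic structural diagram of an audio signal processing apparatus according to an exemplary embodiment. As shown in FIG. 12, the audio signal processing apparatus 1200 includes a receiving module 1201, a display module 1202, an adding module 1203, a segmentation module 1204, and a processing module 1205.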
  • the receiving module 1201 is used for receiving the first input.
  • the display module 1202 is configured to record the original audio signal in response to the first input, and display the recording track of the original audio signal, where the recording track is used to indicate the time axis of the original audio signal.
  • the adding module 1203 is configured to add at least one division mark on the recording track, where the division mark is used to divide the recording track into at least two track segments.
  • the segmentation module 1204 is configured to divide the original audio signal into audio segments corresponding to the track segments based on the time points on the time axis corresponding to the division marks.
  • the processing module 1205 is configured to process the audio segment in the original audio signal corresponding to the track segment based on the input of the track segment to obtain the target audio signal.
  • the processing module 1205 is specifically configured to, in response to the second input, determine the track segment to be modified from the at least two track segments; obtain a correction audio signal; and replace the audio segment to be modified with the correction audio signal, where the audio segment to be modified is the audio segment in the original audio signal corresponding to the track segment to be modified.
  • the processing module 1205 is specifically configured to, in response to the third input, determine the track segment to be deleted from the at least two track segments; delete the audio segment corresponding to the track segment to be deleted in the original audio signal.
  • the apparatus 1200 may further include: a suspending module, configured to suspend the recording of the original audio signal when the fourth input is received; and continue the recording of the original audio signal when the fifth input is received.
  • the apparatus 1200 may further include: a deletion module for displaying a pause mark at the end of the recording track; in response to the sixth input, adding a cut mark corresponding to the pause mark at the target position of the recording track, where the pause mark and the cut mark are used to delimit the track segment to be cut from the recording track; and deleting the audio segment corresponding to the track segment to be cut from the original audio signal.
  • the apparatus 1200 may further include:
  • a determination module for determining a target track segment from the at least two track segments in response to the seventh input.
  • the sending module is used for determining the target audio segment corresponding to the target track segment from the target audio signal, and sending the target audio segment.
  • the adding module 1203 is specifically configured to add a division mark on the position of the recording track corresponding to the current moment if the eighth input is received during the recording of the original audio signal.
  • the adding module 1203 is specifically used to determine a pause interval in the original audio signal whose pause duration is greater than or equal to a preset duration, and to determine the start time and end time of the pause interval on the time axis; and to determine, from the recording track, the target track segment located between the start time and the end time, and add a division mark on the target track segment.
  • the adding module 1203 is specifically configured to, in response to the ninth input, determine a division position in the recording track, and add a division mark on the division position.
  • the electronic device receives a first input, records an original audio signal in response to the first input, displays a recording track of the original audio signal, divides the recording track into at least two track segments by means of division marks, divides the original audio signal into audio segments corresponding to the track segments based on the division marks, and, based on an input on a track segment, processes the corresponding audio segment in the original audio signal to obtain the target audio signal.
  • while recording, the user can divide the audio signal into multiple audio segments through the track segments and, by operating on a track segment, process a problematic audio segment. This avoids re-recording the audio signal and can improve the efficiency of voice communication.
  • the audio signal processing apparatus in this embodiment of the present application may be an apparatus, or may be a component, an integrated circuit, or a chip in a terminal.
  • the apparatus may be a mobile electronic device or a non-mobile electronic device.
  • the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA).
  • the non-mobile electronic device may be a server, network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, or a self-service machine, which is not specifically limited in the embodiments of this application.
  • the audio signal processing apparatus in the embodiment of the present application may be an apparatus having an operating system.
  • the operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
  • the audio signal processing apparatus provided in the embodiment of the present application can implement each process implemented by the method embodiment of FIG. 1 or FIG. 4 , and to avoid repetition, details are not described here.
  • the electronic device 1300 includes a processor 1301, a memory 1302, and a program or instruction stored on the memory 1302 and executable on the processor 1301; when the program or instruction is executed by the processor 1301, each process of the above audio signal processing method embodiment is implemented and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
  • the electronic devices in the embodiments of the present application include the aforementioned mobile electronic devices and non-mobile electronic devices.
  • FIG. 14 is a schematic diagram of a hardware structure of an electronic device provided according to an exemplary embodiment.
  • the electronic device 1400 includes but is not limited to: a radio frequency unit 1401, a network module 1402, an audio output unit 1403, an input unit 1404, a sensor 1405, a display unit 1406, a user input unit 1407, an interface unit 1408, a memory 1409, a processor 1410, and other components.
  • the electronic device 1400 may also include a power supply (such as a battery) for supplying power to the various components; the power supply may be logically connected to the processor 1410 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
  • the structure of the electronic device shown in FIG. 14 does not constitute a limitation on the electronic device.
  • the electronic device may include more or fewer components than those shown in the figure, combine some components, or arrange the components differently, which is not described in detail here.
  • a display unit 1406 for receiving a first input
  • the user input unit 1407 is used to input the original audio signal in response to the first input, and the display unit 1406 is also used to display the recording track of the original audio signal, and the recording track is used to indicate the time axis of the original audio signal;
  • the display unit 1406 is further configured to add at least one division mark on the recording track, and the division mark is used to divide the recording track into at least two track segments;
  • the processor 1410 is configured to divide the original audio signal into audio segments corresponding to the track segments based on the time points on the time axis corresponding to the division marks.
  • the processor 1410 is configured to process the audio segments in the original audio signal corresponding to the track segments based on the input of the track segments to obtain the target audio signal.
  • the electronic device receives a first input, records an original audio signal in response to the first input, displays a recording track of the original audio signal, divides the recording track into at least two track segments by means of division marks, divides the original audio signal into audio segments corresponding to the track segments based on the division marks, and, based on an input on a track segment, processes the corresponding audio segment in the original audio signal to obtain the target audio signal.
  • while recording, the user can divide the audio signal into multiple audio segments through the track segments and, by operating on a track segment, process a problematic audio segment. This avoids re-recording the audio signal and can improve the efficiency of voice communication.
  • the processor 1410 is specifically configured to, in response to the second input, determine the track segment to be modified from the at least two track segments; obtain a correction audio signal; and replace the audio segment to be modified with the correction audio signal, where the audio segment to be modified is the audio segment in the original audio signal corresponding to the track segment to be modified.
  • the user can replace a problematic audio segment in the original audio signal through its track segment, which makes the segment easy to correct, avoids re-recording the audio signal, and can improve the efficiency of voice communication.
  • the processor 1410 is specifically configured to, in response to the third input, determine the track segment to be deleted from the at least two track segments; delete the audio segment corresponding to the track segment to be deleted in the original audio signal.
  • when the user deletes a track segment from the recording track, the corresponding audio segment in the audio signal is deleted, which makes it easy to remove problematic audio segments and solves the problem of having to re-record the audio signal when it contains a problem.
  • the processor 1410 is further configured to suspend the recording of the original audio signal in the case of receiving the fourth input; and continue the recording of the original audio signal in the case of receiving the fifth input.
  • when an error occurs while the original audio signal is being recorded, the user can suspend the recording in time and correct the audio just recorded, which makes timely correction easier and improves the efficiency of audio recording.
  • the display unit 1406 is also used to display a pause mark at the end of the recording track and, in response to the sixth input, add a cut mark corresponding to the pause mark at the target position of the recording track, where the pause mark and the cut mark are used to delimit the track segment to be cut from the recording track; the processor 1410 is further configured to delete the audio segment corresponding to the track segment to be cut from the original audio signal.
  • the processor 1410 is further configured to, in response to the seventh input, determine a target track segment from the at least two track segments; determine the target audio segment corresponding to the target track segment from the target audio signal; and send the target audio segment.
  • the user can select audio segments in the target audio signal through track segments, and send different audio segments to different sending objects, which can realize segmented transmission of audio signals and improve the efficiency of voice communication.
  • the display unit 1406 is specifically configured to add a division mark on the position of the recording track corresponding to the current moment if the eighth input is received during the recording of the original audio signal.
  • if the user notices an error in the audio being recorded, the user can promptly add a division mark at the position of the recording track corresponding to the current moment.
  • the user can then identify the audio segments that need processing from the division marks added during recording and quickly process the problematic audio segments in the original audio signal.
  • the processor 1410 is specifically configured to determine a pause interval in the original audio signal whose pause duration is greater than or equal to a preset duration, and to determine the start time and end time of the pause interval on the time axis; and to determine, from the recording track, the target track segment located between the start time and the end time, and add a division mark on the target track segment.
  • the electronic device can add division marks at the corresponding positions of the recording track according to the pauses in the original audio signal, so that division marks are added automatically, which simplifies the user's operation and improves the processing efficiency of audio signals.
  • the display unit 1406 is specifically configured to, in response to the ninth input, determine a division position in the recording track, and add a division mark on the division position.
  • after the original audio signal has been recorded, the user can manually add division marks to the recording track, which makes it easy to divide the audio signal into several audio segments and process the original audio signal segment by segment.
  • the input unit 1404 may include a graphics processor (Graphics Processing Unit, GPU) 14041 and a microphone 14042. The graphics processor 14041 processes still-picture or video image data obtained by an image capture device (such as a camera).
  • the display unit 1406 may include a display panel 14081, which may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like.
  • the user input unit 1407 includes a touch panel 14081 and other input devices 14072 .
  • the touch panel 14081 is also called a touch screen.
  • the touch panel 14081 may include two parts, a touch detection device and a touch controller.
  • Other input devices 14072 may include but are not limited to physical keyboards, function keys (such as volume control keys, switch keys, etc.), trackballs, mice, and joysticks, which will not be repeated here.
  • Memory 1409 may be used to store software programs as well as various data including, but not limited to, application programs and operating systems.
  • the processor 1410 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may alternatively not be integrated into the processor 1410.
  • the embodiments of the present application further provide a readable storage medium storing a program or an instruction; when the program or instruction is executed by a processor, each process of the above audio signal processing method embodiment is implemented and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
  • the processor is the processor in the electronic device described in the foregoing embodiments.
  • the readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk, and the like.
  • An embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement each process of the above audio signal processing method embodiment and achieve the same technical effect. To avoid repetition, details are not repeated here.
  • the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip.
  • the method of the above embodiment can be implemented by software plus a necessary general-purpose hardware platform, and of course can also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or a CD-ROM) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the various embodiments of this application.

Abstract

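The present application discloses an audio signal processing method and apparatus, an electronic device, and a readable storage medium, and belongs to the field of electronic technology. The method includes: in response to a first input, recording an original audio signal and displaying a recording track of the original audio signal; dividing the recording track into at least two track segments by means of division marks; dividing the original audio signal, based on the division marks, into audio segments corresponding to the track segments; and, based on an input on a track segment, processing the corresponding audio segment in the original audio signal to obtain a target audio signal.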

Description

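Audio Signal Processing Method and Apparatus, Electronic Device, and Readable Storage Medium
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 202110090251.3, filed in China on January 22, 2021, the entire contents of which are incorporated herein by reference.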
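Technical Field
The present application belongs to the field of electronic technology, and specifically relates to an audio signal processing method and apparatus, an electronic device, and a readable storage medium.
Background
With the development of Internet technology, instant messaging tools are used ever more widely, and users can use them to send and receive pictures, video, audio, text, and other information in real time. In instant messaging, audio input is simple and fast, so it is popular with a growing number of users.
In the course of implementing the present application, the inventor found at least the following problem in the prior art: while recording an audio signal, users often misspeak or express themselves unclearly, so the audio signal contains wrong or unclear information. In that case the recorded audio signal can only be discarded and a new one recorded, which reduces the efficiency of voice communication.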
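Summary
The purpose of the embodiments of the present application is to provide an audio signal processing method and apparatus, an electronic device, and a readable storage medium that can solve the problem of having to record a new audio signal when the audio signal contains wrong or unclear information.
To solve the above technical problem, the present application is implemented as follows:
In a first aspect, an embodiment of the present application provides an audio signal processing method, including:
receiving a first input;
in response to the first input, recording an original audio signal and displaying a recording track of the original audio signal, where the recording track is used to indicate the time axis of the original audio signal;
adding at least one division mark on the recording track, where the division mark is used to divide the recording track into at least two track segments;
dividing the original audio signal into audio segments corresponding to the track segments based on the time points on the time axis corresponding to the division marks; and
processing, based on an input on a track segment, the audio segment in the original audio signal corresponding to that track segment to obtain a target audio signal.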
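In a second aspect, an embodiment of the present application provides an audio signal processing apparatus, including:
a receiving module, configured to receive a first input;
a display module, configured to, in response to the first input, record an original audio signal and display a recording track of the original audio signal, where the recording track is used to indicate the time axis of the original audio signal;
an adding module, configured to add at least one division mark on the recording track, where the division mark is used to divide the recording track into at least two track segments;
a segmentation module, configured to divide the original audio signal into audio segments corresponding to the track segments based on the time points on the time axis corresponding to the division marks; and
a processing module, configured to process, based on an input on a track segment, the audio segment in the original audio signal corresponding to that track segment to obtain a target audio signal.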
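In a third aspect, an embodiment of the present application provides an electronic device including a processor, a memory, and a program or instruction stored on the memory and executable on the processor, where the program or instruction, when executed by the processor, implements the steps of the method according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a readable storage medium storing a program or instruction that, when executed by a processor, implements the steps of the method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip including a processor and a communication interface, where the communication interface is coupled to the processor and the processor is configured to run a program or instruction to implement the method according to the first aspect.
In the embodiments of the present application, the electronic device receives a first input, records an original audio signal in response to the first input, displays a recording track of the original audio signal, divides the recording track into at least two track segments by means of division marks, divides the original audio signal into audio segments corresponding to the track segments based on the division marks, and, based on an input on a track segment, processes the corresponding audio segment in the original audio signal to obtain a target audio signal. While recording, the user can divide the audio signal into multiple audio segments through the track segments and, by operating on a track segment, process a problematic audio segment. This avoids re-recording the audio signal and can improve the efficiency of voice communication.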
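Brief Description of the Drawings
FIG. 1 is a flowchart of the steps of an audio signal processing method provided according to an exemplary embodiment;
FIG. 2 is a schematic diagram of a chat interface provided according to an exemplary embodiment;
FIG. 3 is a schematic diagram of another chat interface provided according to an exemplary embodiment;
FIG. 4 is a flowchart of the steps of another audio signal processing method provided according to an exemplary embodiment;
FIG. 5 is a schematic diagram of yet another chat interface provided according to an exemplary embodiment;
FIG. 6 is a schematic diagram of yet another chat interface provided according to an exemplary embodiment;
FIG. 7 is a schematic diagram of an audio sending interface provided according to an exemplary embodiment;
FIG. 8 is a schematic diagram of another audio sending interface provided according to an exemplary embodiment;
FIG. 9 is a schematic diagram of yet another audio sending interface provided according to an exemplary embodiment;
FIG. 10 is a schematic diagram of yet another chat interface provided according to an exemplary embodiment;
FIG. 11 is a schematic diagram of yet another chat interface provided according to an exemplary embodiment;
FIG. 12 is a schematic structural diagram of an audio signal processing apparatus provided according to an exemplary embodiment;
FIG. 13 is a schematic structural diagram of an electronic device provided according to an exemplary embodiment;
FIG. 14 is a schematic diagram of the hardware structure of an electronic device provided according to an exemplary embodiment.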
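Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some rather than all of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
The terms "first", "second", and the like in the description and claims of the present application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present application can be implemented in orders other than those illustrated or described here. Objects distinguished by "first", "second", and the like are usually of one type, and their number is not limited; for example, there may be one or more first objects. In addition, "and/or" in the description and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The audio signal processing method provided by the embodiments of the present application is described in detail below through specific embodiments and their application scenarios, with reference to the accompanying drawings.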
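FIG. 1 is a flowchart of the steps of an audio signal processing method provided according to an exemplary embodiment. As shown in FIG. 1, the method includes:
Step 101: receive a first input.
Step 102: in response to the first input, record an original audio signal and display a recording track of the original audio signal.
The recording track is used to indicate the time axis of the original audio signal.
In this embodiment, the audio signal processing method may be executed by an electronic device that has components such as a display screen and a microphone, for example a mobile phone, a notebook computer, or a wearable device. The first input is used to make the electronic device start recording the original audio signal and display the corresponding recording track on the display screen. The original audio signal is the sound signal to be recorded into the electronic device; it may be a sound signal produced by the user or a sound signal from the environment of the electronic device.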
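For example, the first input may be a click on the record button in the recording interface; in response, the electronic device starts recording the original audio signal and displays the recording track in the recording interface. FIG. 2 is a schematic diagram of a chat interface provided according to an exemplary embodiment. As shown in FIG. 2, the user can operate an interface display control on the display screen, such as a virtual button in the chat interface, and in response the electronic device displays the recording interface 201 at the bottom of the chat interface, with a virtual record button 202 at the bottom of the recording interface 201. In response to the user clicking the record button 202, the electronic device starts the microphone to capture sound and begins recording the original audio signal. At the same time, the electronic device displays a track axis 203 in the recording interface 201 and displays the recording track 204 on the track axis 203. The recording track 204 indicates the time axis of the original audio signal, and the time axis corresponds to the duration of the original audio signal, so the length of the recording track 204 represents the duration of the recorded signal. During recording, the duration of the original audio signal keeps increasing, and the length of the recording track 204 grows in step with it. As shown in FIG. 2, when the electronic device starts recording at second 0, the recording track 204 starts at the left end of the track axis 203 and lengthens as the duration increases: at second 10, the original audio signal is 10 seconds long and the recording track 204 also corresponds to 10 seconds; when the recording reaches 40 seconds, the original audio signal is 40 seconds long and the recording track 204 corresponds to 40 seconds.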
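In practice, the recording track may also be displayed directly in the chat interface. Its form includes but is not limited to the straight line shown in FIG. 2; it may also be a curve, a histogram, a pie chart, or another form. The first input may be a click on the record button in the recording interface, a press of a physical button on the electronic device, or a slide along a preset direction on the display screen. This embodiment does not limit the specific form of the first input or of the recording track.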
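Step 103: add at least one division mark on the recording track.
The division mark is used to divide the recording track into at least two track segments. The electronic device may add division marks on the recording track automatically or in response to user input, and the division marks divide the recording track into at least two track segments.
Optionally, step 103 may be implemented as follows:
during recording of the original audio signal, if an eighth input is received, add a division mark at the position of the recording track corresponding to the current moment.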
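For example, the eighth input may be a user input of clicking a mark-adding button, through which the user can manually add division marks on the recording track during recording. As shown in FIG. 2, a mark-adding button 205 is displayed in the recording interface 201. While the user speaks, the electronic device captures the user's voice in real time. If the user notices a misstatement at second 10, the user can click the mark-adding button 205, and in response the electronic device adds a division mark 206 at the position of the recording track corresponding to second 10, that is, at the end of the recording track 204 at the current moment.
In one embodiment, the eighth input may be a user input of clicking the recording track directly. Continuing the example above, if the user clicks the recording track 204 at second 10 during recording, the electronic device can, in response, add the division mark 206 at the end of the recording track 204 corresponding to the current moment.
In another embodiment, the eighth input may be a user input of double-clicking the recording interface. Continuing the example, if the user double-clicks the recording interface 201 at second 3 during recording, the electronic device can, in response, add the division mark 206 at the end of the recording track 204 corresponding to the current moment. It should be noted that the specific form of the eighth input includes but is not limited to clicking the mark-adding button, clicking the recording track, or double-clicking the recording interface as described above.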
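In this embodiment, the electronic device can divide the recording track into at least two track segments by means of division marks. For example, FIG. 3 is a schematic diagram of another chat interface provided according to an exemplary embodiment, showing the chat interface after recording of the original audio signal is complete. With reference to FIG. 2, if the user finishes recording at second 40, the user can click the record button 202 again; in response, the electronic device stops capturing sound, obtains the original audio signal, and stops extending the recording track 204, yielding the recording track 204 shown in FIG. 3, which represents the duration of the original audio signal. The recording track 204 also displays the division mark 206 that the user added manually. The recording track 204 corresponds to 40 seconds and the division mark 206 to second 10, so the division mark 206 divides the recording track 204 at second 10 into a first track segment to the left of the division mark 206 and a second track segment to its right.
In practice, if the user notices during recording that the audio currently being recorded contains an error, the user can promptly add a division mark at the position of the recording track corresponding to the current moment. The user can then identify the audio segments that need processing from the division marks added during recording and quickly process the problematic audio segments in the original audio signal.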
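Step 104: divide the original audio signal into audio segments corresponding to the track segments based on the time points on the time axis corresponding to the division marks.
In this embodiment, the electronic device can divide the original audio signal based on the time points corresponding to the division marks. Continuing the example, the original audio signal is 40 seconds long and the division mark 206 corresponds to second 10. After recording the 40-second original audio signal, the electronic device determines that the division mark 206 corresponds to second 10 and divides the original audio signal there into a first audio segment from second 0 to second 10 and a second audio segment from second 10 to second 40. The first audio segment corresponds to the first track segment of the recording track 204, from second 0 to second 10, and the second audio segment corresponds to the second track segment, from second 10 to second 40.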
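For illustration only, the division in step 104 could be sketched as follows. The sketch assumes the original audio signal is a plain list of samples at a known sample rate and that each division mark has already been resolved to a time point in seconds; the function name `split_by_markers` and the toy sample rate are hypothetical and are not part of the application.

```python
def split_by_markers(samples, sample_rate, marker_times):
    """Split samples at each division-mark time (in seconds) into consecutive segments."""
    total = len(samples)
    # Convert mark times to sample indices; duplicates collapse in the set.
    cuts = sorted({int(round(t * sample_rate)) for t in marker_times})
    # Ignore marks that fall on or outside the ends of the signal.
    cuts = [c for c in cuts if 0 < c < total]
    bounds = [0] + cuts + [total]
    return [samples[a:b] for a, b in zip(bounds, bounds[1:])]


# A 40-second signal at a toy rate of 10 samples per second, one mark at second 10.
signal = list(range(400))
first, second = split_by_markers(signal, 10, [10.0])
print(len(first) / 10, len(second) / 10)  # 10.0 30.0
```

With one mark at second 10, the result matches the example in the text: a first audio segment covering seconds 0 to 10 and a second covering seconds 10 to 40.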
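Step 105: based on an input on a track segment, process the audio segment in the original audio signal corresponding to that track segment to obtain a target audio signal.
In this embodiment, after recording is complete, the user can process the audio segments of the original audio signal through the track segments of the recording track to obtain the target audio signal. Continuing the example, the user processes the first audio segment by operating on the first track segment and the second audio segment by operating on the second track segment, obtaining the target audio signal.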
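Optionally, step 105 may be implemented as follows:
in response to a third input, determine the track segment to be deleted from the at least two track segments;
delete the audio segment in the original audio signal corresponding to the track segment to be deleted.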
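For example, the third input may be a long press on a track segment. If the user long-presses the first track segment, the electronic device can, in response, delete the first track segment and delete the first audio segment corresponding to it from the original audio signal, yielding a recording track that contains only the second track segment and an original audio signal that contains only the second audio segment, which is the target audio signal.
In one embodiment, the third input may be a drag on a track segment. If the user long-presses the first track segment shown in FIG. 3 and drags it out of the recording interface 201, the electronic device can, in response, delete the first track segment and delete the first audio segment from the original audio signal. The form of the third input includes but is not limited to long-pressing or dragging a track segment.
In practice, when the user deletes a track segment from the recording track, the corresponding audio segment in the audio signal is deleted. This makes it easy to remove problematic audio segments and solves the problem of having to re-record the audio signal when it contains a problem.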
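A minimal sketch of the deletion described above, assuming the original audio signal has already been divided into a list of audio segments in track order. The function name `delete_segment` is hypothetical; the application does not prescribe a data structure.

```python
def delete_segment(segments, index):
    """Return the target audio formed by concatenating every segment except one."""
    if not 0 <= index < len(segments):
        raise IndexError("no such track segment")
    target = []
    for i, seg in enumerate(segments):
        if i != index:
            target.extend(seg)
    return target


# First segment: seconds 0-10; second segment: seconds 10-40 (toy rate of 10 samples/s).
segments = [list(range(0, 100)), list(range(100, 400))]
target = delete_segment(segments, 0)   # third input selects the first track segment
print(len(target) / 10)  # 30.0
```

Deleting the first segment leaves only the second audio segment, which then serves as the target audio signal.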
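In summary, in this embodiment, the electronic device receives a first input, records an original audio signal in response to the first input, displays a recording track of the original audio signal, divides the recording track into at least two track segments by means of division marks, divides the original audio signal into audio segments corresponding to the track segments based on the division marks, and, based on an input on a track segment, processes the corresponding audio segment in the original audio signal to obtain a target audio signal. While recording, the user can divide the audio signal into multiple audio segments through the track segments and, by operating on a track segment, process a problematic audio segment. This avoids re-recording the audio signal and can improve the efficiency of voice communication.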
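FIG. 4 is a flowchart of the steps of another audio signal processing method provided according to an exemplary embodiment. As shown in FIG. 4, the method includes:
Step 401: receive a first input.
Step 402: in response to the first input, record an original audio signal and display a recording track of the original audio signal.
Step 403: add at least one division mark on the recording track.
Optionally, step 403 may also be implemented as follows:
in response to a ninth input, determine a division position in the recording track and add a division mark at the division position.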
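In this embodiment, after recording of the original audio signal is complete, the user can manually add division marks on the recording track. For example, FIG. 5 is a schematic diagram of yet another chat interface provided according to an exemplary embodiment. The ninth input may be a drag on a target division mark among the division marks already added. After recording is complete, the user can long-press the division mark 206 shown in FIG. 3 and drag it along the recording track 204. In response, the electronic device determines where the drag is released, takes that release position as a new division position, and adds a new division mark 207 there. While dragging, the user can move the division mark 206 leftward along the recording track 204 to add the new division mark 207 to its left, or rightward to add a new division mark to its right.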
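In one embodiment, the ninth input may be a user input of clicking the recording track directly; in response, the electronic device determines the clicked position as the division position and adds a division mark there. In practice, the user can estimate the duration of the original audio signal from the recording time and, when adding a division mark manually, roughly estimate the division position where the mark is needed.
In another embodiment, after determining the division position, the electronic device can play the audio content at that position so that the user can adjust the division position accordingly. As shown in FIG. 5, when the user drags the division mark 206 to the division position of the division mark 207, the electronic device can play the original audio signal starting from the time point corresponding to the division mark 207. The user can then judge from the playback whether that division position is the one needed. If it is not, the user can keep dragging the division mark 206 and release it elsewhere on the recording track 204 to determine a new division position, and the electronic device again plays the audio at that position. These steps repeat until a division position that meets the user's needs is determined, and a division mark is added there.
In one scenario, if the user determines during recording that the sound being recorded at the current moment is problematic, the user can add a division mark, such as the division mark 206, to the recording track. After recording is complete, the user can drag that division mark to add a corresponding division mark, such as the division mark 207, to the recording track, thereby obtaining a track segment that needs processing (the track segment between the division mark 206 and the division mark 207), so that the corresponding audio segment can be processed.
In practice, after recording is complete, the user can manually add division marks to the recording track, which makes it easy to divide the audio signal into several audio segments and process the original audio signal segment by segment.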
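Both the drag-and-release and the click forms of the ninth input require turning a position on the recording track into a time point on the time axis. The application does not specify this mapping; the sketch below assumes a straight-line track whose length is proportional to the recorded duration. The function name `position_to_time` and the pixel-based parameters are hypothetical.

```python
def position_to_time(x, track_left, track_length_px, duration_s):
    """Map a horizontal release position on the recording track to a time point."""
    if track_length_px <= 0 or duration_s <= 0:
        raise ValueError("empty recording track")
    # Clamp so that a release slightly outside the track still yields a valid time.
    offset = min(max(x - track_left, 0), track_length_px)
    return duration_s * offset / track_length_px


# A 40-second recording drawn as a 300-pixel track starting at x = 20.
print(position_to_time(95, 20, 300, 40.0))  # 10.0
```

The resulting time point can then be used as the division position, or as the starting point for playing back the audio at that position.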
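Optionally, step 403 may be implemented as follows:
determine a pause interval in the original audio signal whose pause duration is greater than or equal to a preset duration, and determine the start time and end time of the pause interval on the time axis;
determine, from the recording track, the target track segment located between the start time and the end time, and add a division mark on the target track segment.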
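For example, during recording the electronic device can analyze the original audio signal, determine the pause intervals in it, and add division marks on the corresponding target track segments. Suppose that, while capturing the user's voice, the intensity of the captured audio signal is less than or equal to a preset intensity threshold from second 10 onward; the electronic device can determine that the user stopped speaking at second 10. If the intensity stays at or below the threshold until second 15, the electronic device can determine that the user did not speak between second 10 and second 15. If the interval between second 10 and second 15 is also greater than the preset duration (for example, 4 seconds), the period from second 10 to second 15 is determined to be a pause interval whose start time on the time axis is second 10 and whose end time is second 15. The electronic device can then determine the track segment between second 10 and second 15 in the recording track as the target track segment and add a division mark at any position on it, that is, anywhere between second 10 and second 15.
It should be noted that the electronic device may also analyze the original audio signal after recording is complete, determine one or more pause intervals, and add division marks at the corresponding positions of the recording track. The ways of determining a pause interval include but are not limited to using the intensity of the audio signal, and the specific values of the preset duration and the preset intensity threshold can be set as required; details are not described in this embodiment.
In practice, the electronic device can add division marks at the corresponding positions of the recording track according to the pauses in the original audio signal, so that division marks are added automatically, which simplifies the user's operation and improves the processing efficiency of audio signals.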
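The intensity-based pause detection described above could be sketched as follows. This is only an illustration: it assumes the signal has been reduced to one intensity value per fixed-length frame, and it places the division mark at the midpoint of each pause interval, which is one arbitrary choice among the "any position" the text allows. The function names and parameter values are hypothetical.

```python
def find_pause_intervals(levels, frame_s, threshold, min_pause_s):
    """Return (start, end) times of runs of quiet frames lasting at least min_pause_s.

    levels holds one intensity value per frame of frame_s seconds.
    """
    runs = []
    run_start = None  # index of the first quiet frame in the current run
    for i, level in enumerate(levels):
        if level <= threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            runs.append((run_start, i))
            run_start = None
    if run_start is not None:  # the signal ends during a pause
        runs.append((run_start, len(levels)))
    return [(a * frame_s, b * frame_s) for a, b in runs
            if (b - a) * frame_s >= min_pause_s]


def marks_for_pauses(intervals):
    # Any position inside the interval is allowed; the midpoint is an arbitrary choice.
    return [(start + end) / 2 for start, end in intervals]


# One level per second: speech for 10 s, silence for 5 s, speech for 25 s.
levels = [0.8] * 10 + [0.01] * 5 + [0.7] * 25
pauses = find_pause_intervals(levels, 1.0, 0.05, 4.0)
print(pauses)                    # [(10.0, 15.0)]
print(marks_for_pauses(pauses))  # [12.5]
```

The example reproduces the scenario in the text: a pause from second 10 to second 15 exceeds the 4-second preset duration, so a division mark is placed between those two time points.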
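Step 404: divide the original audio signal into audio segments corresponding to the track segments based on the time points on the time axis corresponding to the division marks.
Step 405: based on an input on a track segment, process the audio segment in the original audio signal corresponding to that track segment to obtain a target audio signal.
Optionally, step 405 may be implemented as follows:
in response to a second input, determine the track segment to be modified from the at least two track segments;
obtain a correction audio signal;
replace the audio segment to be modified with the correction audio signal, where the audio segment to be modified is the audio segment in the original audio signal corresponding to the track segment to be modified.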
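In this embodiment, the user can determine the audio segment to be modified in the original audio signal and replace it with a new audio signal; the correction audio signal is that new audio signal. As shown in FIG. 5, the track segment to be modified may be the track segment between the division mark 206 and the division mark 207, and the second input may be a double-click on a track segment. In response to the double-click, the electronic device determines the track segment between the division mark 206 and the division mark 207 as the track segment to be modified. At the same time, the electronic device can start the microphone, capture a new audio signal, use it as the correction audio signal, and replace with it the audio segment in the original audio signal that corresponds to the track segment between the division mark 206 and the division mark 207. The specific form of the second input can be set as required and is not limited in this embodiment.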
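Optionally, the step of obtaining the correction audio signal may be implemented as follows:
receive input text information and convert the text information into the correction audio signal.
In this embodiment, the electronic device can receive text information entered by the user and convert it into the correction audio signal. For example, after receiving the second input and determining the track segment to be modified, the electronic device can display a text input box; the user enters text in the box, and the electronic device receives the text and converts it into the correction audio signal. The specific method of converting text into an audio signal can be set as required and is not limited in this embodiment.
In one embodiment, the correction audio signal may be an audio signal pre-stored in the electronic device. After determining the track segment to be modified, the electronic device can display an audio list containing multiple pre-stored audio signals, and the user can select one of them as the correction audio signal. The ways of obtaining the correction audio signal include but are not limited to re-recording an audio signal, converting text into an audio signal, or selecting a pre-stored audio signal; any method of obtaining an audio signal, known or unknown in the art, can be applied in this embodiment.
In practice, the user can replace a problematic audio segment in the original audio signal through its track segment, which makes the segment easy to correct, avoids re-recording the audio signal, and can improve the efficiency of voice communication.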
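As a rough sketch of the replacement described above, the audio between two division marks can be swapped for a correction audio signal of any length, regardless of how that correction audio was obtained. The sketch assumes list-of-samples audio; the function name `replace_segment` and the 5-second position chosen for the second division mark are hypothetical.

```python
def replace_segment(samples, sample_rate, start_s, end_s, correction):
    """Replace the audio between two division marks with a correction audio signal."""
    start = int(round(start_s * sample_rate))
    end = int(round(end_s * sample_rate))
    if not 0 <= start <= end <= len(samples):
        raise ValueError("segment lies outside the original signal")
    # The correction may be shorter or longer than the segment it replaces.
    return samples[:start] + list(correction) + samples[end:]


original = [0] * 400     # 40 s at a toy rate of 10 samples/s
correction = [1] * 30    # a 3-second correction clip
target = replace_segment(original, 10, 5.0, 10.0, correction)
print(len(target) / 10)  # 38.0
```

Because the correction need not match the length of the replaced segment, the target audio signal may be shorter or longer than the original, and the recording track would need to be redrawn to match.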
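In one embodiment, after recording is complete, the user can choose either to send the original audio signal directly or to process it to obtain the target audio signal.
For example, FIG. 6 is a schematic diagram of yet another chat interface provided according to an exemplary embodiment. Continuing the example, if the user clicks the record button 202 again during recording, the electronic device can, in response, stop recording the original audio signal and display a selection interface 301 in the chat interface. The selection interface 301 includes a send control 3011 and an edit control 3012. If the user clicks the send control 3011, the electronic device sends the original audio signal directly; if the user clicks the edit control 3012, the electronic device displays the chat interface shown in FIG. 5, through which the user can process the track segments to obtain the target audio signal. The above is only an example; the specific process of choosing between sending the original audio signal directly and processing it can be set as required and is not limited in this embodiment.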
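Step 406: In response to a seventh input, determine a target track segment from the at least two track segments.
Step 407: Determine, from the target audio signal, the target audio segment corresponding to the target track segment, and send the target audio segment.
In this embodiment, after the audio segments of the original audio signal have been processed to obtain the target audio signal, if the recording track still contains at least one split marker, the user may select one or more of the at least two track segments and send the corresponding audio segments.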
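For example, FIG. 7 is a schematic diagram of an audio sending interface according to an exemplary embodiment. After the user finishes processing the audio segments, the electronic device may display the audio sending interface shown in FIG. 7, with the recording track 201 at the top and multiple recipients at the bottom. The seventh input may be a drag operation on a track segment: if the user drags the first track segment 2011 of the recording track onto the target recipient 401 among the recipients and releases it, the electronic device may, in response to the drag operation, send the audio segment corresponding to the first track segment 2011 to the target recipient 401.
In another embodiment, while the user is dragging a track segment, the electronic device may display a corresponding virtual track segment. FIG. 8 is a schematic diagram of another audio sending interface according to an exemplary embodiment. As shown in FIG. 8, while the user drags the first track segment 2011, the electronic device may display a virtual track segment 2012 corresponding to it; when the user drags the virtual track segment 2012 onto the target recipient 401 and releases it, the electronic device may send the audio segment corresponding to the first track segment 2011 to the target recipient 401.
In one embodiment, the user may choose to send the whole target audio signal directly. FIG. 9 is a schematic diagram of yet another audio sending interface according to an exemplary embodiment. As shown in FIG. 9, the seventh input may be a double tap on the recording track: if the user double-taps the recording track, the electronic device may, in response, display a virtual recording track 2013 below the recording track 201, where the virtual recording track 2013 corresponds to the entire recording track 201. The user can then drag the virtual recording track 2013 onto the target recipient and release it, and the electronic device may, in response to the drag operation, send the entire target audio signal to the target recipient.
It should be noted that, after split markers have been added to the recording track, the user may also choose not to process any audio segment and instead go straight to the audio sending interface and select a target audio segment to send.
In practice, the user can use track segments to select audio segments of the target audio signal and send different audio segments to different recipients. Sending the audio signal in segments in this way improves the efficiency of voice communication.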
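The segmented sending of steps 406 and 407 can be sketched as follows. This is only an illustration under stated assumptions: the target audio signal is held as a list of audio segments, each drag-and-release is reduced to a pair of (segment index, recipient), and an index of `None` stands for dragging the whole recording track. The name `plan_sends` and the string recipient identifiers are hypothetical; actual transmission is outside the scope of the sketch.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def plan_sends(
    segments: Sequence[Sequence[float]],
    drops: Sequence[Tuple[Optional[int], str]],
) -> Dict[str, List[List[float]]]:
    """Map each recipient to the audio payloads dropped on it, in drop order."""
    outbox: Dict[str, List[List[float]]] = {}
    for index, recipient in drops:
        if index is None:
            # Whole-track drag: send the entire target audio signal.
            payload = [sample for segment in segments for sample in segment]
        else:
            # Single-segment drag: send only the matching audio segment.
            payload = list(segments[index])
        outbox.setdefault(recipient, []).append(payload)
    return outbox
```

Keeping the mapping separate from the sending code makes it easy to send different segments to different recipients, as the text describes.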
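Optionally, before step 405, the method may further include:
upon receiving a fourth input, pausing recording of the original audio signal;
upon receiving a fifth input, resuming recording of the original audio signal.
In this embodiment, the user may pause recording of the original audio signal while it is in progress, which makes it easier to record a long original audio signal flexibly. As shown in FIG. 2, the fourth input may be a tap on the pause button 208 in the recording interface 201. If the user needs to attend to something else during recording, the user may tap the pause button 208; in response, the electronic device may stop recording the original audio signal and stop extending the recording track 204.
At the same time, the electronic device may change the display state of the pause button 208 to the paused state shown in FIG. 10, which is a schematic diagram of yet another chat interface according to an exemplary embodiment. The fifth input may be a tap on the pause button 208 while it is in the paused state: after pausing recording, if the electronic device again receives a tap on the pause button 208, it may, in response, resume recording the original audio signal and resume extending the recording track 204. At the same time, the electronic device may change the pause button 208 back to the recording state shown in FIG. 2.
In practice, the user can pause recording of the original audio signal to attend to other matters and resume recording afterwards. This lets the user handle several tasks flexibly and makes audio recording more flexible.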
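The pause and resume behavior can be sketched as a small state holder. This is a simplified model, not the device's implementation: it assumes audio arrives in chunks through a hypothetical `feed` method, and a single `toggle_pause` method stands in for both the fourth input (pause) and the fifth input (resume), since both are taps on the same button in the example.

```python
from typing import List, Sequence


class Recorder:
    """Accumulates audio chunks; chunks that arrive while paused are dropped."""

    def __init__(self, sample_rate: int) -> None:
        self.sample_rate = sample_rate
        self.samples: List[float] = []
        self.paused = False

    def toggle_pause(self) -> None:
        """Fourth input pauses recording; fifth input (same button) resumes it."""
        self.paused = not self.paused

    def feed(self, chunk: Sequence[float]) -> None:
        """Append a captured chunk unless recording is paused."""
        if not self.paused:
            self.samples.extend(chunk)

    def track_seconds(self) -> float:
        """Length of the recording track; it stops growing while paused."""
        return len(self.samples) / self.sample_rate
```

Because the track length is derived from the stored samples, the recording track stops extending while paused without any extra bookkeeping.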
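Optionally, before the step of resuming recording of the original audio signal upon receiving the fifth input, the method may further include:
displaying a pause marker at the end of the recording track;
in response to a sixth input, adding a cut marker corresponding to the pause marker at a target position on the recording track, where the pause marker and the cut marker delimit a track segment to be cut from the recording track;
deleting, from the original audio signal, the audio segment corresponding to the track segment to be cut.
In this embodiment, while recording of the original audio signal is paused, the user may modify audio segments of the original audio signal. As shown in FIG. 10, when recording is paused, the electronic device may display a pause marker 209 at the end of the recording track 204. The sixth input may be a drag on the pause marker 209: the user may drag the pause marker 209 leftward along the recording track 204 and release it at the desired position. In response to the drag operation, the electronic device may determine the position at which the pause marker was released as the target position and add a cut marker there. FIG. 11 is a schematic diagram of yet another chat interface according to an exemplary embodiment. As shown in FIG. 11, if the user releases the pause marker 209 at the target position, the electronic device may add a cut marker 210 at the target position and determine the track segment between the pause marker 209 and the cut marker 210 as the track segment to be cut. The electronic device may then determine the time point on the time axis that corresponds to the cut marker 210 and delete the part of the original audio signal that follows that time point, that is, the audio segment corresponding to the track segment between the pause marker 209 and the cut marker 210. It should be noted that the sixth input may also be a double tap or a single tap on the target position on the recording track; the specific form of the sixth input may be set as required.
In practice, when a mistake occurs during recording, the user can pause recording promptly and modify the audio that has just been recorded. This makes timely correction convenient and improves recording efficiency.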
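The cut operation can be sketched as follows, assuming the paused recording is a flat sequence of samples and the pause marker sits at its end. The function name `cut_tail` and its parameters are hypothetical. Everything recorded after the cut marker's time point is discarded, so that resumed recording continues from the cut position.

```python
from typing import List, Sequence


def cut_tail(
    samples: Sequence[float],
    sample_rate: int,
    cut_seconds: float,
) -> List[float]:
    """Drop all samples recorded after the cut marker's time point."""
    cut_index = int(round(cut_seconds * sample_rate))
    # Clamp so that a marker released outside the track cannot misbehave.
    cut_index = max(0, min(len(samples), cut_index))
    return list(samples[:cut_index])
```

Clamping the index is a defensive choice of this sketch; the source only describes releasing the marker somewhere along the recorded track.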
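It should be noted that the audio signal processing method provided in the embodiments of this application may be executed by an audio signal processing apparatus, or by a control module in that apparatus for executing the audio signal processing method. The embodiments of this application take the case in which the audio signal processing apparatus executes the method as an example to describe the audio signal processing method provided herein.
FIG. 12 is a schematic structural diagram of an audio signal processing apparatus according to an exemplary embodiment. As shown in FIG. 12, the audio signal processing apparatus 1200 includes a receiving module 1201, a display module 1202, an adding module 1203, a splitting module 1204, and a processing module 1205.
The receiving module 1201 is configured to receive a first input.
The display module 1202 is configured to, in response to the first input, record an original audio signal and display a recording track of the original audio signal, where the recording track indicates the time axis of the original audio signal.
The adding module 1203 is configured to add at least one split marker on the recording track, where the split marker divides the recording track into at least two track segments.
The splitting module 1204 is configured to split the original audio signal, based on the time points on the time axis that correspond to the split markers, into audio segments corresponding to the track segments.
The processing module 1205 is configured to, based on an input on a track segment, process the audio segment of the original audio signal that corresponds to that track segment to obtain a target audio signal.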
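Optionally, the processing module 1205 is specifically configured to: in response to a second input, determine a track segment to be modified from the at least two track segments; obtain a correction audio signal; and replace an audio segment to be modified with the correction audio signal, where the audio segment to be modified is the audio segment of the original audio signal that corresponds to the track segment to be modified.
Optionally, the processing module 1205 is specifically configured to: in response to a third input, determine a track segment to be deleted from the at least two track segments; and delete the audio segment of the original audio signal that corresponds to the track segment to be deleted.
Optionally, the apparatus 1200 may further include a pause module configured to: upon receiving a fourth input, pause recording of the original audio signal; and upon receiving a fifth input, resume recording of the original audio signal.
Optionally, the apparatus 1200 may further include a deletion module configured to: display a pause marker at the end of the recording track; in response to a sixth input, add a cut marker corresponding to the pause marker at a target position on the recording track, where the pause marker and the cut marker delimit a track segment to be cut from the recording track; and delete, from the original audio signal, the audio segment corresponding to the track segment to be cut.
Optionally, the apparatus 1200 may further include:
a determining module configured to, in response to a seventh input, determine a target track segment from the at least two track segments; and
a sending module configured to determine, from the target audio signal, the target audio segment corresponding to the target track segment, and send the target audio segment.
Optionally, the adding module 1203 is specifically configured to, if an eighth input is received while the original audio signal is being recorded, add a split marker at the position on the recording track that corresponds to the current moment.
Optionally, the adding module 1203 is specifically configured to: determine a pause interval in the original audio signal whose pause duration is greater than or equal to a preset duration, and determine the start time and end time of the pause interval on the time axis; and determine, from the recording track, the target track segment lying between the start time and the end time, and add a split marker on the target track segment.
Optionally, the adding module 1203 is specifically configured to, in response to a ninth input, determine a split position on the recording track and add a split marker at the split position.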
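In the embodiments of this application, the electronic device receives a first input; in response to the first input, records an original audio signal and displays its recording track; divides the recording track into at least two track segments by means of split markers; splits the original audio signal, based on the split markers, into audio segments corresponding to the track segments; and, based on an input on a track segment, processes the corresponding audio segment of the original audio signal to obtain a target audio signal. While recording an audio signal, the user can use track segments to divide the signal into corresponding audio segments and, by operating on the track segments, process the problematic audio segments. This avoids re-recording the audio signal and thereby improves the efficiency of voice communication.
The audio signal processing apparatus in the embodiments of this application may be a standalone apparatus, or a component, integrated circuit, or chip in a terminal. The apparatus may be a mobile electronic device or a non-mobile electronic device. For example, the mobile electronic device may be a mobile phone, tablet computer, notebook computer, palmtop computer, in-vehicle electronic device, wearable device, ultra-mobile personal computer (UMPC), netbook, or personal digital assistant (PDA); the non-mobile electronic device may be a server, network attached storage (NAS), personal computer (PC), television (TV), teller machine, or self-service kiosk. The embodiments of this application impose no specific limitation.
The audio signal processing apparatus in the embodiments of this application may be an apparatus with an operating system. The operating system may be Android, iOS, or another possible operating system; the embodiments of this application impose no specific limitation.
The audio signal processing apparatus provided in the embodiments of this application can implement each process of the method embodiments of FIG. 1 or FIG. 4; to avoid repetition, details are not repeated here.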
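FIG. 13 is a schematic structural diagram of an electronic device according to an exemplary embodiment. As shown in FIG. 13, the electronic device 1300 includes a processor 1301, a memory 1302, and a program or instructions stored in the memory 1302 and executable on the processor 1301. When executed by the processor 1301, the program or instructions implement each process of the above audio signal processing method embodiments and achieve the same technical effects; to avoid repetition, details are not repeated here.
It should be noted that the electronic devices in the embodiments of this application include the mobile and non-mobile electronic devices described above.
FIG. 14 is a schematic diagram of the hardware structure of an electronic device according to an exemplary embodiment.
The electronic device 1400 includes, but is not limited to, a radio frequency unit 1401, a network module 1402, an audio output unit 1403, an input unit 1404, a sensor 1405, a display unit 1406, a user input unit 1407, an interface unit 1408, a memory 1409, and a processor 1410.
Those skilled in the art will understand that the electronic device 1400 may further include a power supply (such as a battery) that powers the components. The power supply may be logically connected to the processor 1410 through a power management system, so that charging, discharging, and power consumption are managed through the power management system. The structure shown in FIG. 14 does not limit the electronic device; the electronic device may include more or fewer components than shown, combine certain components, or arrange the components differently, which is not elaborated here.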
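The display unit 1406 is configured to receive a first input.
The user input unit 1407 is configured to record an original audio signal in response to the first input, and the display unit 1406 is further configured to display a recording track of the original audio signal, where the recording track indicates the time axis of the original audio signal.
The display unit 1406 is further configured to add at least one split marker on the recording track, where the split marker divides the recording track into at least two track segments.
The processor 1410 is configured to split the original audio signal, based on the time points on the time axis that correspond to the split markers, into audio segments corresponding to the track segments.
The processor 1410 is further configured to, based on an input on a track segment, process the audio segment of the original audio signal that corresponds to that track segment to obtain a target audio signal.
In the embodiments of this application, the electronic device receives a first input; in response to the first input, records an original audio signal and displays its recording track; divides the recording track into at least two track segments by means of split markers; splits the original audio signal, based on the split markers, into audio segments corresponding to the track segments; and, based on an input on a track segment, processes the corresponding audio segment of the original audio signal to obtain a target audio signal. While recording an audio signal, the user can use track segments to divide the signal into corresponding audio segments and, by operating on the track segments, process the problematic audio segments. This avoids re-recording the audio signal and thereby improves the efficiency of voice communication.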
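Optionally, the processor 1410 is specifically configured to: in response to a second input, determine a track segment to be modified from the at least two track segments; obtain a correction audio signal; and replace an audio segment to be modified with the correction audio signal, where the audio segment to be modified is the audio segment of the original audio signal that corresponds to the track segment to be modified.
In practice, the user can use track segments to replace a problematic audio segment of the original audio signal. This makes it easy to correct the problematic segment without re-recording the whole audio signal, which improves the efficiency of voice communication.
Optionally, the processor 1410 is specifically configured to: in response to a third input, determine a track segment to be deleted from the at least two track segments; and delete the audio segment of the original audio signal that corresponds to the track segment to be deleted.
In practice, deleting a track segment from the recording track deletes the corresponding audio segment from the audio signal. This makes it easy to remove problematic audio segments and avoids having to re-record the audio signal when it contains a problem.
Optionally, the processor 1410 is further configured to: upon receiving a fourth input, pause recording of the original audio signal; and upon receiving a fifth input, resume recording of the original audio signal.
In practice, when a mistake occurs during recording, the user can pause recording promptly and modify the audio that has just been recorded. This makes timely correction convenient and improves recording efficiency.
Optionally, the display unit 1406 is further configured to: display a pause marker at the end of the recording track; and, in response to a sixth input, add a cut marker corresponding to the pause marker at a target position on the recording track, where the pause marker and the cut marker delimit a track segment to be cut from the recording track. The processor 1410 is further configured to delete, from the original audio signal, the audio segment corresponding to the track segment to be cut.
In practice, the user can pause recording of the original audio signal to attend to other matters and resume recording afterwards. This lets the user handle several tasks flexibly and makes audio recording more flexible.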
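Optionally, the processor 1410 is further configured to: in response to a seventh input, determine a target track segment from the at least two track segments; and determine, from the target audio signal, the target audio segment corresponding to the target track segment, and send the target audio segment.
In practice, the user can use track segments to select audio segments of the target audio signal and send different audio segments to different recipients. Sending the audio signal in segments in this way improves the efficiency of voice communication.
Optionally, the display unit 1406 is specifically configured to, if an eighth input is received while the original audio signal is being recorded, add a split marker at the position on the recording track that corresponds to the current moment.
In practice, if the user notices a mistake in the audio currently being recorded, the user can promptly add a split marker at the position on the recording track that corresponds to the current moment. After recording is complete, the audio segment associated with that split marker can be processed. The markers added during recording make it easy to locate the audio segments that need attention and to process the problematic segments of the original audio signal quickly.
Optionally, the processor 1410 is specifically configured to: determine a pause interval in the original audio signal whose pause duration is greater than or equal to a preset duration, and determine the start time and end time of the pause interval on the time axis; and determine, from the recording track, the target track segment lying between the start time and the end time, and add a split marker on the target track segment.
In practice, the electronic device can add split markers at the corresponding positions on the recording track according to the pauses in the original audio signal. Adding the markers automatically spares the user the marking operation and improves the efficiency of audio signal processing.
Optionally, the display unit 1406 is specifically configured to, in response to a ninth input, determine a split position on the recording track and add a split marker at the split position.
In practice, after recording of the original audio signal is complete, the user can manually add split markers to the recording track. This makes it easy to divide the audio signal into several audio segments and to process the original audio signal segment by segment.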
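It should be understood that, in the embodiments of this application, the input unit 1404 may include a graphics processing unit (GPU) 14041 and a microphone 14042. The graphics processing unit 14041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in video capture mode or image capture mode. The display unit 1406 may include a display panel 14061, which may be configured as a liquid crystal display, an organic light-emitting diode display, or the like. The user input unit 1407 includes a touch panel 14071 and other input devices 14072. The touch panel 14071, also called a touchscreen, may include two parts: a touch detection device and a touch controller. The other input devices 14072 may include, but are not limited to, a physical keyboard, function keys (such as volume keys and a power key), a trackball, a mouse, and a joystick, which are not elaborated here. The memory 1409 may be used to store software programs and various data, including but not limited to application programs and an operating system. The processor 1410 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, and application programs, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 1410.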
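The embodiments of this application further provide a readable storage medium storing a program or instructions that, when executed by a processor, implement each process of the above audio signal processing method embodiments and achieve the same technical effects; to avoid repetition, details are not repeated here.
The processor is the processor of the electronic device described in the above embodiments. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The embodiments of this application further provide a chip including a processor and a communication interface, where the communication interface is coupled to the processor and the processor is configured to run a program or instructions to implement each process of the above audio signal processing method embodiments and achieve the same technical effects; to avoid repetition, details are not repeated here.
It should be understood that the chip mentioned in the embodiments of this application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.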
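It should be noted that, in this document, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes it. In addition, the scope of the methods and apparatuses in the embodiments of this application is not limited to performing functions in the order shown or discussed; functions may also be performed substantially simultaneously or in reverse order, depending on the functions involved. For example, the described methods may be performed in an order different from that described, and steps may be added, omitted, or combined. Furthermore, features described with reference to some examples may be combined in other examples.
From the description of the above embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, may be embodied as a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, magnetic disk, or optical disc) and includes instructions that cause a terminal (which may be a mobile phone, computer, server, air conditioner, network device, or the like) to perform the methods described in the embodiments of this application.
The embodiments of this application have been described above with reference to the accompanying drawings, but this application is not limited to the specific implementations above, which are illustrative rather than restrictive. Inspired by this application, those of ordinary skill in the art may devise many other forms without departing from the purpose of this application and the scope protected by the claims, all of which fall within the protection of this application.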

Claims (23)

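  1. An audio signal processing method, comprising:
    receiving a first input;
    in response to the first input, recording an original audio signal and displaying a recording track of the original audio signal, wherein the recording track indicates a time axis of the original audio signal;
    adding at least one split marker on the recording track, wherein the split marker divides the recording track into at least two track segments;
    splitting the original audio signal, based on time points on the time axis that correspond to the split marker, into audio segments corresponding to the track segments; and
    based on an input on the track segments, processing the audio segment of the original audio signal that corresponds to the track segment to obtain a target audio signal.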
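  2. The method according to claim 1, wherein the processing, based on an input on the track segments, of the audio segment of the original audio signal that corresponds to the track segment to obtain a target audio signal comprises:
    in response to a second input, determining a track segment to be modified from the at least two track segments;
    obtaining a correction audio signal; and
    replacing an audio segment to be modified with the correction audio signal, wherein the audio segment to be modified is the audio segment of the original audio signal that corresponds to the track segment to be modified.
  3. The method according to claim 1, wherein the processing, based on an input on the track segments, of the audio segment of the original audio signal that corresponds to the track segment to obtain a target audio signal comprises:
    in response to a third input, determining a track segment to be deleted from the at least two track segments; and
    deleting the audio segment of the original audio signal that corresponds to the track segment to be deleted.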
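  4. The method according to claim 1, further comprising, before the processing, based on an input on the track segments, of the audio segment of the original audio signal that corresponds to the track segment to obtain a target audio signal:
    upon receiving a fourth input, pausing recording of the original audio signal; and
    upon receiving a fifth input, resuming recording of the original audio signal.
  5. The method according to claim 4, further comprising, before the resuming of recording of the original audio signal upon receiving the fifth input:
    displaying a pause marker at the end of the recording track;
    in response to a sixth input, adding a cut marker corresponding to the pause marker at a target position on the recording track, wherein the pause marker and the cut marker delimit a track segment to be cut from the recording track; and
    deleting, from the original audio signal, the audio segment corresponding to the track segment to be cut.
  6. The method according to claim 1, further comprising, after the processing, based on an input on the track segments, of the audio segment of the original audio signal that corresponds to the track segment to obtain a target audio signal:
    in response to a seventh input, determining a target track segment from the at least two track segments; and
    determining, from the target audio signal, the target audio segment corresponding to the target track segment, and sending the target audio segment.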
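  7. The method according to claim 1, wherein the adding of at least one split marker on the recording track comprises:
    if an eighth input is received while the original audio signal is being recorded, adding the split marker at the position on the recording track that corresponds to the current moment.
  8. The method according to claim 1, wherein the adding of at least one split marker on the recording track comprises:
    determining a pause interval in the original audio signal whose pause duration is greater than or equal to a preset duration, and determining a start time and an end time of the pause interval on the time axis; and
    determining, from the recording track, a target track segment lying between the start time and the end time, and adding the split marker on the target track segment.
  9. The method according to any one of claims 1 to 8, wherein the adding of at least one split marker on the recording track comprises:
    in response to a ninth input, determining a split position on the recording track, and adding the split marker at the split position.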
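  10. An audio signal processing apparatus, comprising:
    a receiving module configured to receive a first input;
    a display module configured to, in response to the first input, record an original audio signal and display a recording track of the original audio signal, wherein the recording track indicates a time axis of the original audio signal;
    an adding module configured to add at least one split marker on the recording track, wherein the split marker divides the recording track into at least two track segments;
    a splitting module configured to split the original audio signal, based on time points on the time axis that correspond to the split marker, into audio segments corresponding to the track segments; and
    a processing module configured to, based on an input on the track segments, process the audio segment of the original audio signal that corresponds to the track segment to obtain a target audio signal.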
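  11. The apparatus according to claim 10, wherein the processing module is specifically configured to:
    in response to a second input, determine a track segment to be modified from the at least two track segments;
    obtain a correction audio signal; and
    replace an audio segment to be modified with the correction audio signal, wherein the audio segment to be modified is the audio segment of the original audio signal that corresponds to the track segment to be modified.
  12. The apparatus according to claim 10, wherein the processing module is specifically configured to:
    in response to a third input, determine a track segment to be deleted from the at least two track segments; and
    delete the audio segment of the original audio signal that corresponds to the track segment to be deleted.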
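  13. The apparatus according to claim 10, further comprising a pause module configured to:
    upon receiving a fourth input, pause recording of the original audio signal; and
    upon receiving a fifth input, resume recording of the original audio signal.
  14. The apparatus according to claim 13, further comprising a deletion module configured to:
    display a pause marker at the end of the recording track;
    in response to a sixth input, add a cut marker corresponding to the pause marker at a target position on the recording track, wherein the pause marker and the cut marker delimit a track segment to be cut from the recording track; and
    delete, from the original audio signal, the audio segment corresponding to the track segment to be cut.
  15. The apparatus according to claim 10, further comprising:
    a determining module configured to, in response to a seventh input, determine a target track segment from the at least two track segments; and
    a sending module configured to determine, from the target audio signal, the target audio segment corresponding to the target track segment, and send the target audio segment.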
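  16. The apparatus according to claim 10, wherein the adding module is specifically configured to:
    if an eighth input is received while the original audio signal is being recorded, add the split marker at the position on the recording track that corresponds to the current moment.
  17. The apparatus according to claim 10, wherein the adding module is specifically configured to:
    determine a pause interval in the original audio signal whose pause duration is greater than or equal to a preset duration, and determine a start time and an end time of the pause interval on the time axis; and
    determine, from the recording track, a target track segment lying between the start time and the end time, and add the split marker on the target track segment.
  18. The apparatus according to any one of claims 10 to 17, wherein the adding module is specifically configured to:
    in response to a ninth input, determine a split position on the recording track, and add the split marker at the split position.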
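  19. An electronic device, comprising a processor, a memory, and a program or instructions stored in the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the audio signal processing method according to any one of claims 1 to 9.
  20. An electronic device configured to perform the steps of the audio signal processing method according to any one of claims 1 to 9.
  21. A readable storage medium storing a program or instructions that, when executed by a processor, implement the steps of the audio signal processing method according to any one of claims 1 to 9.
  22. A chip, comprising a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the steps of the audio signal processing method according to any one of claims 1 to 9.
  23. A computer program product, wherein the program product is stored in a storage medium and is executed by at least one processor to implement the display method according to any one of claims 1 to 9.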
PCT/CN2022/072745 2021-01-22 2022-01-19 Audio signal processing method, apparatus, electronic device and readable storage medium WO2022156709A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110090251.3A CN112887480B (zh) 2021-01-22 2021-01-22 Audio signal processing method, apparatus, electronic device and readable storage medium
CN202110090251.3 2021-01-22

Publications (1)

Publication Number Publication Date
WO2022156709A1 true WO2022156709A1 (zh) 2022-07-28

Family

ID=76050520

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/072745 WO2022156709A1 (zh) 2021-01-22 2022-01-19 Audio signal processing method, apparatus, electronic device and readable storage medium

Country Status (2)

Country Link
CN (1) CN112887480B (zh)
WO (1) WO2022156709A1 (zh)

Also Published As

Publication number Publication date
CN112887480A (zh) 2021-06-01
CN112887480B (zh) 2022-07-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22742188

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22742188

Country of ref document: EP

Kind code of ref document: A1