CN113643728B - Audio recording method, electronic equipment, medium and program product - Google Patents


Info

Publication number
CN113643728B
CN113643728B (application CN202110924280.5A)
Authority
CN
China
Prior art keywords
audio
processed
user
duration
speed
Prior art date
Legal status
Active
Application number
CN202110924280.5A
Other languages
Chinese (zh)
Other versions
CN113643728A (en)
Inventor
彭连银
余艳辉
赵俊杰
Current Assignee
Honor Device Co Ltd
Original Assignee
Honor Device Co Ltd
Priority date
Filing date
Publication date
Application filed by Honor Device Co Ltd filed Critical Honor Device Co Ltd
Priority to CN202110924280.5A
Publication of CN113643728A
Application granted
Publication of CN113643728B

Classifications

    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10Digital recording or reproducing
    • G11B20/10527Audio or video recording; Data buffering arrangements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/72Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M1/72433User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for voice messaging, e.g. dictaphones

Abstract

The application discloses an audio recording method, an electronic device, a medium and a program product. The method may be applied to an electronic device and includes: the electronic device acquires an audio frame and divides it into a plurality of audio segment units of equal duration, where two adjacent audio segment units have an overlapping area; the electronic device then adjusts the duration of the overlapping area according to a target playing speed configured by the user, obtains a processed audio frame from the adjusted overlapping area and the plurality of audio segment units, and encodes the processed audio frame to obtain an audio file. The method thus lets the user adjust the playing speed of the recorded audio file during recording, and through the time-domain overlap method the recorded audio file changes speed without changing tone.

Description

Audio recording method, electronic equipment, medium and program product
Technical Field
The present application relates to the field of audio technology, and in particular, to an audio recording method, an electronic device, a computer storage medium, and a computer program product.
Background
With the development of audio technology, more and more electronic devices (e.g., mobile phones, tablet computers) support audio recording, so a user can record sound in various scenarios with an electronic device. For example, in an educational scenario, a user may record a teacher explaining material in class; in a debate scenario, a user may record the speech of both sides of the debate.
Taking the educational scenario as an example, a user can apply variable-speed processing to the recorded audio file according to how fast the teacher spoke in class, obtaining an audio file with a comfortable speech rate and improving the efficiency of reviewing the lesson. In some examples, a class lasts 40 minutes; the teacher speaks slowly during minutes 0-20 and quickly during minutes 21-40. The user can edit the recorded audio file to obtain one with a moderate speech rate. Specifically, the audio file is split into a first part (minutes 0-20) and a second part (minutes 21-40), each part is speed-adjusted separately (the first part is sped up and the second part is slowed down), and finally the two speed-adjusted parts are spliced together to obtain the edited audio file.
However, editing an audio file in this way requires decoding, speed-changing and re-encoding the recorded file, and when different portions need separate speed adjustment, the audio file must also be split. Editing the audio file after it has been recorded is therefore cumbersome and inefficient.
Disclosure of Invention
The application aims to provide an audio recording method, an electronic device, a medium and a program product that simplify user operation and improve processing efficiency.
In order to achieve the above purpose, the application adopts the following technical scheme:
in a first aspect, the present application provides an audio recording method, which may be applied to an electronic device. The electronic device may be a mobile phone, a recording pen, a tablet computer, etc. Specifically, the method comprises the following steps:
the electronic device acquires an audio frame and divides it into a plurality of audio segment units of equal duration, where two adjacent audio segment units have an overlapping area. The electronic device can adjust the duration of the overlapping area according to the target playing speed configured by the user, obtain a processed audio frame from the adjusted overlapping area and the plurality of audio segment units, and encode the processed audio frame to obtain an audio file. For example, the electronic device obtains a processed timestamp of the processed audio frame according to the target playing speed and encodes the processed audio frame based on that timestamp to obtain the audio file.
In this method, the user can configure the target playing speed in real time while the electronic device records audio, so the recorded audio file plays at the speed the user wants without further processing such as editing. This simplifies user operation and improves processing efficiency.
In addition, the method adjusts the audio frame by time-domain overlapping: the audio frame is divided into a plurality of short audio segment units of equal duration, two adjacent units share an overlapping area, and the duration of the overlapping area is adjusted to achieve the speed change. Because the audio frame is not compressed, the frequency of the sound recorded in the original audio frame is unchanged and its tone is preserved. The method therefore lets the recorded audio file change speed without changing tone.
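The time-domain overlap idea can be sketched in a few lines of NumPy. The function names below are hypothetical, and production time-stretchers (e.g. SOLA/WSOLA) additionally search for the best waveform alignment before cross-fading; this is only a minimal illustration of splitting into equal overlapping units and rejoining them with an adjusted overlap.

```python
import numpy as np

def split_with_overlap(frame, unit_len, overlap):
    """Divide an audio frame into equal-length units whose neighbours
    share `overlap` samples (hop = unit_len - overlap)."""
    hop = unit_len - overlap
    return [frame[s:s + unit_len]
            for s in range(0, len(frame) - unit_len + 1, hop)]

def overlap_add(units, new_overlap):
    """Rejoin the units with an adjusted overlap, cross-fading the shared
    samples with linear ramps that sum to 1 at interior junctions.
    A larger overlap yields a shorter (faster-playing) frame; a smaller
    overlap yields a longer (slower-playing) one.
    Assumes new_overlap <= unit_len // 2."""
    unit_len = len(units[0])
    hop = unit_len - new_overlap
    out = np.zeros(hop * (len(units) - 1) + unit_len)
    win = np.ones(unit_len)
    if new_overlap > 0:
        ramp = np.linspace(0.0, 1.0, new_overlap)
        win[:new_overlap] = ramp           # fade-in at the unit's head
        win[-new_overlap:] = ramp[::-1]    # fade-out at the unit's tail
    for i, u in enumerate(units):
        out[i * hop:i * hop + unit_len] += u * win
    return out
```

Rejoining with the original overlap reproduces the interior of the frame exactly (the two identical overlapping copies are weighted by ramps summing to 1); enlarging the overlap shortens the output and shrinking it lengthens the output, which is the speed-change effect described above.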
In some possible implementations, the electronic device may adjust the overlapping area based on the relationship between the target playing speed configured by the user and a preset playing speed. Specifically, when the target playing speed is greater than the preset playing speed, the electronic device lengthens the duration of the overlapping area, which shortens the overall duration of the audio frame and achieves fast playback; when the target playing speed is smaller than the preset playing speed, the electronic device shortens the duration of the overlapping area, which lengthens the overall duration of the audio frame and achieves slow playback.
In some possible implementations, the electronic device may also calculate the processed duration of the overlapping area from the user-configured target playing speed. Specifically, the electronic device adjusts the duration of the overlapping area to the processed duration according to the original duration of the overlapping area, the number of audio segment units, the duration of any one audio segment unit, and the target playing speed. In some examples, noting that n units of duration T joined with overlap t span a total of n·T − (n−1)·t, and that playing at speed v requires the processed frame to last 1/v of the original, the electronic device may calculate the processed duration of the overlapping area using equation (1):

t_x = (n·T − (n·T − (n−1)·t_0) / v) / (n − 1)    (1)

where t_x is the processed duration of the overlapping area; v is the target playing speed configured by the user (for example, v = 0.5 at 0.5x speed and v = 1.5 at 1.5x speed); t_0 is the original duration of the overlapping area; n is the number of audio segment units; and T is the duration of any one of the audio segment units.
In this way, the electronic device can directly calculate the processed duration of the overlapping area from the target playing speed configured by the user, and then adjust the original overlapping area to that duration.
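The patent's own formula (1) is not legible in this text; the sketch below derives the processed overlap from the joined-duration identity stated in the description (n units of duration T with overlap t span n·T − (n−1)·t), so the exact form in the patent may differ. The function name is hypothetical.

```python
def processed_overlap(v, t0, n, T):
    """Overlap duration t_x that makes n units of duration T play back
    at v times the original speed when rejoined.

    Original frame duration: d0 = n*T - (n-1)*t0.
    Playing at speed v requires the rejoined frame to last d0 / v,
    i.e. n*T - (n-1)*t_x = d0 / v; solve for t_x.
    """
    d0 = n * T - (n - 1) * t0
    return (n * T - d0 / v) / (n - 1)
```

As a sanity check against the behaviour described earlier: at v = 1 the overlap is unchanged, v > 1 lengthens it (faster playback), and v < 1 shortens it (slower playback).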
In some possible implementations, the electronic device may obtain the processed number of sampling points from the target playing speed and the preset number of sampling points of the microphone. Then, from the processed number of sampling points, the number of channels, the sampling rate and the number of bytes per sampling point, it obtains the time interval between the processed audio frame and the preceding processed audio frame, and adds this interval to the processed timestamp of the preceding frame to obtain the processed timestamp of the current processed audio frame.
In some examples, the electronic device may calculate the processed timestamp of the processed audio frame using equation (2):

T_n = T_(n−1) + (S_n × c × b) / (c × r × b)    (2)

where T_n is the processed timestamp of the n-th processed audio frame; S_n is the processed number of sampling points of the n-th processed audio frame; c is the number of channels, e.g., c = 2; r is the sampling rate, which may be, for example, 44.1 kHz, representing 44.1k sampling points per second; and b is the number of bytes per sampling point, for example b = 2.
The electronic device can thus calculate the processed timestamp of the processed audio frame with equation (2), encode the processed audio frame according to that timestamp, and obtain the audio file. For example, the electronic device may send the processed audio frame and its processed timestamp to an audio encoder; the audio encoder outputs the encoding result to a wrapper, and the wrapper encapsulates the result into the audio file.
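The timestamp accumulation just described can be sketched as follows. This assumes S_i counts per-channel sampling points, so a frame occupies S_i·c·b bytes and the stream consumes c·r·b bytes per second; if S_i instead counts bytes, the interval would be S_i/(c·r·b). The relation between the preset and processed sampling-point counts is likewise an assumption (1/v of the preset count at speed v). All names are hypothetical.

```python
def processed_sample_count(preset_count, v):
    """Assumed relation: playing v times faster leaves 1/v of the
    microphone's preset sampling points after time-domain processing."""
    return round(preset_count / v)

def processed_timestamps(sample_counts, c=2, r=44100, b=2):
    """Accumulate the processed timestamp (in seconds) of each processed
    audio frame: frame i holds sample_counts[i] sampling points per
    channel, i.e. S_i*c*b bytes, and the stream plays c*r*b bytes per
    second, so each frame advances the clock by (S_i*c*b)/(c*r*b)."""
    ts, out = 0.0, []
    for s in sample_counts:
        ts += (s * c * b) / (c * r * b)  # simplifies to s / r
        out.append(ts)
    return out
```

Each timestamp is the previous timestamp plus the frame's interval, matching the "previous frame plus time interval" rule above; the encoder would then be fed each processed frame together with its accumulated timestamp.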
In some possible implementations, the user may configure the target playing speed before the electronic device starts audio recording, for example through a speed-adjustment control. The user may then trigger a recording request by clicking a recording control, where the recording request includes the user-configured target playing speed, and the electronic device starts recording according to the request to obtain audio frames. The electronic device can thus record at the user-configured target playing speed from the moment recording starts, which broadens the applicability of the method, meets user requirements and improves the user experience.
In some possible implementations, the user may configure the target playing speed while the electronic device is recording, for example through a speed-adjustment control, triggering a speed-adjustment request that includes the user-configured target playing speed. The electronic device then adjusts the duration of the overlapping area using the target playing speed carried in the request.
In a second aspect, the present application provides an electronic device comprising:
an acquisition unit, configured to acquire an audio frame;
a splitting unit, configured to split the audio frame into a plurality of audio segment units with equal duration; overlapping areas exist in two adjacent audio segment units in the plurality of audio segment units;
the adjusting unit is used for adjusting the duration of the overlapped area according to the target playing speed configured by the user;
a synthesizing unit, configured to obtain a processed audio frame according to the adjusted overlapping area and the plurality of audio segment units;
the encoding unit is used for acquiring the processed time stamp of the processed audio frame according to the target playing speed; and encoding the processed audio frame according to the processed timestamp of the processed audio frame to obtain an audio file.
In some possible implementations, the adjusting unit is specifically configured to lengthen the duration of the overlapping area when the target playing speed configured by the user is greater than a preset playing speed, and to shorten the duration of the overlapping area when the target playing speed configured by the user is smaller than the preset playing speed.
In some possible implementations, the adjusting unit is specifically configured to adjust the duration of the overlapping area to be the processed duration according to the target playing speed configured by the user, the duration of the overlapping area, the number of the plurality of audio segment units, and the duration of any one of the plurality of audio segment units.
In some possible implementations, the encoding unit is specifically configured to obtain the processed number of sampling points according to the target playing speed and the preset number of sampling points of the microphone; acquire the time interval between the processed audio frame and the preceding processed audio frame according to the processed number of sampling points, the number of channels, the sampling rate and the number of bytes per sampling point; and add the time interval to the processed timestamp of the preceding processed audio frame to obtain the processed timestamp of the processed audio frame.
In some possible implementations, the obtaining unit is specifically configured to obtain a recording request, where the recording request includes a target playing speed configured by a user; and acquiring the audio frame according to the recording request.
In some possible implementations, the adjusting unit is specifically configured to obtain a speed-adjustment request, where the request includes a target playing speed configured by the user, and to adjust the duration of the overlapping area according to the request using that target playing speed.
In a third aspect, the present application provides an electronic device comprising a processor and a memory; wherein one or more computer programs are stored in the memory, the one or more computer programs comprising instructions; the instructions, when executed by the processor, cause the electronic device to perform the audio recording method as described in any one of the possible designs of the first aspect above.
In a fourth aspect, the present application provides a computer storage medium comprising computer instructions which, when run on an electronic device, cause the electronic device to perform the audio recording method described in any one of the possible designs of the first aspect above.
In a fifth aspect, the present application provides a computer program product which, when run on a computer, causes the computer to perform the audio recording method described in any one of the possible designs of the first aspect above.
It should be appreciated that the description of technical features, aspects, benefits or similar language in the present application does not imply that all of the features and advantages may be realized with any single embodiment. Conversely, it should be understood that the description of features or advantages is intended to include, in at least one embodiment, the particular features, aspects, or advantages. Therefore, the description of technical features, technical solutions or advantageous effects in this specification does not necessarily refer to the same embodiment. Furthermore, the technical features, technical solutions and advantageous effects described in the present embodiment may also be combined in any appropriate manner. Those of skill in the art will appreciate that an embodiment may be implemented without one or more particular features, aspects, or benefits of a particular embodiment. In other embodiments, additional features and advantages may be recognized in certain embodiments that may not be present in all embodiments.
Drawings
Fig. 1 is a diagram illustrating a composition example of an electronic device according to an embodiment of the present application;
fig. 2 is a frame diagram of an audio recording according to an embodiment of the present application;
fig. 3 is a flowchart of an audio recording method according to an embodiment of the present application;
FIGS. 4A-4C are diagrams illustrating a user interface for starting a recording by an electronic device according to an embodiment of the present application;
fig. 5A-5E are schematic diagrams of a speed control provided in an embodiment of the present application;
FIGS. 6A-6C are schematic diagrams illustrating splitting audio frames according to embodiments of the present application;
fig. 7 is a schematic diagram of a frame extraction method according to an embodiment of the present application;
fig. 8 is a schematic diagram of an electronic device according to an embodiment of the present application;
fig. 9 is a schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The terms "first", "second", "third" and the like in the description, claims and drawings are used to distinguish between different objects, not to indicate a particular order.
In embodiments of the application, words such as "exemplary" or "for example" are used to mean serving as an example, instance, or illustration. Any embodiment or design described as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, use of these words is intended to present related concepts in a concrete manner.
For clarity and conciseness in the description of the following embodiments, a brief description of the related art will be given first:
audio recording refers to capturing sound by a sound pickup device (e.g., a microphone) to obtain a sound sequence, and then packaging the sound sequence (e.g., audio frames) into an audio file. The audio file can be played by a player to recover the sound recorded by the microphone. In some embodiments, the user may record the audio of the scene of the music activity, the dialect game, the lecture game, and the classroom explanation, to obtain the audio file of the corresponding scene.
To improve the playback effect, a user usually edits the recorded audio file so that the processed file plays at a more comfortable speed, which helps the audience understand the recorded content. Taking an audio file whose content is a teacher's classroom explanation as an example, playing it at a comfortable speed improves the efficiency with which the user reviews what was taught in class.
However, obtaining an audio file with a better playback effect in this way requires decoding, speed-changing and re-encoding the recorded file, or splitting and merging it. An audio recording method that simplifies user operation and improves processing efficiency is therefore needed.
In view of this, an embodiment of the application provides an audio recording method applicable to an electronic device. In this method, the user can configure the playing speed of the recorded audio as needed during the recording process. Specifically, the electronic device acquires an audio frame and divides it into a plurality of audio segment units of equal duration, such as a first audio segment unit and a second audio segment unit, where two adjacent units have an overlapping area. The electronic device then adjusts the duration of the overlapping area based on the target playing speed configured by the user, obtains a processed audio frame from the adjusted overlapping area and the plurality of audio segment units, and encodes the processed audio frame to obtain the audio file. For example, the electronic device obtains a processed timestamp of the processed audio frame according to the target playing speed and encodes the processed audio frame based on that timestamp to obtain the audio file.
On the one hand, the method lets the user set the playing speed of the recorded audio file during recording. Once recording finishes, the audio file already plays at the speed the user wants; no further editing or other processing is needed, which simplifies user operation and improves processing efficiency.
On the other hand, the method adjusts the audio frame by time-domain overlapping. Specifically, it divides the audio frame into a plurality of audio segment units of equal duration, two adjacent units share an overlapping area, and the duration of the overlapping area is adjusted according to the user-configured target playing speed, thereby adjusting the duration of the audio frame. The method does not compress the audio frame, so the frequency of the sound recorded in the original audio frame is not compressed and its tone is unchanged. The method therefore changes the playing speed without changing the tone, further improving the user experience.
In some embodiments, the electronic device may be a voice recorder, a mobile phone, a tablet, a desktop, a laptop, a notebook, an Ultra-mobile personal computer (UMPC), a handheld computer, a netbook, a personal digital assistant (Personal Digital Assistant, PDA), a wearable electronic device, a smart watch, etc., and the application is not limited in particular to the specific form of the above-described electronic device. In this embodiment, the structure of the electronic device may be shown in fig. 1, and fig. 1 is a schematic structural diagram of the electronic device according to the embodiment of the present application.
As shown in fig. 1, the electronic device may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (universal serial bus, USB) interface 130, a charge management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, keys 190, a motor 191, an indicator 192, a camera 193, a display 194, a user identification module (subscriber identification module, SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It is to be understood that the configuration illustrated in this embodiment does not constitute a specific limitation on the electronic apparatus. In other embodiments, the electronic device may include more or fewer components than shown, or certain components may be combined, or certain components may be split, or different arrangements of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). The different processing units may be separate devices or may be integrated in one or more processors. In the present application, for example, the processor may perform the following steps: obtain an audio frame; divide it into a plurality of audio segment units of equal duration, where two adjacent units have an overlapping area; adjust the duration of the overlapping area according to the target playing speed configured by the user; obtain a processed audio frame from the adjusted overlapping area and the plurality of audio segment units; and encode the processed audio frame to obtain an audio file.
The controller can be a neural center and a command center of the electronic device. The controller can generate operation control signals according to the instruction operation codes and the time sequence signals to finish the control of instruction fetching and instruction execution.
A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that the processor 110 has just used or recycled. If the processor 110 needs to reuse the instruction or data, it can be called directly from the memory. Repeated accesses are avoided and the latency of the processor 110 is reduced, thereby improving the efficiency of the system.
In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an integrated circuit (inter-integrated circuit, I2C) interface, an integrated circuit built-in audio (inter-integrated circuit sound, I2S) interface, a pulse code modulation (pulse code modulation, PCM) interface, a universal asynchronous receiver transmitter (universal asynchronous receiver/transmitter, UART) interface, a mobile industry processor interface (mobile industry processor interface, MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (subscriber identity module, SIM) interface, and/or a universal serial bus (universal serial bus, USB) interface, among others.
The I2C interface is a bi-directional synchronous serial bus comprising a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor 110 may contain multiple sets of I2C buses. The processor 110 may be coupled to the touch sensor 180K, charger, flash, camera 193, etc., respectively, through different I2C bus interfaces. For example: the processor 110 may be coupled to the touch sensor 180K through an I2C interface, so that the processor 110 and the touch sensor 180K communicate through an I2C bus interface to implement the touch function of the electronic device.
The MIPI interface may be used to connect the processor 110 to peripheral devices such as a display 194, a camera 193, and the like. The MIPI interfaces include camera serial interfaces (camera serial interface, CSI), display serial interfaces (display serial interface, DSI), and the like. In some embodiments, processor 110 and camera 193 communicate through a CSI interface to implement the photographing function of the electronic device. The processor 110 and the display screen 194 communicate via a DSI interface to implement the display functionality of the electronic device.
The GPIO interface may be configured by software. The GPIO interface may be configured as a control signal or as a data signal. In some embodiments, a GPIO interface may be used to connect the processor 110 with the camera 193, the display 194, the wireless communication module 160, the audio module 170, the sensor module 180, and the like. The GPIO interface may also be configured as an I2C interface, an I2S interface, a UART interface, an MIPI interface, etc.
The USB interface 130 is an interface conforming to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, or the like. The USB interface 130 may be used to connect a charger to charge an electronic device, or may be used to transfer data between the electronic device and a peripheral device. And can also be used for connecting with a headset, and playing audio through the headset. The interface may also be used to connect other electronic devices, such as AR devices, etc.
It should be understood that the connection relationship between the modules illustrated in this embodiment is only illustrative, and does not limit the structure of the electronic device. In other embodiments of the present application, the electronic device may also use different interfacing manners, or a combination of multiple interfacing manners in the foregoing embodiments.
The electronic device implements display functions via a GPU, a display screen 194, an application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display screen 194 is used to display images, videos, and the like. The display 194 includes a display panel. The display panel may employ a liquid crystal display (liquid crystal display, LCD), an organic light-emitting diode (organic light-emitting diode, OLED), an active-matrix organic light-emitting diode (active-matrix organic light emitting diode, AMOLED), a flexible light-emitting diode (flex light-emitting diode, FLED), a Mini-LED, a Micro-LED, a quantum dot light-emitting diode (quantum dot light emitting diodes, QLED), or the like. In some embodiments, the electronic device may include 1 or N display screens 194, N being a positive integer greater than 1.
A series of graphical user interfaces (graphical user interface, GUIs) may be displayed on the display 194 of the electronic device, all of which are home screens of the electronic device. Generally, the size of the display 194 of an electronic device is fixed, and only limited controls can be displayed in the display 194 of the electronic device. A control is a GUI element: a software component contained within an application program that governs all data processed by the application program and the interactive operations on that data, and a user can interact with a control by direct manipulation (direct manipulation) to read or edit information about the application program. In general, controls may include visual interface elements such as icons, buttons, menus, tabs, text boxes, dialog boxes, status bars, navigation bars, and widgets.
The electronic device may implement shooting functions through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
The ISP is used to process data fed back by the camera 193. For example, when photographing, the shutter is opened, light is transmitted to the camera photosensitive element through the lens, the optical signal is converted into an electric signal, and the camera photosensitive element transmits the electric signal to the ISP for processing and is converted into an image visible to naked eyes. ISP can also optimize the noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, the ISP may be provided in the camera 193.
The camera 193 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image onto the photosensitive element. The photosensitive element may be a charge coupled device (charge coupled device, CCD) or a Complementary Metal Oxide Semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, which is then transferred to the ISP to be converted into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard RGB, YUV, or the like format. In some embodiments, the electronic device may include 1 or N cameras 193, N being a positive integer greater than 1.
The digital signal processor is used for processing digital signals, and can process other digital signals besides digital image signals. For example, when the electronic device performs frequency point selection, the digital signal processor is used to perform a Fourier transform on the frequency point energy, and so on.
Video codecs are used to compress or decompress digital video. The electronic device may support one or more video codecs. In this way, the electronic device may play or record video in a variety of encoding formats, such as: moving picture experts group (moving picture experts group, MPEG)-1, MPEG-2, MPEG-3, MPEG-4, etc.
The electronic device may implement audio functions through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, an application processor, and the like. Such as music play, audio recording, etc.
The audio module 170 is used to convert digital audio information into an analog audio signal output and also to convert an analog audio input into a digital audio signal. The audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be disposed in the processor 110, or a portion of the functional modules of the audio module 170 may be disposed in the processor 110.
The speaker 170A, also referred to as a "horn," is used to convert audio electrical signals into sound signals. The electronic device may play music or conduct a hands-free call through the speaker 170A.
The receiver 170B, also referred to as an "earpiece", is used to convert the audio electrical signal into a sound signal. When the electronic device answers a call or plays a voice message, the voice can be heard by placing the receiver 170B close to the human ear.
The microphone 170C, also referred to as a "mic" or "sound transmitter", is used to convert sound signals into electrical signals. When making a call or transmitting voice information, the user can speak with the mouth close to the microphone 170C, inputting a sound signal into the microphone 170C. The electronic device may be provided with at least one microphone 170C. In other embodiments, the electronic device may be provided with two microphones 170C, which may implement a noise reduction function in addition to collecting sound signals. In other embodiments, the electronic device may also be provided with three, four, or more microphones 170C to implement sound signal collection, noise reduction, sound source identification, directional recording functions, and the like. In some examples, the electronic device may obtain the audio frame through a microphone.
For ease of understanding, the electronic device audio recording process is described below in conjunction with the audio recording framework shown in FIG. 2.
As shown in fig. 2, the framework of audio recording may be divided into three layers, namely an application layer, a framework layer (framework), and a hardware abstraction layer (hardware abstraction layer, HAL).
The application layer may include a series of application packages, among other things. The application package may include applications such as a recorder, camera, gallery, calendar, talk, map, navigation, WLAN, bluetooth, music, video, short message, etc. The framework layer is used to provide an application programming interface (application programming interface, API) and programming framework. The hardware abstraction layer is used for packaging the Linux kernel driver, providing an interface upwards and shielding the implementation details of the bottom hardware. In some examples, the user may trigger a corresponding request by an application such as a recorder, camera, etc., and the framework layer may respond upon receiving the request, for example, by sending an instruction to the hardware abstraction layer to instruct to turn on a microphone or turn on a camera, etc.
In an audio recording scene, a recording tool (audio recorder) may output an audio frame collected by a microphone, then an electronic device processes the audio frame, then the processed audio frame (for example, an audio stream formed by the processed audio frame) is sent to an audio encoder, and then an encoding result output by the audio encoder is encapsulated by an encapsulator to obtain an audio file.
In some examples, the electronic device may divide the audio frame into a plurality of audio segment units with equal duration, where two adjacent audio segment units in the plurality of audio segment units have overlapping areas, and then adjust the duration of the overlapping areas according to a target playing speed configured by a user, so as to achieve a target of adjusting the duration of the audio frame, and further achieve a speed change effect. The electronic device then obtains a processed audio frame based on the adjusted overlap region and the plurality of audio segment units. The electronic device may also adjust the time stamp of the processed audio frame, and the audio encoder encodes the processed audio frame based on the processed time stamp.
In a video recording scene, on the basis of the audio recording scene, a video recording tool (media recorder) can output video frames collected by a camera, the electronic device then processes the video frames, sends the processed video frames (for example, a video stream formed by the processed video frames) to a video encoder, and the encapsulator then encapsulates the encoding result of the video encoder together with the encoding result of the audio encoder to obtain a video file. In some examples, the electronic device may also align the video frames with the audio frames and then send the aligned video frames and audio frames to the video encoder and the audio encoder, respectively, for encoding, so that the sound and the mouth shape in the obtained video file correspond to each other.
In some examples, the electronic device may adjust the recording frame rate of the camera according to the target playing speed configured by the user, where the preset playing speed is the current playing speed. When the target playing speed is greater than the preset playing speed, the electronic device can reduce the recording frame rate of the camera. For example, when the user sets 2-times-speed video, the recording frame rate of the camera is adjusted from 30fps to 15fps; compared with the original 30fps, 15 fewer video frames are obtained per unit time after the adjustment. When the target playing speed is smaller than the preset playing speed, the electronic device can increase the recording frame rate of the camera; for example, when the user sets 0.5-times-speed video, the recording frame rate of the camera is adjusted from 30fps to 60fps. Then, the electronic device can adjust the timestamps of the video frames obtained after adjusting the recording frame rate, and the video encoder can encode the processed video frames according to the processed timestamps.
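As an illustrative sketch (Python; the function name and millisecond timestamps are assumptions for illustration, not part of the embodiment), the timestamp adjustment described above amounts to dividing each captured timestamp by the target playing speed:

```python
def rescale_video_timestamps(timestamps_ms, target_speed):
    """Compress (or stretch) captured video-frame timestamps so the
    player shows the frames at the target playing speed."""
    return [round(t / target_speed) for t in timestamps_ms]

# 15 frames captured over 1 second after the frame rate drops to 15 fps
captured = [round(i * 1000 / 15) for i in range(15)]
played = rescale_video_timestamps(captured, target_speed=2.0)
# the restamped frames now span roughly half a second
```

At 0.5 times speed the same operation stretches the timestamps instead, matching the 30fps-to-60fps example above.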
Therefore, the audio recording method provided by the embodiment of the application can be applied not only to audio recording to obtain a speed-changed audio file, but also to video recording to obtain a speed-changed video file. Based on this, the user can also record highlight clips appearing in various scenes with the electronic device; for example, a highlight clip may be a clip of a football player shooting in a sports event. In order to give the recorded video file a better playing effect, for example, slow playback that attracts the attention of the audience, the user can record the highlight at 0.5 times speed during video recording. The user thus directly obtains a video file with a slow-motion effect without any clipping or other post-processing, which simplifies the user operation.
In order to make the technical scheme of the application clearer and easier to understand, the audio recording method provided by the embodiment of the application is described below from the perspective of the electronic device. Referring to fig. 3, an embodiment of the present application provides a flowchart of an audio recording method, which may include:
S301: the electronic device obtains an audio frame.
In some examples, the electronic device may present a user interface, as shown in fig. 4A, and according to a user's touch operation on the recorder icon 401, the electronic device may launch the application corresponding to the recorder icon 401 (which may be simply referred to as the recorder application) and present a recording preview interface as shown in fig. 4B. In addition, in the embodiment of the application, the electronic device can start the recorder application and present the recording preview interface in other ways. For example, when the mobile phone is in a black screen, a locked screen, or a user interface of a certain application, the mobile phone can respond to a voice command or a shortcut operation of the user, start the recorder application, and present the recording preview interface of the recorder application.
As shown in fig. 4B, the recording preview interface includes a speed control 420, a recording control 431, and a recording list 440. The speed control 420 is used for the user to configure a target playing speed. In some examples, the user may configure the target play speed prior to recording, so that after clicking the record control 431 the electronic device records audio directly at the user-configured target play speed. For example, the user clicks the record control 431 to trigger a record request, where the record request includes the target playing speed configured by the user (for example, the playing speed configured through the speed control 420, as described above); the electronic device then starts recording according to the record request and obtains the audio frame.
As shown in fig. 4C, after the user clicks the recording control 431, the recording control 431 is switched to a pause control 432, and an end control 450 is presented on the recording preview interface. Wherein the pause control 432 is used to pause the recording and the end control 450 is used to end the recording. Accordingly, the electronic device begins to acquire audio frames, such as by a microphone. The microphone may be a built-in microphone of the electronic device, or an external microphone connected with the electronic device, where the external microphone may be connected with the electronic device in a wired or wireless manner.
In some embodiments, the electronic device may collect an audio stream by a microphone, the audio stream being composed of a plurality of audio frames, and the audio frames acquired by the electronic device may be audio frames in the audio stream.
In some possible implementations, after the user clicks the recording control 431 and the electronic device records the audio, the user may click the speed regulation control 420 to configure the target playing speed, so as to adjust the playing speed of the recorded audio file in real time during the recording process of the electronic device.
As shown in fig. 5A, after the user clicks the speed control 420, the electronic device presents a slider control 421 on the recording preview interface, and the user can configure the target playing speed through the slider control 421. For example, the user drags the slider toward 0.5 times speed to reduce the playing speed of the audio file recorded by the electronic device, and drags the slider toward 2 times speed to increase it. The present application does not limit the manner in which the user triggers the speed adjustment operation; configuring the target play speed through the slider control 421 is merely an example, and the 0.5 to 2 times speed adjustment range shown in the figure is likewise merely an example.
In some scenes, a user may use the electronic device to record a teacher's explanation in a classroom. Suppose the teacher speaks slowly during minutes 0-20 of the class and quickly during minutes 21-40; in order to obtain an audio file with a moderate speech speed, the user can adjust the playing speed of the recorded audio file in real time during recording. As shown in FIG. 5B, the user can drag the slider toward 2 times speed during minutes 0-20, so that this part of the recorded audio file plays at a higher speed; as shown in FIG. 5C, the user can drag the slider toward 0.5 times speed during minutes 21-40, so that the playing speed of the audio file recorded during minutes 21-40 is slower. In this way, the user obtains an audio file with a moderate speech speed without clipping, which simplifies the user operation and saves the user's time.
In other examples, as shown in fig. 5D, after the user clicks the speed control 420, a speed selection control 422 is presented on the recording preview interface, and the user may configure the target playback speed by clicking a candidate speed in the speed selection control 422. For example, the user may click "1.5×" to configure the target play speed to 1.5 times speed. Configuring the target play speed through the speed selection control 422 is merely an example, and the candidate speeds of 0.5 times speed, 1 times speed, 1.5 times speed, and 2 times speed in the figure are also merely examples.
In other embodiments, as shown in FIG. 5E, when the user clicks the speed control 420, a speed input control 423 is presented on the recording preview interface, and the user can configure the target play speed through the speed input control 423. For example, the user may enter 1.5 in the speed input control 423 to configure the target playback speed to 1.5 times speed. Configuring the target speed through the speed input control 423 is merely an example, and the allowed input range (0.5-2) shown in the figure is also merely an example.
S302: the electronic equipment divides the audio frame into a plurality of audio segment units with equal duration, and two adjacent audio segment units in the plurality of audio segment units have an overlapping area.
Where an audio segment unit refers to a unit that is smaller relative to an audio frame. For example, when the duration of the audio frame is 40ms, the duration of the audio segment unit may be 1ms, so that the audio frame is composed of at least 40 audio segment units. It can be seen that the audio frame comprises a plurality of audio segment units, based on which the electronic device can split the audio frame to obtain a plurality of audio segment units.
In some possible implementations, the electronic device may divide the audio frame into a plurality of audio segment units of equal duration, and an overlapping region exists between two adjacent audio segment units of the plurality of audio segment units. The duration of the overlapping area may be a preset duration. The embodiment of the application does not specifically limit the specific value of the preset time length, and a person skilled in the art can set the specific value of the preset time length according to actual needs.
As shown in fig. 6A, a schematic diagram of an audio frame divided into a plurality of audio segment units is shown. The audio frame 600 may be divided into 3 audio segment units, an audio segment unit 611, an audio segment unit 612, and an audio segment unit 613. Wherein the duration of the audio segment unit 611, the duration of the audio segment unit 612, and the duration of the audio segment unit 613 are equal. The adjacent two audio segment units may be the audio segment unit 611 and the audio segment unit 612, or may be the audio segment unit 612 and the audio segment unit 613. Two adjacent audio segment units have an overlap region, such as an overlap region 621 between the audio segment unit 611 and the audio segment unit 612, and an overlap region 622 between the audio segment unit 612 and the audio segment unit 613. The length of time that the adjacent two audio segment units have overlapping regions is equal, for example, the length of time of the overlapping region 621 and the length of time of the overlapping region 622 are equal.
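The division described above can be sketched as follows (an illustrative Python sketch, not code from the embodiment; the unit length and overlap are expressed in samples rather than milliseconds, and all names are hypothetical):

```python
def split_into_units(samples, unit_len, overlap):
    """Split a frame into equal-length units in which each pair of
    adjacent units shares `overlap` samples (the overlapping region)."""
    step = unit_len - overlap          # each new unit starts this far after the last
    units = []
    start = 0
    while start + unit_len <= len(samples):
        units.append(samples[start:start + unit_len])
        start += step
    return units

# a toy 80-sample "frame" split into 3 units of 30 samples,
# adjacent units sharing a 5-sample overlapping region
frame = list(range(80))
units = split_into_units(frame, unit_len=30, overlap=5)
```

The tail of each unit equals the head of the next one, which is exactly the overlapping region that the later steps lengthen or shorten.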
S303: and the electronic equipment adjusts the duration of the overlapped area according to the target playing speed configured by the user.
As described above with respect to fig. 5A-5E, a user may configure a target playback speed via a throttle control 420 in the recording preview interface. In some embodiments, the user triggers a pacing request during the audio recording via a pacing control 420. For example, a user may configure a target play speed via the pacing control 420, triggering a pacing request including the user-configured target play speed. After the electronic device receives the speed regulation request, the duration of the overlapping area can be adjusted by utilizing the target playing speed configured by the user according to the speed regulation request.
In other embodiments, the user may also configure the target playback speed through physical keys on the electronic device, for example, by long-pressing the "volume +" key or the "volume -" key.
In some embodiments, the electronic device may compare the target playing speed configured by the user with a preset playing speed to determine how to adjust the overlapping area. The preset playing speed refers to the current audio recording speed. By way of example, example 1: the current time is the 5th second; during seconds 0-3 the audio is recorded at 1 times speed; at the 3rd second the user configures the recording speed to 0.5 times speed; the preset playing speed is then 0.5 times speed. Example 2: the current time is the 5th second; during seconds 0-3 the audio is recorded at 1 times speed; at the 3rd second the recording speed is configured to 0.5 times speed, and at the 4th second it is configured back to 1 times speed; the preset playing speed is then 1 times speed.
Specifically, when the target playing speed configured by the user is greater than the preset playing speed, the electronic device lengthens the duration of the overlapping area; fig. 6B shows a schematic diagram of the electronic device lengthening the duration of the overlapping area. When the target playing speed configured by the user is smaller than the preset playing speed, the electronic device shortens the duration of the overlapping area; fig. 6C shows a schematic diagram of the electronic device shortening the duration of the overlapping area.
Take the case where the target playing speed configured by the user is greater than the preset playing speed, i.e., the user wants the audio file to play the recorded sound at a faster speed. In some examples, the audio frame 600 has a duration of 40ms, the number of audio segment units is 3, the duration of each audio segment unit is 15ms, and the duration of each overlapping region is 2.5ms. The electronic device may adjust the duration of the overlapping region to 5ms based on the target playing speed configured by the user, so that the duration of the audio frame 600 becomes 35ms; the duration of the audio frame 600 thus becomes shorter. That is, the player can finish playing the audio frame 600 in less time, reducing the time required to play the audio frame 600 and increasing the playing speed of the sound recorded in the audio frame 600.
It should be noted that, in the embodiment of the present application, the audio frame is only described by taking the audio frame divided into 3 audio segment units as an example, and in practical application, the audio frame may be divided into 100, 200 or more audio segment units.
When the target playing speed configured by the user is smaller than the preset playing speed, the process of adjusting the overlapping area by the electronic device is similar to the adjustment process described when the target playing speed configured by the user is larger than the preset playing speed, and will not be repeated here.
In some embodiments, the electronic device may directly calculate the processed duration of the overlapping area according to the target playing speed configured by the user. Specifically, the electronic device adjusts the duration of the overlapping area to be the processed duration according to the target playing speed, the duration of the overlapping area, the number of the plurality of audio segment units and the duration of any audio segment unit in the plurality of audio segment units. The target playing speed is a parameter configured by a user, the duration of the overlapping area is a preset duration, the number of the plurality of audio segment units and the duration of any audio segment unit in the plurality of audio segment units, and the target playing speed is a known parameter for the electronic equipment. Based on this, the electronic apparatus can calculate the post-processing time length of the overlap region by the following formula (1):
t_x = [n × T - (n × T - (n - 1) × t_0) / v] / (n - 1)    (1)

wherein t_x is the post-processing duration of the overlapping area; v is the target playing speed configured by the user, for example, when the target playing speed configured by the user is 0.5 times speed, v = 0.5, and when the target playing speed configured by the user is 1.5 times speed, v = 1.5; t_0 is the duration of the overlapping area, i.e., the preset duration; n is the number of the plurality of audio segment units; T is the duration of any one of the plurality of audio segment units.
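As an illustrative check (Python; derived from the relations stated in the text, not code from the embodiment), the processed overlap duration follows from requiring that the processed frame duration equal the original duration divided by v:

```python
def adjusted_overlap(v, t0, n, T):
    """Post-processing duration t_x of the overlapping region.

    Original frame duration:  n*T - (n-1)*t0
    Processed frame duration: n*T - (n-1)*t_x, required to equal original/v.
    Solving for t_x gives the value below.
    """
    original = n * T - (n - 1) * t0
    return (n * T - original / v) / (n - 1)

# the worked example from the text: a 40 ms frame, 3 units of 15 ms each,
# preset overlap 2.5 ms; shortening the frame to 35 ms corresponds to a
# target speed of 40/35, which stretches each overlap to about 5 ms
t_x = adjusted_overlap(v=40 / 35, t0=2.5, n=3, T=15)
```

At v = 1 the function returns the preset overlap unchanged, as expected.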
S304: and the electronic equipment obtains the processed audio frame according to the adjusted overlapping area and the plurality of audio segment units.
As shown in fig. 6B or 6C, the adjusted overlap regions include the overlap region 621 and the overlap region 622. In some examples, the electronic device may superimpose the plurality of audio segment units based on the overlap regions 621 and 622 to obtain a processed audio frame, which may be the audio frame shown in fig. 6B or 6C.
In some possible implementations, the superposition processing of the overlap region 621 may cause the loudness of the sound recorded in the overlap region 621 to increase, so the electronic device may multiply the overlap region 621 by an attenuation coefficient to reduce the influence of the superposition on the loudness of the sound. Similarly, the electronic device may perform similar processing for the overlap region 622. In some examples, the electronic device may also multiply the processed audio frame as a whole by an attenuation coefficient, thereby reducing the impact of the superposition on the loudness of the sound and further improving the user experience.
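A minimal sketch of the superposition with attenuation (Python, operating on plain sample lists; the fixed attenuation coefficient 0.5 is an assumed value for illustration — the embodiment does not specify the coefficient):

```python
def overlap_add(units, overlap, gain=0.5):
    """Concatenate audio segment units, summing the shared region of each
    adjacent pair and scaling that sum by an attenuation coefficient so
    loudness does not jump where two units are superimposed."""
    out = list(units[0])
    for unit in units[1:]:
        head = unit[:overlap]          # start of the next unit
        tail = out[-overlap:]          # end of what we have so far
        out[-overlap:] = [gain * (a + b) for a, b in zip(tail, head)]
        out.extend(unit[overlap:])
    return out

# two 4-sample units sharing a 2-sample overlapping region
merged = overlap_add([[1, 1, 1, 1], [1, 1, 2, 2]], overlap=2)
# merged length: 4 + 4 - 2 = 6 samples
```

A larger overlap value shrinks the merged output, which is how lengthening the overlapping region shortens the processed frame.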
S305: the electronic device encodes the processed audio frame to obtain an audio file.
After the electronic device obtains the processed audio frame, the processed audio frame can be sent to an audio encoder, and then the encoding result output by the audio encoder is encapsulated to obtain an audio file.
In some embodiments, the electronic device may obtain the post-processing time stamp of the post-processing audio frame according to the target playback speed. In some possible implementations, the electronic device may obtain the post-processing sampling point based on the target playing speed and the sampling point of the microphone, then obtain a time interval between the post-processing audio frame and a previous audio frame of the post-processing audio frame according to the post-processing sampling point, the channel number, the sampling rate, and the byte number of the sampling point, and then add the time interval on the basis of the post-processing time stamp of the previous audio frame of the post-processing audio frame to obtain the post-processing time stamp of the post-processing audio frame. Specifically, the electronic device may calculate the post-processing time stamp of the post-processing audio frame by the following formula (2):
PTS_n = Δt_2 + Δt_3 + … + Δt_n    (2)

wherein PTS_n is the post-processing timestamp of the nth post-processing audio frame, and Δt_i = (S_i × c × b) / (c × r × b) is the time interval between the ith post-processing audio frame and the (i-1)th; S_i is the number of post-processing sampling points of the ith audio frame (or post-processing audio frame); c is the number of channels (channel number), e.g., c = 2; r is the sampling rate, which may be, for example, 44.1kHz, representing 44.1k sampling points per second; b is the number of bytes per sampling point (byte), for example b = 2. Here S_i × c × b is the data size of the frame in bytes and c × r × b is the number of bytes per second, so Δt_i is the playing duration of the frame.
Note that PTS (presentation time stamp) is a display timestamp, which is used to instruct the player to play the video frame data or audio frame data corresponding to the display timestamp at the time indicated by the display timestamp.
Taking the calculation of PTS_3 as an example, PTS_3 = Δt_2 + Δt_3, wherein Δt_2 is the time interval between the 2nd audio frame and the 1st audio frame, and Δt_3 is the time interval between the 3rd audio frame and the 2nd audio frame.
In some possible implementations, the electronic device obtains the processed number of sampling points based on the target play speed and the number of sampling points of the microphone. For example, when the target playing speed is 2 times speed, the electronic device takes the ratio of the number of sampling points of the microphone to 2 as the processed number of sampling points, i.e., the processed number is half the number of sampling points of the microphone. When the target playing speed is 0.5 times speed, the electronic device takes the ratio of the number of sampling points of the microphone to 0.5 as the processed number, i.e., twice the number of sampling points of the microphone.
The electronic device may then calculate the post-processing timestamp of the processed audio frame by the above formula (2), encode the processed audio frame according to the post-processing timestamp of the processed audio frame, and further obtain the audio file. In some examples, the electronic device may send the processed timestamp of the processed audio frame and the processed audio frame to an audio encoder, which outputs the output encoding result to a encapsulator, which encapsulates the encoding result to obtain the audio file.
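The sampling-point scaling and the timestamp accumulation described above might look as follows (an illustrative Python sketch under the stated relations; function names are hypothetical):

```python
def processed_sample_count(mic_samples, target_speed):
    """At 2x speed half as many sampling points remain per frame;
    at 0.5x speed, twice as many (count / speed)."""
    return mic_samples / target_speed

def next_pts(prev_pts, samples, rate):
    """Advance the presentation timestamp by the frame's playing
    duration (sampling points / sampling rate), accumulating the
    per-frame intervals as formula (2) does."""
    return prev_pts + samples / rate

# e.g. a 1024-sample microphone frame at 2x speed keeps 512 samples,
# and at 44.1 kHz each kept frame advances the PTS by 512 / 44100 s
kept = processed_sample_count(1024, target_speed=2.0)
pts = next_pts(0.0, kept, rate=44100)
```

Because shorter frames advance the PTS by less, the encoder naturally packs the sped-up audio into less playback time.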
Based on the above description, the embodiment of the application provides an audio recording method. The method supports the user in setting the playing speed of the recorded audio file during the audio recording process. Therefore, once the electronic device finishes recording, the playing speed of the audio file already meets the user's requirements; no further clipping or other processing is needed, which simplifies the user operation and improves the processing efficiency.
The method adjusts the audio frames in a time-domain overlapping manner. Specifically, the method divides the audio frame into a plurality of audio segment units with equal duration, two adjacent audio segment units in the plurality of audio segment units have overlapping areas, and the duration of the overlapping areas is adjusted according to the target playing speed configured by the user, so as to adjust the duration of the audio frame. The method does not need to compress the audio frame, and therefore does not compress the frequency of the sound recorded in the original audio frame or change the tone of the sound in the original audio frame. Therefore, the method can achieve the effect of changing the speed without changing the tone, and further improve the user experience.
The audio recording method provided by the embodiment of the application can also be combined with video recording, so that the playing speed of the recorded video file is adjusted in real time while the electronic device records video. The above embodiments describe the processing of audio frames; the following describes the processing of video frames.
In some possible implementations, the electronic device may achieve a variable-speed effect on the video frames by dropping or interpolating frames based on the target playing speed configured by the user. In some examples, the electronic device may instead adjust the recording frame rate of the camera based on the user-configured target playing speed.
When the target playing speed configured by the user is greater than the preset playing speed, the electronic device may reduce the recording frame rate of the camera. For example, suppose the user configures 2× speed and the recording frame rate of the camera is 30 fps, which may be the preset frame rate of the recorded video file. The electronic device may adjust the recording frame rate to 15 fps, i.e., reduce the number of video frames reported by the camera per unit time (e.g., 1 second). The electronic device can then modify the timestamps of the 15 video frames acquired within 1 second and encode them according to the processed timestamps, so that those 15 frames are displayed within 0.5 seconds, achieving the effect of speeding up the picture.
In some possible implementations, when the target playing speed configured by the user is greater than the preset playing speed, the electronic device may leave the recording frame rate of the camera unchanged and instead discard, i.e. drop, some of the acquired video frames. For example, suppose the user configures 2× speed and the recording frame rate of the camera is 30 fps, which may be the preset frame rate of the recorded video file. At 30 fps the electronic device acquires 30 video frames per unit time, and it may discard frames at intervals. As shown in fig. 7, the video frames drawn with broken lines are the frames discarded by the electronic device. The electronic device may then modify the timestamps of the remaining 15 video frames and encode them according to the processed timestamps, so that the 15 frames are displayed within 0.5 seconds, achieving the effect of speeding up the picture.
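The frame-dropping variant can be sketched as follows (illustrative only; the integer-step policy is an assumption that works for whole-number speeds such as 2×):

```python
def speed_up_by_dropping(frames, timestamps, target_speed):
    """Discard frames at intervals and compress the timestamps of the kept
    frames by the target speed, so that e.g. 30 captured frames at 2x become
    15 frames whose timestamps span half the original duration."""
    step = int(target_speed)        # assumes a whole-number speed like 2x
    kept = frames[::step]
    new_ts = [t / target_speed for t in timestamps[::step]]
    return kept, new_ts
```

With 30 frames captured over 1 second at 2× speed, 15 frames remain and their timestamps fall within the first 0.5 seconds, matching the example above.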
When the target playing speed configured by the user is smaller than the preset playing speed, the electronic device may increase the recording frame rate of the camera. For example, suppose the user configures 0.5× speed and the recording frame rate of the camera is 30 fps, which may be the preset frame rate of the recorded video file. The electronic device may adjust the recording frame rate to 60 fps, i.e., increase the number of video frames reported by the camera per unit time (e.g., 1 second). The electronic device can then modify the timestamps of the 60 video frames acquired within 1 second and encode them according to the processed timestamps, so that the 60 frames are spread over 2 seconds of display, achieving the effect of slowing down the picture.
In some embodiments, the electronic device may obtain a post-processing timestamp of the video frame according to the target play speed. In some examples, the electronic device obtains a time interval between a processed timestamp of the video frame and a processed timestamp of a previous video frame of the video frame based on the target playback speed, the original timestamp of the video frame, and the original timestamp of the previous video frame of the video frame; and the electronic equipment obtains the processed time stamp of the video frame according to the time interval and the processed time stamp of the previous video frame of the video frame. Specifically, the electronic device may calculate the post-processing time stamp of the video frame by the following formula (3):
S_n = Σ_{i=1}^{n} (T_i − T_{i−1} − T_{Pi}) / speed … formula (3)

wherein S_n is the processed timestamp of the nth video frame; T_i is the original timestamp of the ith video frame; T_{i−1} is the original timestamp of the (i−1)th video frame, the (i−1)th video frame being the video frame preceding the ith video frame; T_{Pi} is the pause time corresponding to the ith video frame, that is, the time elapsed from when the camera pauses video recording to when recording starts again; and speed is the target playing speed configured by the user, for example 0.5 in the case of 0.5× speed and 2 in the case of 2× speed.
Taking the calculation of S_3 as an example: S_3 = S_1 + (T_2 − T_1 − T_{P2}) / speed + (T_3 − T_2 − T_{P3}) / speed, where (T_2 − T_1) is the time interval between the 2nd video frame and the 1st video frame, and (T_3 − T_2) is the time interval between the 3rd video frame and the 2nd video frame.
Then, the electronic device can calculate the processed timestamp of the video frame according to formula (3), encode the video frame according to that timestamp to obtain the encoding result of the video frame, and send the encoding result of the video frame together with the encoding result of the processed audio frame to the encapsulator for encapsulation, thereby obtaining the video file.
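Under the reading of formula (3) given above (with the recording start time taken as T_0 = 0; this is a reconstruction, not verified against the original formula image), the processed video timestamps could be computed as:

```python
def video_timestamps(original_ts, pause_durations, speed):
    """S_n = sum over i of (T_i - T_{i-1} - T_Pi) / speed, where T_Pi is the
    pause time charged to frame i and T_0 is the recording start (0.0)."""
    out, prev, s = [], 0.0, 0.0
    for t, pause in zip(original_ts, pause_durations):
        s += (t - prev - pause) / speed
        out.append(s)
        prev = t
    return out
```

For example, frames originally stamped 1 s, 2 s, 3 s with no pauses map to 0.5 s, 1.0 s, 1.5 s at 2× speed, and a 1-second pause between two frames is simply subtracted out of the interval.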
Based on the above description, the audio recording method provided by the embodiment of the application can also be applied to video recording scenarios. The user can thus adjust the playing speed of the recorded video file in real time during recording, so the playing effect of the video file is adjusted without any secondary editing of the file, which simplifies user operation, saves the user's time, and satisfies the user's creative needs.
The embodiment of the application also provides an electronic device, as shown in fig. 8, which may include: a microphone 811, one or more processors 820, memory 830, one or more computer programs 840, which may be connected by one or more communication buses 850. Wherein the one or more computer programs 840 are stored in the memory 830 and configured to be executed by the one or more processors 820, the one or more computer programs 840 include instructions that can be used to perform the various steps performed by the handset as in the corresponding embodiment of fig. 3. The microphone 811 may be a built-in microphone of the electronic device or a microphone connected to the electronic device. In other examples, the electronic device may also include a camera 812. The microphone 811 and the camera 812 are optional, and the electronic device may also receive audio frames or video frames transmitted by other devices, and so on.
The embodiment of the application can divide the functional modules of the electronic device according to the above method examples; for example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated modules may be implemented in hardware or as software functional modules. It should be noted that the division of modules in the embodiment of the present application is schematic and merely a logical function division; other division manners are possible in actual implementation.
Fig. 9 shows a possible schematic diagram of an electronic device as referred to above and in the embodiments, which performs the steps of any of the method embodiments of the application, in case of dividing the respective functional modules with corresponding respective functions. As shown in fig. 9, the electronic device includes: an acquisition unit 901 for acquiring an audio frame; a splitting unit 902, configured to split the audio frame into a plurality of audio segment units with equal duration; overlapping areas exist in two adjacent audio segment units in the plurality of audio segment units; an adjusting unit 903, configured to adjust a duration of the overlapping area according to a target playing speed configured by a user; a synthesizing unit 904, configured to obtain a processed audio frame according to the adjusted overlapping area and the plurality of audio segment units; an encoding unit 905, configured to obtain a processed timestamp of the processed audio frame according to the target playing speed; and encoding the processed audio frame according to the processed timestamp of the processed audio frame to obtain an audio file.
In some possible implementations, the adjusting unit 903 is specifically configured to lengthen the duration of the overlapping area when the target playing speed configured by the user is greater than the preset playing speed, and to shorten the duration of the overlapping area when the target playing speed configured by the user is smaller than the preset playing speed.
In some possible implementations, the adjusting unit 903 is specifically configured to adjust the duration of the overlapping area to be the processed duration according to the target playing speed configured by the user, the duration of the overlapping area, the number of the plurality of audio segment units, and the duration of any one of the plurality of audio segment units.
In some possible implementations, the encoding unit 905 is specifically configured to obtain the processed sampling points according to the target playing speed and the preset sampling points of the microphone; acquiring a time interval between the processed audio frame and a previous audio frame of the processed audio frame according to the number of the processed sampling points, the number of channels, the sampling rate and the byte number of the sampling points; and adding the time interval and the processed time stamp of the previous audio frame of the processed audio frame to obtain the processed time stamp of the processed audio frame.
In some possible implementations, the obtaining unit 901 is specifically configured to obtain a recording request, where the recording request includes a target playing speed configured by a user; and acquiring the audio frame according to the recording request.
In some possible implementations, the adjusting unit 903 is specifically configured to obtain a pacing request, where the pacing request includes a target playing speed configured by a user; and according to the speed regulation request, adjusting the duration of the overlapped area by utilizing the target playing speed configured by the user.
It should be noted that all relevant contents of the steps in the above method embodiments may be cited for the corresponding functional modules of the electronic device, so that the electronic device executes the corresponding method; details are not repeated here.
The electronic device provided by the embodiment of the application can be used for executing the method of any embodiment, so that the same effect as that of the method of the embodiment can be achieved.
The present embodiment also provides a computer-readable storage medium including instructions that, when executed on an electronic device, cause the electronic device to perform the relevant method steps of fig. 3 to implement the method of the above embodiment.
The present embodiment also provides a computer program product comprising instructions which, when run on an electronic device, cause the electronic device to perform the relevant method steps as in fig. 3 to implement the method of the above embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed electronic device and method may be implemented in other manners. For example, the division of modules or units described above is merely a logical function division; in actual implementation there may be other division manners, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present embodiment may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to perform all or part of the steps of the methods described in the respective embodiments. The aforementioned storage medium includes: flash memory, removable hard disk, read-only memory, random access memory, magnetic disk, optical disk, and the like.
The foregoing is merely illustrative of specific embodiments of the present application, and the scope of the present application is not limited thereto, but any changes or substitutions within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (8)

1. An audio recording method, applied to an electronic device, comprising:
the electronic equipment acquires an audio frame;
the electronic equipment divides the audio frame into a plurality of audio segment units with equal duration; overlapping areas exist in two adjacent audio segment units in the plurality of audio segment units;
in the audio recording process, the electronic equipment presents a speed regulation control on a recording preview interface;
the electronic equipment acquires, during the audio recording process, a speed regulation request triggered by the user through the speed regulation control in the recording preview interface, wherein the speed regulation request comprises a target playing speed configured by the user;
the electronic equipment adjusts the duration of the overlapped area according to the target playing speed;
the electronic equipment obtains a processed audio frame according to the adjusted overlapping area and the plurality of audio segment units;
The electronic device obtains the processed sampling points of the processed audio frame according to the target playing speed and the preset sampling points of the microphone, and obtains the processed timestamp of the processed audio frame through the following formula:
S_n = Σ_{i=1}^{n} s_i / (C × R × B)

wherein S_n is the processed timestamp of the nth processed audio frame; s_i is the number of processed sampling points of the ith processed audio frame; C is the number of channels; R is the sampling rate; and B is the number of bytes per sampling point;
the electronic equipment sends the processed time stamp of the processed audio frame and the processed audio frame to an audio encoder to obtain an encoding result of the processed audio frame, and sends the encoding result of the processed audio frame to a packer to be packed to obtain an audio file;
the adjusted duration of the overlapping area is calculated by the following formula:
t_x = (n × T × (v − 1) + (n − 1) × t_0) / ((n − 1) × v)

wherein t_x is the duration of the adjusted overlapping area; v is the target playing speed configured by the user; t_0 is the duration of the overlapping area; n is the number of the plurality of audio segment units; and T is the duration of any one of the plurality of audio segment units.
2. The method of claim 1, wherein the electronic device adjusting the duration of the overlapping region according to a target play speed configured by a user during audio recording, comprises:
When the target playing speed configured by a user in the audio recording process is greater than a preset playing speed, the electronic equipment lengthens the duration of the overlapping area;
and when the target playing speed configured by the user in the audio recording process is smaller than the preset playing speed, the electronic equipment shortens the duration of the overlapped area.
3. The method of claim 1, wherein the electronic device adjusting the duration of the overlapping region according to a target play speed configured by a user during audio recording, comprises:
the electronic equipment adjusts the duration of the overlapping area to the processed duration according to the target playing speed configured by the user during the audio recording process, the duration of the overlapping area, the number of the plurality of audio segment units, and the duration of any one of the plurality of audio segment units.
4. A method according to any of claims 1-3, wherein the electronic device obtaining an audio frame comprises:
the electronic equipment acquires a recording request, wherein the recording request comprises a target playing speed configured by a user;
and the electronic equipment acquires the audio frame according to the recording request.
5. A method according to any one of claims 1-3, wherein the electronic device adjusting the duration of the overlapping area according to a target playing speed configured by a user during audio recording, comprises:
the electronic equipment acquires a speed regulation request, wherein the speed regulation request comprises a target playing speed configured by a user in the audio recording process;
and the electronic equipment adjusts the duration of the overlapped area by utilizing the target playing speed configured by the user according to the speed regulation request.
6. An electronic device, comprising:
an acquisition unit configured to acquire an audio frame;
a splitting unit, configured to split the audio frame into a plurality of audio segment units with equal duration; overlapping areas exist in two adjacent audio segment units in the plurality of audio segment units;
the adjusting unit is used for presenting a speed regulation control on a recording preview interface during audio recording; acquiring, during the audio recording process, a speed regulation request triggered by the user through the speed regulation control in the recording preview interface, the speed regulation request comprising a target playing speed configured by the user; and adjusting the duration of the overlapping area according to the target playing speed, wherein the adjusted duration of the overlapping area is calculated by the following formula:
t_x = (n × T × (v − 1) + (n − 1) × t_0) / ((n − 1) × v)

wherein t_x is the duration of the adjusted overlapping area; v is the target playing speed configured by the user; t_0 is the duration of the overlapping area; n is the number of the plurality of audio segment units; and T is the duration of any one of the plurality of audio segment units;
a synthesizing unit, configured to obtain a processed audio frame according to the adjusted overlapping area and the plurality of audio segment units;
the encoding unit is used for acquiring the processed sampling points of the processed audio frame according to the target playing speed and the preset sampling points of the microphone, and acquiring the processed timestamp of the processed audio frame through the following formula:
S_n = Σ_{i=1}^{n} s_i / (C × R × B)

wherein S_n is the processed timestamp of the nth processed audio frame; s_i is the number of processed sampling points of the ith processed audio frame; C is the number of channels; R is the sampling rate; and B is the number of bytes per sampling point;
the encoding unit is further configured to send the processed timestamp of the processed audio frame and the processed audio frame to an audio encoder to obtain an encoding result of the processed audio frame, and send the encoding result of the processed audio frame to a packager for packaging, so as to obtain an audio file.
7. An electronic device, comprising: a processor and a memory;
Wherein one or more computer programs are stored in the memory, the one or more computer programs comprising instructions; the instructions, when executed by the processor, cause the electronic device to perform the audio recording method of any one of claims 1-5.
8. A computer storage medium comprising computer instructions which, when run on an electronic device, cause the electronic device to perform the audio recording method of any of claims 1-5.
CN202110924280.5A 2021-08-12 2021-08-12 Audio recording method, electronic equipment, medium and program product Active CN113643728B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110924280.5A CN113643728B (en) 2021-08-12 2021-08-12 Audio recording method, electronic equipment, medium and program product


Publications (2)

Publication Number Publication Date
CN113643728A CN113643728A (en) 2021-11-12
CN113643728B true CN113643728B (en) 2023-08-22

Family

ID=78421130

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110924280.5A Active CN113643728B (en) 2021-08-12 2021-08-12 Audio recording method, electronic equipment, medium and program product

Country Status (1)

Country Link
CN (1) CN113643728B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114339443B (en) * 2021-11-17 2024-03-19 腾讯科技(深圳)有限公司 Audio and video double-speed playing method and device
CN116052701B (en) * 2022-07-07 2023-10-20 荣耀终端有限公司 Audio processing method and electronic equipment
CN116074566B (en) * 2023-01-13 2023-10-20 深圳市名动天下网络科技有限公司 Game video highlight recording method, device, equipment and storage medium
CN117560514A (en) * 2024-01-11 2024-02-13 北京庭宇科技有限公司 WebRTC-based method for reducing audio and video delay

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004030908A (en) * 1997-11-28 2004-01-29 Victor Co Of Japan Ltd Audio disk and decoding device of audio signal
JP2005266571A (en) * 2004-03-19 2005-09-29 Sony Corp Method and device for variable-speed reproduction, and program
CN1902697A (en) * 2003-11-11 2007-01-24 科斯莫坦股份有限公司 Time-scale modification method for digital audio signal and digital audio/video signal, and variable speed reproducing method of digital television signal by using the same method
CN101150506A (en) * 2007-08-24 2008-03-26 华为技术有限公司 Content acquisition method, device and content transmission system
CN103258552A (en) * 2012-02-20 2013-08-21 扬智科技股份有限公司 Method for adjusting play speed
CN104660948A (en) * 2015-03-09 2015-05-27 深圳市欧珀通信软件有限公司 Video recording method and device
CN106373590A (en) * 2016-08-29 2017-02-01 湖南理工学院 Sound speed-changing control system and method based on real-time speech time-scale modification
CN109360588A (en) * 2018-09-11 2019-02-19 广州荔支网络技术有限公司 A kind of mobile device-based audio-frequency processing method and device
CN110074759A (en) * 2019-04-23 2019-08-02 平安科技(深圳)有限公司 Voice data aided diagnosis method, device, computer equipment and storage medium
CN110459233A (en) * 2019-03-19 2019-11-15 深圳壹秘科技有限公司 Processing method, device and the computer readable storage medium of voice
CN110572722A (en) * 2019-09-26 2019-12-13 腾讯科技(深圳)有限公司 Video clipping method, device, equipment and readable storage medium
CN111176748A (en) * 2015-05-08 2020-05-19 华为技术有限公司 Configuration method of setting information, terminal and server
CN111402900A (en) * 2018-12-29 2020-07-10 华为技术有限公司 Voice interaction method, device and system
WO2020172826A1 (en) * 2019-02-27 2020-09-03 华为技术有限公司 Video processing method and mobile device
WO2020192461A1 (en) * 2019-03-25 2020-10-01 华为技术有限公司 Recording method for time-lapse photography, and electronic device
WO2021052414A1 (en) * 2019-09-19 2021-03-25 华为技术有限公司 Slow-motion video filming method and electronic device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101334366B1 (en) * 2006-12-28 2013-11-29 삼성전자주식회사 Method and apparatus for varying audio playback speed


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Research on an Audio Duration Transformation Algorithm with an Improved Phase Vocoder"; Wang Shinong; Computer Engineering and Applications; 2012-12-21; full text *

Also Published As

Publication number Publication date
CN113643728A (en) 2021-11-12

Similar Documents

Publication Publication Date Title
CN113643728B (en) Audio recording method, electronic equipment, medium and program product
CN110109636B (en) Screen projection method, electronic device and system
KR101874895B1 (en) Method for providing augmented reality and terminal supporting the same
CN113726950B (en) Image processing method and electronic equipment
CN113572954A (en) Video recording method, electronic device, medium, and program product
WO2022262313A1 (en) Picture-in-picture-based image processing method, device, storage medium, and program product
WO2022095788A1 (en) Panning photography method for target user, electronic device, and storage medium
CN114710640A (en) Video call method, device and terminal based on virtual image
CN113935898A (en) Image processing method, system, electronic device and computer readable storage medium
CN114579076A (en) Data processing method and related device
CN110989961A (en) Sound processing method and device
CN115756268A (en) Cross-device interaction method and device, screen projection system and terminal
US11870941B2 (en) Audio processing method and electronic device
CN115048012A (en) Data processing method and related device
CN113593567B (en) Method for converting video and sound into text and related equipment
CN114185503A (en) Multi-screen interaction system, method, device and medium
CN113810589A (en) Electronic device, video shooting method and medium thereof
CN113971969B (en) Recording method, device, terminal, medium and product
EP4138381A1 (en) Method and device for video playback
CN112269554B (en) Display system and display method
CN114915834A (en) Screen projection method and electronic equipment
CN112507161A (en) Music playing method and device
CN114339429A (en) Audio and video playing control method, electronic equipment and storage medium
CN115730091A (en) Comment display method and device, terminal device and readable storage medium
CN115883893A (en) Cross-device flow control method and device for large-screen service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant