WO2022262416A1 - Audio processing method and electronic device - Google Patents
Audio processing method and electronic device
- Publication number
- WO2022262416A1 (PCT/CN2022/088335)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sound
- picture
- electronic device
- sub
- camera
- Prior art date
Classifications
- H04N21/4223—Cameras
- H04N21/43072—Synchronising the rendering of multiple content streams or additional data on the same device
- H04S7/301—Automatic calibration of stereophonic sound system, e.g. with test microphone
- G06F3/165—Management of the audio stream, e.g. setting of volume, audio stream path
- H04N21/41407—Specialised client platforms embedded in a portable device, e.g. video client on a mobile phone, PDA, laptop
- H04N21/42203—Input-only peripherals: sound input device, e.g. microphone
- H04N21/4312—Generation of visual interfaces involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
- H04N21/4398—Processing of audio elementary streams involving reformatting operations of audio signals
- H04N21/4402—Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/44218—Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
- H04N23/631—Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters
- H04N23/667—Camera operation mode switching, e.g. between still and video, sport and normal or high- and low-resolution modes
- H04N23/90—Arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums
Definitions
- the present application relates to the technical field of terminals, and in particular to an audio processing method and an electronic device.
- the purpose of the present application is to provide an audio processing method, a graphical user interface (GUI), and an electronic device.
- the electronic device can render the audio of a video according to the picture information in the video, so that the expressiveness of the audio and the picture remain synchronized.
- when the picture of the video changes, the audio of the video is adjusted accordingly. In this way, the picture and sound of the video can together bring the user a synchronized three-dimensional experience and a better user experience.
- an audio processing method is provided, comprising: displaying a first interface, the first interface including a first control; detecting a first operation on the first control; in response to the first operation, starting shooting at a first time T1 and displaying a second interface, the second interface including a first display area and a second display area; after a first duration t1, at a second time T2, the electronic device displaying, in the first display area, the first picture collected by the first camera in real time, and, in the second display area, the second picture collected by the second camera in real time; at the second time T2, the microphone collecting a first sound, the first sound being the sound of the real-time environment in which the electronic device is located at that time; detecting a second operation on the third control; in response to the second operation, stopping shooting and saving a first video, the first video including the first picture and the second picture; displaying a third interface, the third interface including a third control; detecting a third operation on the third control, and playing the first video; at the first duration of the first video, playing the first picture and the second picture.
- the electronic device displays the first display area and the second display area as split screens, one above the other; the area of the first display area is the first area, and the area of the second display area is the second area; the picture weight of the first picture is the ratio of the first area to the total area, and the picture weight of the second picture is the ratio of the second area to the total area, the total area being the sum of the first area and the second area.
- alternatively, the first display area is displayed on the second display area in the form of a floating window; the area of the first display area is the first area, and the area of the display screen of the electronic device is the third area; the picture weight of the first picture is the ratio of the first area to the third area, and the picture weight of the second picture is the difference between 1 and the picture weight of the first picture.
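As a minimal sketch of the two weight definitions above (function and variable names are illustrative, not from the patent), the picture weights follow directly from the stated area ratios:

```python
def split_screen_weights(area1: float, area2: float) -> tuple[float, float]:
    """Upper/lower split screen: each picture's weight is its area's share of the total."""
    total = area1 + area2
    return area1 / total, area2 / total


def floating_window_weights(window_area: float, screen_area: float) -> tuple[float, float]:
    """Picture-in-picture: the floating window's weight is its share of the full
    screen; the background picture takes the remainder (1 - w1)."""
    w1 = window_area / screen_area
    return w1, 1.0 - w1


# Example: a floating window covering a quarter of the screen.
w1, w2 = floating_window_weights(window_area=0.25, screen_area=1.0)  # (0.25, 0.75)
```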
- the first sound includes a first sub-sound and a second sub-sound; the first sub-sound is the sound of the first picture, and the second sub-sound is the sound of the second picture.
- the processing of the first sound according to the picture weights of the first picture and the second picture includes: mixing the first sub-sound and the second sub-sound according to the picture weights of the first picture and the second picture; when the picture weight of the first picture is greater than that of the second picture, a first mixing ratio is used so that the loudness of the first sub-sound is greater than the loudness of the second sub-sound; when the picture weight of the first picture is smaller than that of the second picture, a second mixing ratio is used so that the loudness of the first sub-sound is smaller than the loudness of the second sub-sound; and when the picture weight of the first picture is equal to that of the second picture, a third mixing ratio is adopted so that the loudness of the first sub-sound is equal to the loudness of the second sub-sound.
- in this way, the mixing ratio of the two sounds contained in the audio can be adjusted to match the relative sizes of the two display areas, giving the user an auditory experience in which the sound from the larger display area is louder and the sound from the smaller display area is quieter.
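A minimal sketch of such weight-driven mixing, assuming the straightforward rule of scaling each sub-sound by its picture weight (the text above does not give an exact formula):

```python
import numpy as np


def mix_by_picture_weight(sub1: np.ndarray, sub2: np.ndarray,
                          w1: float, w2: float) -> np.ndarray:
    """Scale each sub-sound by its picture weight before summing, so the sound
    of the larger display area is louder in the mix; w1 > w2 yields a louder sub1."""
    mixed = w1 * sub1 + w2 * sub2
    peak = np.max(np.abs(mixed))
    return mixed / peak if peak > 1.0 else mixed  # normalize only if clipping
```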
- in one implementation, before the detection of the second operation on the third control, the method further includes: the electronic device saving the first sound.
- the processing of the first sound according to the picture weights of the first picture and the second picture includes: the electronic device processing the first sound according to the picture weights of the first picture and the second picture to obtain the second sound.
- the electronic device saves the second sound and deletes the first sound.
- in this way, the unprocessed audio is saved first and processed afterwards, which reduces processor load during recording and improves the fluency of the recording process.
- in one implementation, the first sound includes a first sub-sound and a second sub-sound; the first sub-sound is the sound of the first picture, and the second sub-sound is the sound of the second picture.
- the processing of the first sound according to the picture weights of the first picture and the second picture includes: the electronic device filtering the first sound according to the first viewing angle of the first camera and the second viewing angle of the second camera, respectively, to obtain the first sub-sound and the second sub-sound; and mixing the first sub-sound and the second sub-sound to obtain the second sound.
- when playing a video file recorded in dual-view recording mode, the electronic device displays two pictures on the display screen, the two pictures coming from two different cameras used during recording.
- the sizes of the two pictures, and the directions and ranges of their viewing angles, may be different.
- for example, the first picture presents the face image of the user captured by the front camera of the electronic device, and the second picture presents a landscape image captured by the rear camera.
- the user can change the face image and the landscape image presented in the first picture and the second picture by adjusting the focal length multiples of the front camera and the rear camera.
- Sound not only varies in magnitude but also has directionality, and humans can perceive this directionality. Therefore, in this embodiment of the application, in order to match the sizes of the two pictures and the viewing-angle ranges they present to the user, the electronic device combines the viewing-angle ranges of the two pictures as follows:
- the audio collected in the viewing direction of the first picture is enhanced to obtain the sound corresponding to the first picture (the first sub-sound), and the audio collected in the viewing direction of the second picture is enhanced to obtain the sound corresponding to the second picture (the second sub-sound); then, according to the areas of the first display area and the second display area, the fusion ratio of the two sounds is adjusted, and after the first sub-sound and the second sub-sound are mixed, the final output audio (that is, the second sound) is obtained.
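One common way to "enhance the audio collected in the viewing direction" of a picture is beamforming over the microphone array; the sketch below uses a simple two-microphone delay-and-sum beamformer (an assumed technique for illustration, not necessarily the patent's filter):

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s


def delay_and_sum(mics: np.ndarray, mic_spacing: float,
                  steer_angle_deg: float, fs: int) -> np.ndarray:
    """Steer a two-mic array (mics: shape (2, n)) toward steer_angle_deg
    (0 = broadside) by delaying the second mic into alignment, then averaging."""
    tau = mic_spacing * np.sin(np.deg2rad(steer_angle_deg)) / SPEED_OF_SOUND
    n = mics.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Apply the fractional delay in the frequency domain.
    aligned = np.fft.irfft(np.fft.rfft(mics[1]) * np.exp(2j * np.pi * freqs * tau), n)
    return 0.5 * (mics[0] + aligned)
```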
- in one implementation, the first sound includes a first sub-sound and a second sub-sound; the first sub-sound is the sound of the first picture, and the second sub-sound is the sound of the second picture.
- the processing of the first sound according to the picture weights of the first picture and the second picture includes: the electronic device filtering the first sound according to the first viewing angle of the first camera and the second viewing angle of the second camera, respectively, to obtain the first sub-sound and the second sub-sound; the electronic device obtaining first orientation information of the first display area relative to the second display area; the electronic device virtualizing the orientation of the first sub-sound according to the first orientation information to obtain a first left-orientation sound and a first right-orientation sound; and, after the electronic device adjusts the loudness of the first left-orientation sound, the first right-orientation sound, and the second sub-sound according to the picture weights of the first picture and the second picture, mixing the first sub-sound and the second sub-sound to obtain the second sound.
- the scene shown in the picture of the first display area is directly in front of or directly behind the electronic device.
- the first display area, however, is contained within the second display area and its position is adjustable; therefore, the first display area may visually sit to the left or to the right within the second display area.
- the electronic device virtualizes the orientation of the first sub-sound based on the first orientation information, so that the direction from which the user perceives the first sub-sound matches the orientation of the first display area.
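A minimal sketch of such orientation virtualization using an HRIR pair (for example from the CIPIC database described later in this text); the function name and data layout are assumptions for illustration:

```python
import numpy as np
from scipy.signal import fftconvolve


def virtualize(mono: np.ndarray, hrir_left: np.ndarray,
               hrir_right: np.ndarray) -> np.ndarray:
    """Convolve a mono sub-sound with the left/right HRIRs of the target
    direction, so the listener perceives it as coming from that direction."""
    left = fftconvolve(mono, hrir_left, mode="full")[: len(mono)]
    right = fftconvolve(mono, hrir_right, mode="full")[: len(mono)]
    return np.stack([left, right])  # stereo, shape (2, n_samples)
```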
- in one implementation, the first sound includes a first sub-sound and a second sub-sound; the first sub-sound is the sound of the first picture, and the second sub-sound is the sound of the second picture.
- the processing of the first sound according to the picture weights of the first picture and the second picture includes: the electronic device filtering the first sound according to the first viewing angle of the first camera and the second viewing angle of the second camera, respectively, to obtain the first sub-sound and the second sub-sound.
- the first sub-sound includes a first left channel sound and a first right channel sound; the first left channel sound is obtained by the electronic device filtering the first sound according to the left half-angle of the first viewing angle, and the first right channel sound is obtained by the electronic device filtering the first sound according to the right half-angle of the first viewing angle.
- the second sub-sound includes a second left channel sound and a second right channel sound; the second left channel sound is obtained by the electronic device filtering the first sound according to the left half-angle of the second viewing angle, and likewise the second right channel sound according to the right half-angle of the second viewing angle.
- after the electronic device adjusts, according to the picture weights of the first picture and the second picture, the loudness of the first left channel sound, the first right channel sound, the second left channel sound, and the second right channel sound, the first sub-sound and the second sub-sound are mixed to obtain the second sound.
- in this way, while enhancing the initial audio in combination with the field of view of each picture, the electronic device can split the field of view into left and right directions and obtain, for each picture, two sounds distinguished as left channel and right channel, which makes the final output audio more stereoscopic.
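Reusing the two-microphone delay-and-sum idea from the earlier sketch, one picture's left and right channel sounds could be obtained by steering one beam at the centre of each half of that picture's field of view (the geometry and helper names are assumptions for illustration):

```python
import numpy as np

C = 343.0  # speed of sound, m/s


def steer(mics: np.ndarray, spacing: float, angle_deg: float, fs: int) -> np.ndarray:
    """Two-mic delay-and-sum beam toward angle_deg, as in the earlier sketch."""
    tau = spacing * np.sin(np.deg2rad(angle_deg)) / C
    n = mics.shape[1]
    f = np.fft.rfftfreq(n, d=1.0 / fs)
    aligned = np.fft.irfft(np.fft.rfft(mics[1]) * np.exp(2j * np.pi * f * tau), n)
    return 0.5 * (mics[0] + aligned)


def stereo_sub_sound(mics: np.ndarray, spacing: float, fs: int,
                     fov_center_deg: float, fov_deg: float, weight: float) -> np.ndarray:
    """Left channel from the centre of the left half-angle, right channel from
    the centre of the right half-angle; loudness scaled by the picture weight."""
    left = steer(mics, spacing, fov_center_deg - fov_deg / 4, fs)
    right = steer(mics, spacing, fov_center_deg + fov_deg / 4, fs)
    return weight * np.stack([left, right])
```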
- in one implementation, the first sound includes a first sub-sound and a second sub-sound; the first sub-sound is the sound of the first picture, and the second sub-sound is the sound of the second picture.
- the method further includes: at a third time T3, in response to an operation of switching the camera, the electronic device switching the picture displayed in the first display area from the picture taken by the first camera to the picture taken by a third camera; at a fourth time T4, after the third time T3, the electronic device displaying in the first display area the third picture taken by the third camera; the electronic device filtering the first sound according to the viewing angles before and after the switch, respectively, to obtain a historical sound and a target sound; and, within the time between the third time T3 and the fourth time T4, the electronic device dynamically adjusting the mixing ratio of the historical sound and the target sound according to the third time T3 and the fourth time T4, and mixing the historical sound and the target sound based on that ratio to obtain the first sub-sound.
- when the electronic device switches the camera corresponding to the first display area, the viewing angle of the picture in the first display area changes accordingly, and the audio signal obtained by filtering the audio based on that picture changes as well.
- since the electronic device often needs a certain amount of processing time to switch lenses, whereas the corresponding adjustment of the audio can be completed in a very short time, the look of the picture and the sound of the audio may become unbalanced during the switch.
- therefore, during the camera switch, the electronic device dynamically adjusts the proportions of the sounds derived from the pictures before and after the switch, so that the change in sound direction happens more gradually and the audio switches smoothly.
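A minimal sketch of such a smooth switch, assuming a linear crossfade between the historical and target sounds over the interval [T3, T4] (the ramp shape is not specified in the text):

```python
import numpy as np


def crossfade(historical: np.ndarray, target: np.ndarray,
              t3: float, t4: float, fs: int) -> np.ndarray:
    """Ramp the mixing ratio from the old camera's sound to the new camera's
    sound between T3 and T4 (seconds on the signal's timeline), so the
    perceived sound direction changes gradually."""
    t = np.arange(len(historical)) / fs
    alpha = np.clip((t - t3) / (t4 - t3), 0.0, 1.0)  # 0 at T3 -> 1 at T4
    return (1.0 - alpha) * historical + alpha * target
```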
- an embodiment of the present application provides an electronic device, the electronic device including: one or more processors and a memory; the memory is coupled with the one or more processors and is used to store computer program code, the computer program code including computer instructions; the one or more processors invoke the computer instructions to cause the electronic device to execute the method in the first aspect or any possible implementation manner of the first aspect.
- a chip system is provided; the chip system is applied to an electronic device and includes one or more processors, the processor being used to invoke computer instructions so that the electronic device performs the method in any possible implementation manner of the first aspect or of the second aspect.
- a computer program product containing instructions is provided; when the computer program product runs on the electronic device, the electronic device is made to execute the method in any possible implementation manner of the first aspect or of the second aspect.
- a computer-readable storage medium including instructions is provided; when the instructions run on the electronic device, the electronic device is made to execute the method in any possible implementation manner of the first aspect or of the second aspect.
- FIG. 1A is a schematic diagram of the correspondence between a camera and its angle of view provided by an embodiment of the present application;
- FIG. 1B is a schematic diagram of the field of view ranges of the front/rear cameras of an electronic device provided by an embodiment of the present application;
- FIG. 2A is a diagram of a recording interface in dual-view recording provided by an embodiment of the present application;
- FIG. 2B is a schematic diagram of an electronic device playing audio recorded in dual-view recording mode provided by an embodiment of the present application;
- FIG. 3 is a schematic structural diagram of an electronic device 100 provided by an embodiment of the present application;
- FIG. 4 is a schematic diagram of some human-computer interaction interfaces provided by an embodiment of the present application;
- FIG. 5 is a schematic diagram of some human-computer interaction interfaces provided by an embodiment of the present application;
- FIG. 6 is a schematic diagram of some human-computer interaction interfaces provided by an embodiment of the present application;
- FIG. 7A is a schematic diagram of a recording interface provided by an embodiment of the present application;
- FIG. 7B is a schematic diagram of another recording interface provided by an embodiment of the present application;
- FIG. 7C is a schematic diagram of another recording interface provided by an embodiment of the present application;
- FIG. 8A is a schematic diagram of a scenario in which an electronic device filters an audio signal in combination with the field of view of the picture, provided by an embodiment of the present application;
- FIG. 8B is a schematic diagram of another scenario in which an electronic device filters an audio signal in combination with the field of view of the picture, provided by an embodiment of the present application;
- FIG. 9 is a flowchart of an audio processing method provided by an embodiment of the present application;
- FIG. 10 is a flowchart of another audio processing method provided by an embodiment of the present application;
- FIG. 11 is a flowchart of another audio processing method provided by an embodiment of the present application;
- FIG. 12 is a flowchart of another audio processing method provided by an embodiment of the present application;
- FIG. 13 is a flowchart of a method for smoothly switching audio provided by an embodiment of the present application.
- the dual-view recording mode means that multiple cameras in an electronic device, such as a front camera and a rear camera, can simultaneously record two channels of video.
- the display screen can display two images from the two cameras simultaneously on the same interface during video preview or video recording or during playback of the recorded video. These two images can be spliced and displayed on the same interface, or displayed in a picture-in-picture manner.
- Dual-view recording includes, but is not limited to, the following commonly used recording modes:
- the display screen of the device is divided into upper and lower display interfaces, and the upper and lower display interfaces do not overlap.
- the areas of the upper display interface and the lower display interface can be the same or different.
- the display screen of the device is divided into two display interfaces, one large and one small, and the smaller display interface is included in the larger display interface.
- the larger display area generally covers the screen of the device, and the image in the smaller display area may cover the image in the larger display area.
- the smaller display area also supports zooming and its position on the device screen can be changed.
- the two channels of images captured by the two cameras can be saved as multiple videos in the gallery (also called an album), or as a composite video spliced from these multiple videos.
- the "dual-view recording mode" is merely a name used in the embodiments of the present application; its meaning has been recorded in the embodiments, and the name does not constitute any limitation on the embodiments.
- the focal length is the distance from the center point of the lens to the clear image formed on the focal plane, and it is a measure of the concentration or divergence of light in an optical system.
- the size of the focal length determines the size of the viewing angle. The smaller the focal length, the larger the field of view, and the larger the observed range; the larger the focal length, the smaller the field of view, and the smaller the observation range. According to whether the focal length can be adjusted, it can be divided into two categories: fixed focus lens and zoom lens. When shooting the same subject at the same distance, the image formed by the lens with a long focal length is large, and the image formed by a lens with a short focal length is small.
- with the lens of the optical instrument as the vertex, the angle formed by the two edges of the maximum range within which the object image of the measured target can pass through the lens is called the field of view angle.
- the size of the field of view angle determines the field of vision of the optical instrument: the larger the field of view angle, the larger the field of vision and the smaller the optical magnification. In layman's terms, a target object outside this angle will not be captured by the lens.
- the focal length is inversely proportional to the field of view, that is, the larger the focal length, the smaller the field of view, and vice versa.
- as shown in FIG. 1A, the camera can adjust its focal length during shooting, providing six gears: 1× (not shown in the figure), 2×, 3×, 4×, 5×, and 6×.
- it is not difficult to understand that at the 1× focal length the field of view the camera can capture is the largest, namely the 180° directly in front of the camera; if the focal length is adjusted to 2×, as shown in FIG. 1A, the field of view narrows to 84°; and if the focal length is adjusted to 6×, the field of view is only the 30° directly ahead.
- FIG. 1B is a schematic diagram of the field of view ranges of the front/rear cameras of an electronic device provided by an embodiment of the present application. For the convenience of readers, FIG. 1B presents a top view of the electronic device placed upright, with the electronic device regarded as point P. OPO' is then the plane in which the electronic device lies: the left side of OPO' indicates the side of the front camera, and the right side of OPO' indicates the side of the rear camera.
- the shaded part on the left side of OPO' represents the maximum field of view obtainable by the front camera of the electronic device, and the blank part on the right side of OPO' represents the maximum field of view obtainable by the rear camera.
- the schematic view of the field of view shown in FIG. 1B is only for convenience of description, and it may be expressed in other forms in an actual scene.
- in the embodiment of the present application, the field of view angle on the left side of OPO' is defined as positive and that on the right side as negative, so that the 0° to 360° of viewing angles in space are divided into two ranges: 0° to +180° and -180° to 0°. That is to say, in the following embodiments of the present application, the field angles of the front camera of the electronic device are all positive values, and the field angles of the rear camera are all negative values.
- the line connecting each point and point P represents the boundary of the field of view of the camera at a certain focal length.
- for example, the rear camera of the electronic device has a field of view angle of 90° at the 3× focal length, with ray BP and ray B'P as the boundaries of the field of view at that focal length.
- the front camera of the electronic device has a field of view angle of 30° at the 6× focal length, with ray FP and ray F'P as the boundaries of the field of view at that focal length.
- when the electronic device leaves the factory, the correspondence between the focal lengths provided by its cameras and the corresponding field angles is fixed. That is to say, after the user selects a focal length for a camera, the electronic device can obtain the angle value of that camera's field of view at that focal length, and the angle value reflects both the size and the direction of the field of view.
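Conceptually, this factory-set correspondence is just a lookup table. The sketch below uses the gear-to-angle values quoted with FIG. 1A/1B and the sign convention of FIG. 1B (front positive, rear negative) as illustrative entries only; a real device would store its own calibrated table:

```python
# Illustrative factory table: (camera, zoom gear) -> field of view in degrees.
FOV_TABLE_DEG = {
    ("front", 1): +180, ("front", 2): +84, ("front", 6): +30,
    ("rear", 1): -180, ("rear", 3): -90,
}


def field_of_view(camera: str, zoom_gear: int) -> int:
    """Return the fixed field of view for a camera at a given zoom gear."""
    return FOV_TABLE_DEG[(camera, zoom_gear)]
```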
- the speech signal collected by a microphone contains the target speech signal as well as other interfering signals.
- when a speaker speaks into a microphone in daily life, the speech of other speakers often accompanies the target speaker's speech, and such interfering signals seriously affect the recognition of the target speaker's voice. It is therefore necessary to track the target voice and suppress or eliminate the interfering voices through sound source separation.
- CVX is a toolbox for MATLAB.
- in CVX beam training, MATLAB can be used to select different array forms and to perform beamforming with convex optimization methods.
- in one interface form, the ratio of the area of a single viewfinder frame to the area of the display screen of the electronic device can be calculated as the picture weight of that viewfinder frame.
- when the area formed by splicing the two viewfinder frames covers the display area of the display screen, the sum of the picture weights of the two viewfinder frames is 1.
- when the spliced area does not cover the whole display screen, the sum of the picture weights of the two viewfinder frames may be less than 1.
- the embodiment of the present application may also use other methods to calculate the picture weights of the two viewfinder frames, as long as the two calculated picture weights can represent the size relationship between the areas of the two viewfinder frames.
- for example, the ratio of the area of a single viewfinder frame to the sum of the areas of the two viewfinder frames can be used as the picture weight of that viewfinder frame; this ensures that the picture weights of the two viewfinder frames sum to 1 and is also easier to calculate.
- the picture weights of the two viewfinder frames may also be calculated as follows:
- w1 represents the picture weight of the viewfinder frame with the smaller area among the two viewfinder frames, and S1 is the area of that viewfinder frame;
- w2 represents the picture weight of the viewfinder frame with the larger area among the two viewfinder frames, and S2 is the area of that viewfinder frame;
- α is a correction coefficient, a fixed value set when the electronic device leaves the factory, with a value range of [1, (S1+S2)/S1]. In this way, the picture weight of the viewfinder frame with the smaller area is prevented from becoming too small when the two viewfinder frames differ greatly in area.
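The formula itself is not reproduced in this text, so the sketch below assumes the natural reading of the definitions above: α scales the smaller frame's share of the total area, and the larger frame takes the remainder. Treat it as an assumption, not the patent's exact expression.

```python
def corrected_weights(s1: float, s2: float, alpha: float) -> tuple[float, float]:
    """Assumed formula: w1 = alpha * S1 / (S1 + S2), w2 = 1 - w1, with alpha in
    [1, (S1+S2)/S1] so the smaller frame's weight never falls below its raw share."""
    assert 1.0 <= alpha <= (s1 + s2) / s1, "alpha outside its stated range"
    w1 = alpha * s1 / (s1 + s2)
    return w1, 1.0 - w1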
- HRTF is a sound localization processing technology; it can be regarded as the frequency response of a sound at a specific location as it travels to the left and right ears. Since sound is reflected from the pinna or shoulders into the ear, when two speakers are used to simulate sound positioning, specific calculation methods can be used to compute the magnitude and pitch of the sound arriving from different directions or positions, thereby creating the effect of stereo spatial sound positioning.
- HRTF is the frequency response of a sound transmitted from a specific location to the left and right ears; the corresponding time-domain response is called the HRIR.
- the CIPIC_HRIR data is a set of HRIR data provided by the CIPIC HRTF database of the University of California, Davis.
- Common mixing algorithms include direct summation and average adjustment weighting.
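A minimal sketch of the two algorithms named above, in their textbook form (the formulas are not spelled out in this text):

```python
import numpy as np


def mix_direct_sum(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Direct summation: add the signals sample by sample (may clip)."""
    return a + b


def mix_average_weighted(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Average-adjustment weighting: scale the sum back into range."""
    return 0.5 * (a + b)
```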
- FFT is a fast algorithm of discrete Fourier transform, which can transform a signal from the time domain to the frequency domain.
- IFFT is an inverse fast Fourier transform algorithm corresponding to FFT, which can transform a signal from the frequency domain to the time domain.
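For example, a filter can be applied in the frequency domain by combining the two transforms (a generic sketch, not a specific filter from this text):

```python
import numpy as np


def apply_frequency_filter(x: np.ndarray, gain: np.ndarray) -> np.ndarray:
    """FFT the signal, multiply each frequency bin by a gain, then IFFT back."""
    spectrum = np.fft.rfft(x)            # time domain -> frequency domain
    assert len(gain) == len(spectrum)
    return np.fft.irfft(spectrum * gain, n=len(x))  # back to time domain
```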
- Dual-view recording is a new video recording method.
- the device can record video with two lenses at the same time, presenting two different views, such as close-up and panorama, or front camera and rear camera; combining the different pictures creates a strong visual contrast.
- FIG. 2A exemplarily shows a shooting interface of an electronic device under dual-view recording.
- the electronic device 200 calls the front camera and the rear camera to take pictures at the same time.
- the shooting interface 20 is divided into two display areas, the display area 201 and the display area 202, by the dividing line 203, wherein:
- What is displayed in the display area 201 is the image captured by the front camera of the electronic device 200 , and the image is the face of the user who is being recorded.
- What is displayed in the display area 202 is the image captured by the rear camera of the electronic device 200 , and the image is the landscape in front of the user. In the dual-view recording mode, two images with obvious visual differences can be presented in one image at the same time.
- when shooting a video, the electronic device not only records the picture information of the scene through an image acquisition device such as a camera, but also records the audio information of the environment through an audio acquisition device such as a microphone.
- the audio information of a dual-view video recording can be recorded and output in the normal audio recording mode used for ordinary video recording.
- FIG. 2B is a schematic diagram of an electronic device playing audio recorded in a dual-view recording mode according to an embodiment of the present application. As shown in Figure 2B, the electronic device performs necessary rendering and filtering operations on the omnidirectional audio signal in the space collected by the microphone, such as changing timbre, denoising, etc., and finally plays it through the speaker.
- in this way, the electronic device merely renders and filters the audio, which may make the audio clearer.
- however, the recording and playback of a dual-view video involves two pictures, between which there are size and orientation relationships that can change during recording. Therefore, the single audio recording and playback method shown in FIG. 2B cannot match, for the user, the visual contrast that dual-view recording presents on the screen. As a result, the user cannot fully perceive through hearing the differences between the pictures in dual-view recording mode, and the user experience is poor.
- an embodiment of the present application provides an audio processing method and an electronic device.
- the dual-view recording mode of the electronic device includes multiple styles of split-screen recording.
- when the user changes the style of split-screen recording, zooms the two pictures, or switches the pictures of the two display areas, the electronic device can process the collected audio according to the picture information (for example, the focal lengths of the two display areas, and the relative positions and area sizes of the two display areas), so that the user's hearing and vision share a synchronized three-dimensional sense.
- the electronic device may be a mobile phone, a tablet computer, a wearable device, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), or a dedicated camera (such as a single-lens reflex camera or a compact camera); this application does not impose any restrictions on the specific type of the electronic device.
- FIG. 3 exemplarily shows the structure of the electronic device.
- the electronic device 100 may have multiple cameras 193 , such as a front camera, a wide-angle camera, an ultra-wide-angle camera, a telephoto camera, and the like.
- the electronic device 100 may also include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like.
- the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, bone conduction sensor 180M, etc.
- the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the electronic device 100 .
- the electronic device 100 may include more or fewer components than shown in the figure, or combine certain components, or separate certain components, or arrange different components.
- the illustrated components can be realized in hardware, software or a combination of software and hardware.
- the processor 110 may include one or more processing units; for example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. Different processing units may be independent devices or may be integrated in one or more processors.
- in the embodiment of the present application, the processor 110, such as the controller or the GPU, can be used in a dual-view recording scene to synthesize, by splicing or partial superimposition, the multiple frames of images simultaneously captured by the two cameras 193 into the preview image displayed in the viewfinder frame, so that the electronic device 100 can display the images captured by the two cameras 193 at the same time.
- in the embodiment of the present application, the processor 110, such as the controller or the GPU, can also be used to perform anti-shake processing on the images collected by each camera 193 in a dual-view shooting scene, and then synthesize the anti-shake-processed images corresponding to the multiple cameras 193.
- the controller may be the nerve center and command center of the electronic device 100 .
- the controller can generate an operation control signal according to the instruction opcode and timing signal, and complete the control of fetching and executing the instruction.
- a memory may also be provided in the processor 110 for storing instructions and data.
- the memory in processor 110 is a cache memory.
- the memory may hold instructions or data that the processor 110 has just used or recycled. If the processor 110 needs to use the instruction or data again, it can be directly called from the memory. Repeated access is avoided, and the waiting time of the processor 110 is reduced, thereby improving the efficiency of the system.
- processor 110 may include one or more interfaces.
- the interface may include an integrated circuit (inter-integrated circuit, I2C) interface, an integrated circuit built-in audio (inter-integrated circuitsound, I2S) interface, a pulse code modulation (pulse code modulation, PCM) interface, a universal asynchronous receiver transmitter (universal asynchronous receiver) /transmitter, UART) interface, mobile industry processor interface (mobile industry processor interface, MIPI), general-purpose input and output (general-purpose input/output, GPIO) interface, subscriber identity module (subscriber identity module, SIM) interface, and/or Universal serial bus (universal serial bus, USB) interface, etc.
- the I2C interface is a bidirectional synchronous serial bus, including a serial data line (SDA) and a serial clock line (SCL).
- processor 110 may include multiple sets of I2C buses.
- the processor 110 can be respectively coupled to the touch sensor 180K, the charger, the flashlight, the camera 193 and the like through different I2C bus interfaces.
- the processor 110 may be coupled to the touch sensor 180K through the I2C interface, so that the processor 110 and the touch sensor 180K communicate through the I2C bus interface to realize the touch function of the electronic device 100 .
- the I2S interface can be used for audio communication.
- processor 110 may include multiple sets of I2S buses.
- the processor 110 may be coupled to the audio module 170 through an I2S bus to implement communication between the processor 110 and the audio module 170 .
- the audio module 170 can transmit audio signals to the wireless communication module 160 through the I2S interface, so as to realize the function of answering calls through the Bluetooth headset.
- the PCM interface can also be used for audio communication, sampling, quantizing and encoding the analog signal.
- the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface.
- the audio module 170 can also transmit audio signals to the wireless communication module 160 through the PCM interface, so as to realize the function of answering calls through the Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication.
- the UART interface is a universal serial data bus used for asynchronous communication.
- the bus can be a bidirectional communication bus. It converts the data to be transmitted between serial communication and parallel communication.
- a UART interface is generally used to connect the processor 110 and the wireless communication module 160 .
- the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to realize the Bluetooth function.
- the audio module 170 can transmit audio signals to the wireless communication module 160 through the UART interface, so as to realize the function of playing music through the Bluetooth headset.
- the MIPI interface can be used to connect the processor 110 with peripheral devices such as the display screen 194 and the camera 193 .
- MIPI interface includes camera serial interface (camera serial interface, CSI), display serial interface (display serial interface, DSI), etc.
- the processor 110 communicates with the camera 193 through the CSI interface to realize the shooting function of the electronic device 100 .
- the processor 110 communicates with the display screen 194 through the DSI interface to realize the display function of the electronic device 100 .
- the GPIO interface can be configured by software.
- the GPIO interface can be configured as a control signal or as a data signal.
- the GPIO interface can be used to connect the processor 110 with the camera 193 , the display screen 194 , the wireless communication module 160 , the audio module 170 , the sensor module 180 and so on.
- the GPIO interface can also be configured as an I2C interface, I2S interface, UART interface, MIPI interface, etc.
- the USB interface 130 is an interface conforming to the USB standard specification, specifically, it can be a Mini USB interface, a Micro USB interface, a USB Type C interface, and the like.
- the USB interface 130 can be used to connect a charger to charge the electronic device 100 , and can also be used to transmit data between the electronic device 100 and peripheral devices. It can also be used to connect headphones and play audio through them. This interface can also be used to connect other electronic devices, such as AR devices.
- the interface connection relationship between the modules shown in the embodiment of the present application is only a schematic illustration, and does not constitute a structural limitation of the electronic device 100 .
- the electronic device 100 may also adopt different interface connection manners in the foregoing embodiments, or a combination of multiple interface connection manners.
- the charging management module 140 is configured to receive a charging input from a charger.
- the charger may be a wireless charger or a wired charger.
- the charging management module 140 can receive charging input from the wired charger through the USB interface 130 .
- the charging management module 140 may receive a wireless charging input through a wireless charging coil of the electronic device 100. While charging the battery 142, the charging management module 140 can also supply power to the electronic device through the power management module 141.
- the power management module 141 is used for connecting the battery 142 , the charging management module 140 and the processor 110 .
- the power management module 141 receives the input from the battery 142 and/or the charging management module 140 to provide power for the processor 110 , the internal memory 121 , the external memory, the display screen 194 , the camera 193 , and the wireless communication module 160 .
- the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance).
- the power management module 141 may also be disposed in the processor 110 .
- the power management module 141 and the charging management module 140 can also be set in the same device.
- the wireless communication function of the electronic device 100 can be realized by the antenna 1 , the antenna 2 , the mobile communication module 150 , the wireless communication module 160 , a modem processor, a baseband processor, and the like.
- Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
- Each antenna in electronic device 100 may be used to cover single or multiple communication frequency bands. Different antennas can also be multiplexed to improve the utilization of the antennas.
- Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
- the antenna may be used in conjunction with a tuning switch.
- the mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G applied on the electronic device 100 .
- the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (low noise amplifier, LNA) and the like.
- the mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and send them to the modem processor for demodulation.
- the mobile communication module 150 can also amplify the signals modulated by the modem processor, and convert them into electromagnetic waves through the antenna 1 for radiation.
- the wireless communication module 160 can provide wireless communication solutions applied on the electronic device 100, including wireless local area networks (wireless local area networks, WLAN) (such as a wireless fidelity (Wireless Fidelity, Wi-Fi) network), bluetooth (bluetooth, BT), global navigation satellite system (global navigation satellite system, GNSS), frequency modulation (frequency modulation, FM), near field communication technology (near field communication, NFC), infrared technology (infrared, IR), etc.
- the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
- the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
- the wireless communication module 160 can also receive the signal to be sent from the processor 110 , frequency-modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
- the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
- Wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
- GNSS can include the global positioning system (global positioning system, GPS), the global navigation satellite system (global navigation satellite system, GLONASS), the BeiDou navigation satellite system (beidou navigation satellite system, BDS), the quasi-zenith satellite system (quasi-zenith satellite system, QZSS) and/or satellite based augmentation systems (satellite based augmentation systems, SBAS).
- the electronic device 100 realizes the display function through the GPU, the display screen 194 , and the application processor.
- the GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
- Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
- the display screen 194 is used to display images, videos and the like.
- the display screen 194 includes a display panel.
- the display panel can be a liquid crystal display (liquid crystal display, LCD), an organic light-emitting diode (organic light-emitting diode, OLED), an active-matrix organic light-emitting diode (active-matrix organic light emitting diode, AMOLED), a flexible light-emitting diode (flex light-emitting diode, FLED), a Mini-LED, a Micro-LED, a Micro-OLED, quantum dot light emitting diodes (quantum dot light emitting diodes, QLED), etc.
- the electronic device 100 may include 1 or N display screens 194 , where N is a positive integer greater than 1.
- the electronic device 100 can realize the shooting function through the ISP, the camera 193 , the video codec, the GPU, the display screen 194 and the application processor.
- the ISP is used for processing the data fed back by the camera 193 .
- light is transmitted to the photosensitive element of the camera through the lens, where the optical signal is converted into an electrical signal; the photosensitive element of the camera then transmits the electrical signal to the ISP for processing, which converts it into an image visible to the naked eye.
- ISP can also perform algorithm optimization on image noise, brightness, and skin color.
- ISP can also optimize the exposure, color temperature and other parameters of the shooting scene.
- the ISP may be located in the camera 193 .
- Camera 193 is used to capture still images or video.
- the object generates an optical image through the lens and projects it to the photosensitive element.
- the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
- the photosensitive element converts the light signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
- the ISP outputs the digital image signal to the DSP for processing.
- DSP converts digital image signals into standard RGB, YUV and other image signals.
- the electronic device 100 may include 1 or N cameras 193 , where N is a positive integer greater than 1.
- Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform Fourier transform on the energy of the frequency point.
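- as an illustrative sketch (not taken from the patent), the kind of Fourier analysis described above can be pictured as an FFT over one frame of samples; the window choice and nearest-bin lookup below are assumptions:

```python
import numpy as np

def frequency_point_energy(samples, sample_rate, target_hz):
    """Estimate the energy at one frequency point via an FFT.

    A minimal sketch of the Fourier transform a DSP might perform when
    the device evaluates a frequency point; the Hann window and
    nearest-bin lookup are illustrative choices, not the patent's.
    """
    windowed = samples * np.hanning(len(samples))        # reduce spectral leakage
    spectrum = np.fft.rfft(windowed)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    nearest_bin = int(np.argmin(np.abs(freqs - target_hz)))
    return float(np.abs(spectrum[nearest_bin]) ** 2)     # energy at that bin
```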
- Video codecs are used to compress or decompress digital video.
- the electronic device 100 may support one or more video codecs.
- the electronic device 100 can play or record videos in various encoding formats, for example: moving picture experts group (moving picture experts group, MPEG) 1, MPEG2, MPEG3, MPEG4 and so on.
- the NPU is a neural-network (NN) computing processor.
- the NPU can quickly process input information and continuously learn by itself.
- Applications such as intelligent cognition of the electronic device 100 can be realized through the NPU, such as image recognition, face recognition, speech recognition, text understanding, and the like.
- the decision-making model provided by the embodiment of the present application can also be implemented by using the NPU.
- the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, so as to expand the storage capacity of the electronic device 100.
- the external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function, such as saving music, video and other files in the external memory card.
- the internal memory 121 may be used to store computer-executable program codes including instructions.
- the processor 110 executes various functional applications and data processing of the electronic device 100 by executing instructions stored in the internal memory 121 .
- the internal memory 121 may include an area for storing programs and an area for storing data.
- the stored program area can store an operating system, at least one application program required by a function (such as a sound playing function, an image playing function, etc.) and the like.
- the storage data area can store data created during the use of the electronic device 100 (such as audio data, phonebook, etc.) and the like.
- the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (universal flash storage, UFS) and the like.
- the electronic device 100 can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor.
- the audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signal.
- the audio module 170 may also be used to encode and decode audio signals.
- the audio module 170 may be set in the processor 110 , or some functional modules of the audio module 170 may be set in the processor 110 .
- the speaker 170A, also referred to as a "horn", is used to convert audio electrical signals into sound signals.
- the electronic device 100 can listen to music through the speaker 170A, listen to the sound in the video or listen to the hands-free call.
- the number of the speaker 170A may be one, or two or more than two.
- in the audio processing method provided in the embodiment of the present application, when the number of speakers 170A of the electronic device 100 is two or more, playing two-channel audio may be supported.
- when the number of speakers 170A of the electronic device 100 is two (the two speakers are respectively referred to as 170A-1 and 170A-2 here), the speakers 170A-1 and 170A-2 can be arranged at the top and the bottom of the electronic device 100, respectively. It should be noted that the "top" and "bottom" mentioned here refer to the "top" and "bottom" when the electronic device is placed upright.
- the receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals.
- the receiver 170B can be placed close to the human ear to receive the voice.
- the microphone 170C, also called a "mike" or a "mic", is used to convert sound signals into electrical signals. When making a phone call or sending a voice message, the user can put his mouth close to the microphone 170C to make a sound, so as to input the sound signal into the microphone 170C.
- the electronic device 100 may be provided with at least one microphone 170C. In some other embodiments, the electronic device 100 may be provided with two microphones 170C, which may also implement a noise reduction function in addition to collecting sound signals. In some other embodiments, the electronic device 100 can also be provided with three, four or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and realize directional recording functions, etc.
- the earphone interface 170D is used for connecting wired earphones.
- the earphone interface 170D can be a USB interface 130, or a 3.5mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
- the pressure sensor 180A is used to sense the pressure signal and convert the pressure signal into an electrical signal.
- pressure sensor 180A may be disposed on display screen 194 .
- there are many types of pressure sensors 180A, such as resistive pressure sensors, inductive pressure sensors, and capacitive pressure sensors.
- a capacitive pressure sensor may be comprised of at least two parallel plates with conductive material.
- the electronic device 100 determines the intensity of pressure according to the change in capacitance.
- the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A.
- the electronic device 100 may also calculate the touched position according to the detection signal of the pressure sensor 180A.
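- a minimal sketch of the capacitance-to-pressure logic described above; the linear mapping and the calibration constant `FULL_SCALE_DELTA_F` are assumptions, since the text only states that pressure intensity is determined from the change in capacitance:

```python
FULL_SCALE_DELTA_F = 2.0e-12  # assumed capacitance change (farads) at maximum pressure

def pressure_intensity(c_touch_f, c_rest_f):
    """Normalized pressure intensity from a capacitance change.

    When a touch presses the parallel plates closer together the
    capacitance rises; the increase is mapped linearly to [0, 1].
    A real driver would use a factory-calibrated curve instead.
    """
    delta = max(0.0, c_touch_f - c_rest_f)
    return min(1.0, delta / FULL_SCALE_DELTA_F)
```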
- the gyro sensor 180B can be used to determine the motion posture of the electronic device 100 .
- the angular velocity of the electronic device 100 around three axes may be determined by the gyro sensor 180B.
- the gyro sensor 180B can be used for image stabilization. Exemplarily, when the shutter is pressed, the gyro sensor 180B detects the shaking angle of the electronic device 100, calculates the distance that the lens module needs to compensate according to the angle, and allows the lens to counteract the shaking of the electronic device 100 through reverse movement to achieve anti-shake.
- the gyro sensor 180B can also be used for navigation and somatosensory game scenes.
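- a small-angle sketch of the anti-shake calculation described above: an angular shake of the body displaces the image by roughly f·tan(θ), so the lens module is shifted by the same amount in the opposite direction. The focal length value is whatever the active lens reports; nothing here is specific to the patent:

```python
import math

def ois_compensation_mm(shake_angle_deg, focal_length_mm):
    """Distance the lens module must move to offset a detected shake.

    The gyro reports the shake angle; the compensation distance is
    approximately focal_length * tan(angle), applied as a reverse
    movement of the lens.
    """
    return focal_length_mm * math.tan(math.radians(shake_angle_deg))
```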
- the air pressure sensor 180C is used to measure air pressure.
- the electronic device 100 calculates the altitude based on the air pressure value measured by the air pressure sensor 180C to assist positioning and navigation.
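- the text only states that altitude is calculated from the measured air pressure; one standard way to do this is the international barometric formula, shown below as an illustrative sketch:

```python
def altitude_m(pressure_hpa, sea_level_hpa=1013.25):
    """Approximate altitude in meters from barometric pressure.

    Uses the standard-atmosphere barometric formula; the sea-level
    reference pressure is an assumed default.
    """
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))
```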
- the magnetic sensor 180D includes a Hall sensor.
- the electronic device 100 may use the magnetic sensor 180D to detect the opening and closing of a flip leather case.
- in some embodiments, when the electronic device 100 is a flip phone, the electronic device 100 can detect the opening and closing of the flip according to the magnetic sensor 180D, and then set features such as automatic unlocking of the flip cover according to the detected opening and closing state.
- the acceleration sensor 180E can detect the magnitude of acceleration of the electronic device 100 in various directions (generally along three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of the electronic device, and can be applied to landscape/portrait switching, pedometers, and the like.
- the distance sensor 180F is used to measure the distance.
- the electronic device 100 may measure the distance by infrared or laser. In some embodiments, when shooting a scene, the electronic device 100 may use the distance sensor 180F for distance measurement to achieve fast focusing.
- Proximity light sensor 180G may include, for example, light emitting diodes (LEDs) and light detectors, such as photodiodes.
- the light emitting diodes may be infrared light emitting diodes.
- the electronic device 100 emits infrared light through the light emitting diode.
- Electronic device 100 uses photodiodes to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it may be determined that there is an object near the electronic device 100 . When insufficient reflected light is detected, the electronic device 100 may determine that there is no object near the electronic device 100 .
- the electronic device 100 can use the proximity light sensor 180G to detect that the user is holding the electronic device 100 close to the ear to make a call, so as to automatically turn off the screen to save power.
- the proximity light sensor 180G can also be used in leather case mode, automatic unlock and lock screen in pocket mode.
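- the reflected-light decision described above reduces to a threshold test; the threshold value and the hypothetical `screen` handle below are illustrative, not from the patent:

```python
REFLECTED_IR_THRESHOLD = 0.6  # assumed calibration constant in [0, 1]

def object_nearby(reflected_ir_level):
    """True when enough reflected infrared light is detected."""
    return reflected_ir_level >= REFLECTED_IR_THRESHOLD

def on_proximity_sample(reflected_ir_level, in_call, screen):
    # During a call, ear close to the phone -> screen off to save power.
    if in_call and object_nearby(reflected_ir_level):
        screen.turn_off()   # `screen` is a hypothetical display handle
    elif in_call:
        screen.turn_on()
```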
- the ambient light sensor 180L is used for sensing ambient light brightness.
- the electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
- the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
- the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in the pocket, so as to prevent accidental touch.
- the fingerprint sensor 180H is used to collect fingerprints.
- the electronic device 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, access to application locks, take pictures with fingerprints, answer incoming calls with fingerprints, and the like.
- the temperature sensor 180J is used to detect temperature.
- the electronic device 100 uses the temperature detected by the temperature sensor 180J to implement a temperature treatment strategy. For example, when the temperature reported by the temperature sensor 180J exceeds the threshold, the electronic device 100 may reduce the performance of the processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection.
- in some other embodiments, when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to prevent the electronic device 100 from being shut down abnormally due to the low temperature.
- in some other embodiments, when the temperature is lower than still another threshold, the electronic device 100 boosts the output voltage of the battery 142 to avoid an abnormal shutdown caused by the low temperature.
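- a sketch of the temperature treatment strategy above; the three threshold values are placeholders, since the text only speaks of "the threshold", "another threshold", and a still lower one:

```python
def thermal_action(temp_c, high_c=45.0, low_c=0.0, very_low_c=-10.0):
    """Pick the thermal-protection action described in the text.

    Thresholds are assumed placeholders. Order matters: the coldest
    condition is checked before the merely-cold one.
    """
    if temp_c > high_c:
        return "throttle_nearby_processor"    # cut power use, thermal protection
    if temp_c < very_low_c:
        return "boost_battery_output_voltage" # avoid shutdown from voltage sag
    if temp_c < low_c:
        return "heat_battery"                 # prevent abnormal low-temperature shutdown
    return "no_action"
```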
- the touch sensor 180K is also known as a "touch panel".
- the touch sensor 180K can be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touch screen, also called a “touch screen”.
- the touch sensor 180K is used to detect a touch operation on or near it.
- the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
- Visual output related to the touch operation can be provided through the display screen 194 .
- the touch sensor 180K may also be disposed on the surface of the electronic device 100 , which is different from the position of the display screen 194 .
- the bone conduction sensor 180M can acquire vibration signals. In some embodiments, the bone conduction sensor 180M can acquire the vibration signal of the vibrating bone mass of the human voice. The bone conduction sensor 180M can also contact the human pulse and receive the blood pressure beating signal. In some embodiments, the bone conduction sensor 180M can also be disposed in the earphone, combined into a bone conduction earphone.
- the audio module 170 can analyze the voice signal based on the vibration signal of the vibrating bone mass of the vocal part acquired by the bone conduction sensor 180M, so as to realize the voice function.
- the application processor can analyze the heart rate information based on the blood pressure beating signal acquired by the bone conduction sensor 180M, so as to realize the heart rate detection function.
- the keys 190 include a power key, a volume key and the like.
- the key 190 may be a mechanical key. It can also be a touch button.
- the electronic device 100 can receive key input and generate key signal input related to user settings and function control of the electronic device 100 .
- the motor 191 can generate a vibrating reminder.
- the motor 191 can be used for incoming call vibration prompts, and can also be used for touch vibration feedback.
- touch operations applied to different applications may correspond to different vibration feedback effects.
- the motor 191 may also correspond to different vibration feedback effects for touch operations acting on different areas of the display screen 194 .
- different application scenarios (for example: time reminder, receiving information, alarm clock, games, etc.) may also correspond to different vibration feedback effects.
- the touch vibration feedback effect can also support customization.
- the indicator 192 can be an indicator light, and can be used to indicate charging status, power change, and can also be used to indicate messages, missed calls, notifications, and the like.
- the SIM card interface 195 is used for connecting a SIM card.
- the SIM card can be connected and separated from the electronic device 100 by inserting it into the SIM card interface 195 or pulling it out from the SIM card interface 195 .
- the electronic device 100 may support one or more SIM card interfaces.
- SIM card interface 195 can support Nano SIM card, Micro SIM card, SIM card etc. Multiple cards can be inserted into the same SIM card interface 195 at the same time. The types of multiple cards may be the same or different.
- the SIM card interface 195 is also compatible with different types of SIM cards.
- the SIM card interface 195 is also compatible with external memory cards.
- the electronic device 100 interacts with the network through the SIM card to implement functions such as calling and data communication.
- the electronic device 100 adopts an eSIM, that is, an embedded SIM card.
- the eSIM card can be embedded in the electronic device 100 and cannot be separated from the electronic device 100 .
- the electronic device 100 can realize the shooting function through the ISP, the camera 193 , the video codec, the GPU, the display screen 194 and the application processor.
- the ISP is used for processing the data fed back by the camera 193 .
- light is transmitted to the photosensitive element of the camera through the lens, where the optical signal is converted into an electrical signal; the photosensitive element of the camera then transmits the electrical signal to the ISP for processing, which converts it into an image visible to the naked eye.
- ISP can also perform algorithm optimization on image noise, brightness, and skin color.
- ISP can also optimize the exposure, color temperature and other parameters of the shooting scene. The ISP is not limited to being integrated in the processor 110, and may also be disposed in the camera 193.
- the number of cameras 193 may be M, where M ≥ 2, and M is a positive integer.
- the number of cameras enabled by the electronic device 100 in dual-view recording may be N, where N ≤ M, and N is a positive integer.
- the camera 193 includes a lens and a photosensitive element (also called an image sensor) for capturing still images or videos.
- the object generates an optical image through the lens and projects it to the photosensitive element.
- the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
- the photosensitive element converts the optical signal into an electrical signal, and then transmits the electrical signal to the ISP for conversion into a digital image signal, such as standard RGB, YUV and other image signals.
- the hardware configuration and physical location of the camera 193 may be different, therefore, the size, range, content or definition of images collected by different cameras may be different.
- the sizes of images output by the camera 193 may be different or the same.
- the image output size of the camera refers to the length and width of the image captured by the camera. Both the length and width of the image can be measured in pixels.
- the output size of the camera can also be called image size, image size, pixel size or image resolution.
- the output ratio of common cameras can include: 4:3, 16:9 or 3:2, etc.
- the image output ratio refers to the approximate ratio of the number of pixels in the length and width of the image captured by the camera.
- the cameras 193 may correspond to the same focal length, or may correspond to different focal lengths.
- the focal lengths may include but are not limited to: a first focal length that is less than a preset value 1 (for example, 20mm); a second focal length that is greater than or equal to the preset value 1 and less than or equal to a preset value 2 (for example, 50mm); and a third focal length that is greater than the preset value 2.
- the camera corresponding to the first focal length may be called a super wide-angle camera, the camera corresponding to the second focal length may be called a wide-angle camera, and the camera corresponding to the third focal length may be called a telephoto camera.
- the larger the focal length corresponding to the camera, the smaller the field of view (field of view, FOV) of the camera.
- the field of view refers to the angle range that the optical system can image.
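- the three focal length ranges above amount to a simple classification; the 20 mm and 50 mm preset values are the examples given in the text:

```python
def camera_class(focal_length_mm, preset_1=20.0, preset_2=50.0):
    """Classify a camera as super wide-angle, wide-angle, or telephoto."""
    if focal_length_mm < preset_1:
        return "super_wide_angle"  # first focal length, widest FOV
    if focal_length_mm <= preset_2:
        return "wide_angle"        # second focal length
    return "telephoto"             # third focal length, narrowest FOV
```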
- the camera 193 can be arranged on both sides of the electronic device.
- the camera located on the same plane as the display screen 194 of the electronic device may be called a front camera, and the camera located on the plane where the back cover of the electronic device is located may be called a rear camera.
- the front camera can be used to collect images of the photographer facing the display screen 194, and the rear camera can be used to collect images of objects (such as people, scenery, etc.) facing the photographer.
- camera 193 may be used to collect depth data.
- the camera 193 may have a time-of-flight (time of flight, TOF) 3D sensing module or a structured light (structured light) 3D sensing module for acquiring depth information.
- the camera used to collect depth data may be a front camera or a rear camera.
- Video codecs are used to compress or decompress digital images.
- the electronic device 100 may support one or more image codecs. In this way, the electronic device 100 can open or save pictures or videos in various encoding formats.
- the electronic device 100 may implement a display function through a GPU, a display screen 194, an application processor, and the like.
- the GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
- Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
- the display screen 194 is used to display images, videos and the like.
- the display screen 194 includes a display panel.
- the display panel can be a liquid crystal display (liquid crystal display, LCD), an organic light-emitting diode (organic light-emitting diode, OLED), an active-matrix organic light-emitting diode (active-matrix organic light-emitting diode, AMOLED), a flexible light-emitting diode (flex light-emitting diode, FLED), a Mini-LED, a Micro-LED, a Micro-OLED, quantum dot light-emitting diodes (quantum dot light emitting diodes, QLED), etc.
- electronic device 100 may include one or more display screens 194 .
- the display screen 194 can display the dual-channel images from the two cameras 193 by splicing or picture-in-picture, so that the dual-channel images from the two cameras 193 can be presented to the user at the same time.
- the processor 110 may synthesize multiple frames of images from the two cameras 193 .
- the two video streams from the two cameras 193 are combined into one video stream, and the video encoder in the processor 110 can encode the combined data of one video stream to generate a video file.
- each frame of image in the video file can contain two images from two cameras 193 .
- the display screen 194 can display two-way images from the two cameras 193, so as to show the user, at the same moment or in the same scene, two image pictures of different ranges, different resolutions, or different details.
- the processor 110 can associate the image frames from different cameras 193, so that when the captured pictures or videos are played, the display screen 194 can display the associated image frames in the viewfinder at the same time.
- videos simultaneously recorded by different cameras 193 may be stored as different videos, and pictures simultaneously recorded by different cameras 193 may be stored as different pictures respectively.
- the two cameras 193 may collect images at the same frame rate, that is, the two cameras 193 collect the same number of image frames at the same time.
- Videos from different cameras 193 can be stored as different video files respectively, and the different video files are related to each other.
- the image frames are stored in the video file according to the order in which the image frames are collected, and the different video files include the same number of image frames.
- the display screen 194 can display the image frames contained in the associated video files according to a preset or user-instructed layout mode, so that multiple frames of images corresponding to the same order in different video files are displayed on the same interface.
- the two cameras 193 may collect images at the same frame rate, that is, the two cameras 193 collect the same number of image frames at the same time.
- the processor 110 can stamp a time stamp on each frame of image from different cameras 193, so that when the recorded video is played, the display screen 194 can simultaneously display multiple frames of images from the two cameras 193 on the same interface according to the time stamps.
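- a minimal sketch of the timestamp pairing idea above, assuming frames arrive in capture order; the tolerance value and data layout are illustrative:

```python
from dataclasses import dataclass

@dataclass
class Frame:
    timestamp_us: int  # stamp written by the processor at capture time
    data: bytes

def pair_by_timestamp(stream_a, stream_b, tolerance_us=1000):
    """Pair frames from two cameras so they can share one interface.

    For each frame of stream A, find the stream-B frame whose time
    stamp is closest; pair them if they are within the tolerance.
    """
    if not stream_b:
        return []
    pairs, j = [], 0
    for fa in stream_a:
        while (j + 1 < len(stream_b) and
               abs(stream_b[j + 1].timestamp_us - fa.timestamp_us) <=
               abs(stream_b[j].timestamp_us - fa.timestamp_us)):
            j += 1
        if abs(stream_b[j].timestamp_us - fa.timestamp_us) <= tolerance_us:
            pairs.append((fa, stream_b[j]))
    return pairs
```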
- in a video recording scenario, the user usually shoots with the electronic device held in the hand, and hand-held shooting usually causes the captured picture to shake.
- the processor 110 may separately perform anti-shake processing on image frames captured by different cameras 193 . Then, the display screen 194 displays the image after anti-shake processing.
- FIG. 4 exemplarily shows a user interface 40 for an application program menu on the electronic device 100.
- the electronic device 100 may be configured with a plurality of cameras 193 , and the plurality of cameras 193 may include a front camera and a rear camera. Wherein, there may be multiple front cameras, such as a front camera 193-1 and a front camera 193-2.
- the front camera 193-1 and the front camera 193-2 can be disposed on the top of the electronic device 100.
- the speaker 170A is a speaker on the top of the electronic device 100.
- a rear camera 193 and an illuminator 197 may also be configured on the back of the electronic device 100 .
- the home screen interface 40 includes a calendar widget (widget) 401, a weather widget 402, application icons 403, a status bar 404 and a navigation bar 405. Wherein:
- the calendar widget 401 can be used to indicate the current time, such as the date, the day of the week, the hour and minute information, and the like.
- the weather widget 402 can be used to indicate the type of weather, such as cloudy to sunny or light rain, can also be used to indicate information such as temperature, and can also be used to indicate a location.
- the application icons 403 may include, for example, a WeChat icon, a Twitter icon, a Facebook icon, a Sina Weibo icon, a Tencent QQ icon, a YouTube icon, a Gallery icon, a Camera icon 4031, etc., and may also include icons of other applications, which is not limited in this embodiment of the present application.
- the icon of any application may be used to respond to user operations, such as touch operations, so that the electronic device 100 starts the application corresponding to the icon.
- the status bar 404 may include the name of the operator (such as China Mobile), the time, a Wi-Fi icon, the signal strength, and the current remaining power.
- the navigation bar 405 may include system navigation keys such as a return button 4051, a home screen button 4052, and a recent task history button 4053.
- the home screen interface 40 is an interface displayed by the electronic device 100 after a user operation acting on the home screen button 4052 is detected on any user interface.
- when it is detected that the user taps the return button 4051, the electronic device 100 may display the previous user interface of the current user interface.
- when it is detected that the user taps the home screen button 4052, the electronic device 100 may display the home screen interface 40.
- when it is detected that the user taps the recent task history button 4053, the electronic device 100 may display the tasks recently opened by the user.
- the names of the navigation keys may also be different; for example, 4051 can be called the Back Button, 4052 can be called the Home Button, and 4053 can be called the Menu Button, and this application does not limit this.
- the navigation keys in the navigation bar 405 are not limited to virtual keys, and can also be implemented as physical keys.
- FIG. 4 only exemplarily shows the user interface on the electronic device 100 and should not be construed as limiting the embodiment of the present application.
- FIG. 5 exemplarily shows the process of the electronic device 100 entering the "dual-view video recording mode" in response to a detected user operation.
- the electronic device may detect a touch operation (such as a click operation on the icon 4031) acting on the icon 4031 of the camera as shown in (A) in FIG. 4, and start the camera application program in response to the operation.
- the electronic device 100 can display the user interface 50 .
- "Camera" is an application program for capturing images on electronic devices such as smartphones and tablet computers, and this application does not limit the name of the application program. That is to say, the user can click the icon 4031 to open the user interface 50 of "Camera". Not limited to this, the user can also open the user interface 50 in other applications; for example, the user may click a shooting control in a social application to open the user interface 50, where the social application allows users to share the photos they have taken with others.
- FIG. 5 exemplarily shows a user interface 50 of a "camera” application program on an electronic device such as a smart phone.
- the user interface 50 may also include a thumbnail control 501, a shooting control 502, a camera switching control 503, a viewfinder frame 505, a focus control 506A, a setting control 506B, and a flash switch 506C. Wherein:
- the thumbnail control 501 is used for the user to view the pictures and videos that have been taken.
- the shooting control 502 is configured to make the electronic device 100 take a picture or a video in response to a user's operation.
- the time when the electronic device 100 starts to shoot video may be referred to as time T1.
- the camera switching control 503 is used to switch the camera for capturing images between the front camera and the rear camera.
- the viewfinder frame 505 is used to preview and display the collected pictures in real time, wherein the dividing line 5051 is the lower boundary of the viewfinder frame 505 , and the upper boundary of the screen of the electronic device 100 is the upper boundary of the viewfinder frame 505 .
- the focus control 506A is used to focus the camera.
- the way of focusing the camera is not limited to touching the focus control; the user may also focus through a two-finger zoom operation acting on the viewfinder frame.
- the zoom factor changes with the two-finger zoom gesture. When the gesture is a two-finger spread (zoom-in) gesture, the larger the magnitude of the gesture, the greater the zoom factor of the corresponding camera. When the gesture is a two-finger pinch (zoom-out) gesture, the larger the magnitude of the gesture, the smaller the zoom factor of the corresponding camera.
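- the gesture-to-zoom mapping above can be sketched as follows; the multiplicative rule and the 1×–6× clamp are assumptions (the text only fixes the direction of the change):

```python
def updated_zoom(current_zoom, spread_ratio, min_zoom=1.0, max_zoom=6.0):
    """New zoom factor after a two-finger gesture.

    `spread_ratio` is the current finger spacing divided by the spacing
    when the gesture began: > 1 for a spread (zoom in), < 1 for a
    pinch (zoom out); a larger gesture magnitude gives a larger change.
    """
    return max(min_zoom, min(max_zoom, current_zoom * spread_ratio))
```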
- the setting control 506B is used to set various parameters when capturing images.
- the flash light switch 506C is used to turn on/off the flash light.
- the function bar 504 includes a night scene shooting control 504A, a portrait shooting control 504B, a photo shooting control 504C, a short video shooting control 504D, a video recording control 504E and more shooting option controls 504F.
- any shooting option control can be used to respond to user operations, such as touch operations, so that the electronic device 100 starts the shooting mode corresponding to the icon.
- the electronic device 100 may display the user interface 60 in response to the user operation, that is, a touch operation on the more shooting option control 504F.
- the camera application interface 60 can also include a variety of controls for selecting a shooting mode, such as a slow motion mode control, a panorama shooting control, a black and white mode control, and a dual-view video recording control 601, and can also include controls for other shooting modes, which is not limited in this embodiment of the present application.
- the dual-view video recording control 601 is used to make the camera enter the dual-view video recording mode in response to a user operation, that is, a touch operation on the dual-view video recording control 601.
- the dual-view recording control 601 may be included in the interface 50, or may be included in other user interfaces of the camera application, which is not limited in this embodiment of the present application.
- the electronic device 100 may automatically enter the "dual-view recording mode" by default after starting the "camera”. In some other embodiments, after the electronic device 100 starts the “camera”, if it has not entered the "dual view recording mode", it may enter the "dual view recording mode" in response to a detected user operation. Not limited thereto, the electronic device 100 can also enter the "dual-view recording mode” in other ways, for example, the electronic device 100 can also enter the "dual-view recording mode" according to the user's voice command, which is not limited in this embodiment of the present application.
- FIG. 5 exemplarily shows the scene of changing the form of the recording interface in the up-and-down split-screen mode.
- the electronic device 100 can use two cameras to collect images and display a preview interface on the display screen. As shown in the user interface 70 in (C) of FIG. 5 , when entering the dual-view recording mode at the beginning, the electronic device 100 will automatically select the recording mode of the upper and lower split screens by default. In some embodiments, when entering the dual-view video recording mode, the default split-screen mode may also be other modes, such as the picture-in-picture mode, which is not limited in this embodiment of the present application. User interface 70 may also include:
- the upper viewfinder frame 701 is used for real-time preview display of the image captured by the first camera.
- the separation line 706 is the lower boundary of the upper viewfinder frame 701, and the upper boundary of the screen of the electronic device 100 is the upper boundary of the viewfinder frame 701.
- the viewfinder frame 701 may include: a camera switching control 701B, which is used to switch the camera for capturing images between the front camera and the rear camera.
- the user may click the camera switching control 701B to change the camera corresponding to the viewfinder frame 701 from the front camera 193-1 to the rear camera 193-3.
- the viewfinder frame corresponding to the front camera may not include the focus control 701A. That is to say, in this embodiment and subsequent embodiments, when the electronic device 100 performs front viewfinding, the front camera picture may not support focus adjustment, and the focal length of the front camera is fixed at wide-angle, telephoto or another focal length; alternatively, the front camera picture can also support focusing like the rear camera, in which case the interface includes a focus control for focusing.
- the lower viewfinder frame 702 is used for real-time preview display of the image captured by the second camera.
- the separation line 706 is the upper boundary of the lower viewfinder frame 702, and the lower boundary of the screen of the electronic device 100 is the lower boundary of the viewfinder frame 702.
- the lower viewfinder frame 702 may include: a focus control 702A, which is used to focus the second camera.
- the camera switching control 702B is used to switch the camera for capturing images between the front camera and the rear camera.
- the thumbnail control 703 is used for the user to view the pictures and videos that have been taken.
- the shooting control 704 is configured to enable the electronic device 100 to shoot a video in response to a user's operation.
- the moment when the electronic device 100 starts to shoot video may be referred to as the first moment T1.
- a certain moment when the electronic device 100 shoots a video in the dual-view video recording mode may be referred to as a second moment T2.
- the duration between the first moment T1 and the second moment T2 may be referred to as a first duration t1.
- the moment T1 is earlier than the moment T2.
- the filter control 705 is used to set a filter when capturing an image.
- a flashlight switch 706 is used to turn on/off the flashlight.
- the separation line 706 is used to divide the upper viewing frame 701 and the lower viewing frame 702 .
- the screen of the electronic device is divided up and down into an upper viewfinder frame 701 and a lower viewfinder frame 702; the upper viewfinder frame 701 correspondingly displays images from the front camera 193-1, and the lower viewfinder frame 702 correspondingly displays images from the rear camera 193-3.
- the image in the upper viewfinder frame 701 is the image of the photographer facing the display screen of the electronic device 100
- the image in the lower viewfinder frame 702 is the image of the subject (such as a person, a landscape, etc.) the photographer is facing.
- when the electronic device 100 uses the user interface shown in (C) of FIG. 5 to record audio and the audio is played, since the upper viewfinder frame 701 and the lower viewfinder frame 702 have the same area, the user can feel that the sound of the surrounding environment of the subject in the upper viewfinder frame 701 (that is, on the side of the front camera of the electronic device, hereinafter referred to as sound 1) and the sound of the surrounding environment of the subject in the lower viewfinder frame 702 (that is, on the side of the rear camera of the electronic device, hereinafter referred to as sound 2) have the same loudness.
- optionally, the user can feel that the sound of the surrounding environment of the subject in the upper viewfinder frame 701 (that is, sound 1) is emitted from the upper part of the electronic device, and the sound of the surrounding environment of the subject in the lower viewfinder frame 702 (that is, sound 2) is emitted from the lower part of the electronic device.
- it should be noted that the speakers of the electronic device (such as the top speaker or/and the bottom speaker or/and the back speaker) will all play the sound of the surrounding environment of the subject in the upper viewfinder frame 701, and will also play the sound of the surrounding environment of the subject in the lower viewfinder frame 702; that is, the speakers of the electronic device play both sound 1 and sound 2. In this way, the feeling of stereo sound is enhanced: the user feels that the sound of the upper viewfinder frame is emitted above and the sound of the lower viewfinder frame is emitted below, which enhances the sense of experience and interest.
- the electronic device 100 may adjust the areas of the upper viewing frame 701 and the lower viewing frame 702 in response to a user's touch operation on the separation line 706 , such as a sliding operation.
- when the electronic device 100 records audio using the user interface shown in (D) of FIG. 5, the difference from the sound recorded in (C) is that the loudness of sound 1 is smaller than that of sound 2. Optionally, the user can still feel that sound 1 is emitted from the upper part of the electronic device and sound 2 is emitted from the lower part of the electronic device; for details, refer to the description of (C) in FIG. 5, which will not be repeated here.
- the user can also make the area of the upper viewfinder frame 701 larger and the area of the lower viewfinder frame 702 smaller by sliding down on the dividing line 709. Correspondingly, when the audio recorded on this interface is played, compared with the audio recorded on the interface before the dividing line 709 was slid, the loudness of sound 1 becomes larger and the loudness of sound 2 becomes smaller.
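- the behavior above (equal areas give equal loudness, and shrinking a frame lowers its sound) can be sketched with area-proportional gains; the proportional rule itself is an assumption, since the text only requires that loudness track the frame areas:

```python
def sound_gains(upper_area_px, lower_area_px):
    """Relative playback gains for sound 1 and sound 2.

    Equal viewfinder areas give equal loudness; dragging the dividing
    line to shrink one frame lowers that frame's sound accordingly.
    """
    total = float(upper_area_px + lower_area_px)
    return upper_area_px / total, lower_area_px / total  # (sound 1, sound 2)
```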
- the screen swap control 708 is used to swap the images captured by the upper viewfinder frame 701 and the lower viewfinder frame 702 .
- in response to the user operation, that is, the touch operation on the screen swap control 718, the electronic device 100 exchanges the picture in the upper viewfinder frame 711 with the picture in the lower viewfinder frame 712; the final presentation effect can refer to the user interface 72 shown in (A) of FIG. 6.
- the sound effect in (A) in FIG. 6 refer to the description in (D) in FIG. 5 , which will not be repeated here.
- the user can also exchange the images of the two viewfinder frames through other operations. For example, the user can switch the images of the two viewfinder frames through a sliding operation on the upper viewfinder frame 711 or an upward slide operation on the lower viewfinder frame 712 .
- the split-screen option control 709 is used to make the electronic device 100 switch the split-screen mode of the dual-view recording mode in response to the user's operation.
- the user can also change the recording interface by adjusting the focus, switching the front/rear lens, and changing the split-screen mode.
- FIG. 6 exemplarily shows the scene where the user switches the split-screen mode in the dual-view video recording mode.
- the electronic device 100 displays a split-screen option box 710 .
- the split-screen option box 710 may contain various split-screen option controls, such as an up-and-down split-screen control 710A (regular split screen), an up-and-down split-screen control 710B (irregular split screen), a picture-in-picture control 710C (the shape of the small viewfinder frame is square), and a picture-in-picture control 710D (the shape of the small viewfinder frame is circular).
- the split-screen option box 710 may also include controls for other split-screen options, which is not limited in this embodiment of the present application.
- Any split-screen option control can be used to respond to a user's operation, such as a touch operation, so that the electronic device 100 starts the split-screen recording mode corresponding to the control.
- the switching of the split-screen mode is not limited to switching through the split-screen option box shown in (B) in FIG. 6.
- for example, the electronic device 100 may directly switch to another split-screen mode different from the current split-screen mode in response to the user's touch operation on the split-screen option control 729 in (A) of FIG. 6; after the user touches the split-screen option control 729 in (A) of FIG. 6 again, the electronic device 100 switches to yet another split-screen mode.
- in response to the user operation, that is, the touch operation acting on the picture-in-picture control 730C, the electronic device 100 starts the dual-view video recording mode in the picture-in-picture split-screen mode and displays the picture-in-picture recording interface.
- the user interface 80 may include a main viewfinder frame 801, a sub viewfinder frame 802, and other controls such as a shooting control and a filter control (details are not repeated here), wherein:
- the main viewing frame 801 (also referred to as the main picture area) is used for real-time preview display of the image captured by the first camera.
- the main viewing frame 801 may include: a focusing control 801A, which is used to adjust the focus of the first camera.
- the camera switching control 801B is used to switch the camera for capturing images between the front camera and the rear camera.
- the user may click the camera switch control 801B to change the camera corresponding to the viewfinder frame 801 from the front camera 193-1 to the rear camera 193-3.
- the sub-frame 802 (also referred to as the sub-picture area) is used for real-time preview display of the image captured by the second camera.
- the sub-viewing frame 802 may include: a focus control 802A, which is used to focus the second camera.
- the camera switching control 802B is used to switch the camera for capturing images between the front camera and the rear camera.
- the user may click the camera switching control 802B to change the camera corresponding to the viewfinder frame 802 from the rear camera 193-3 to the front camera 193-1.
- in some embodiments, the focus control 802A and the camera switching control 802B are presented in the main viewfinder frame 801.
- in some other embodiments, the focus control 802A and the camera switching control 802B are presented in the sub viewfinder frame 802; for details, refer to (G) in FIG. 6.
- in some other embodiments, the sub viewfinder frame may not include a focus control; for details, refer to (H) in FIG. 6.
- when the audio recorded on this interface is played, the loudness of the sound of the surrounding environment of the subject in the sub viewfinder frame 802 (that is, on the side of the front camera of the electronic device, hereinafter referred to as sound 1) is smaller than the loudness of the sound of the surrounding environment of the subject in the main viewfinder frame 801 (that is, on the side of the rear camera of the electronic device, hereinafter referred to as sound 2).
- optionally, the sound of the surrounding environment of the subject in the main viewfinder frame 801 (that is, sound 2) is emitted around the user without directionality, while the sound of the surrounding environment of the subject in the sub viewfinder frame 802 (that is, sound 1) is emitted from the upper left corner of the electronic device (when the user holds the mobile phone in the direction shown in (C) of FIG. 6).
- it should be noted that the speakers of the electronic device (such as the top speaker or/and the bottom speaker or/and the rear speaker) will all play the sound of the surrounding environment of the subject in the sub viewfinder frame 802, and will also play the sound of the surrounding environment of the subject in the main viewfinder frame 801; that is, the speakers of the electronic device play both sound 1 and sound 2.
- in this way, the feeling of stereo sound is enhanced, and the user feels that the sound also presents a picture-in-picture effect, which enhances the sense of experience and interest.
- in some embodiments, the user can also feel that the above-mentioned sound 1 comes from the left side of the user.
- the screen of the electronic device is divided into a main view frame 801 and a sub view frame 802 .
- the main viewing frame 801 corresponds to displaying images from the rear camera 193-3
- the sub-viewing frame 802 corresponds to displaying images from the front camera 193-1.
- the image in the main frame 801 is the image of the subject (such as a person, a landscape, etc.) facing the photographer
- the image in the sub frame 802 is the image of the photographer facing the display screen of the electronic device 100 .
- the default area and orientation of the sub viewfinder frame 802 are not limited to the pattern shown in (C) in FIG. 6; the area of the viewfinder frame 802 may be larger or smaller than that shown in (C) in FIG. 6, and the orientation is not limited in this embodiment of the present application.
- FIG. 6 exemplarily shows the scene of adjusting the style of the recording interface in the "picture in picture" mode.
- the electronic device 100 may adjust the area and orientation of the viewing frame 802 in response to the detected user operation.
- the electronic device 100 may change the orientation of the viewfinder frame 802 on the screen in response to a sliding operation acting on the viewfinder frame 802 .
- for the changed orientation of the viewfinder frame 802, refer to (D) in FIG. 6, where the orientation of the viewfinder frame 802 changes from the upper left to the lower right of the screen of the electronic device 100. In this way, the loudness of sound 1 and sound 2 remains unchanged, but the user feels that sound 1 has moved from the upper left to the lower right of the electronic device.
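- a constant-power stereo pan reproduces the effect just described: the apparent position of sound 1 follows the sub viewfinder frame while its loudness stays unchanged. Constant-power panning is an assumed rendering choice, not stated in the patent:

```python
import math

def pan_gains(sub_frame_center_x, screen_width):
    """Left/right gains for sound 1 based on the sub-frame position.

    pan = 0 puts the sound fully left, pan = 1 fully right; the
    cos/sin pair keeps left^2 + right^2 == 1, so perceived loudness
    stays constant while the apparent position moves.
    """
    pan = min(1.0, max(0.0, sub_frame_center_x / screen_width))
    return math.cos(pan * math.pi / 2.0), math.sin(pan * math.pi / 2.0)
```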
- in response to the two-finger zoom-in operation acting on the viewfinder frame 812, the electronic device 100 enlarges the area of the viewfinder frame 802.
- the electronic device 100 may also reduce the area of the viewfinder frame 802 in response to the two-finger zoom-out operation acting on the viewfinder frame 812 .
- the electronic device 100 can also adjust the focal length of the camera in response to the detected user operation. As shown in (E) of FIG. 6, in response to the sliding operation on the focus control 822A, the electronic device will increase the focal length multiple of the camera corresponding to the viewfinder frame 822 from 2× to 3×.
- the adjusted user interface can refer to (F) in FIG. 6: the field of view presented by the picture in the viewfinder frame 832 in (F) of FIG. 6 is smaller than the field of view presented by the picture in the viewfinder frame 822 in (E) of FIG. 6, but the subject in the picture of the viewfinder frame 832 appears larger than in the picture of the viewfinder frame 822 shown in (E) of FIG. 6.
- Table 1 only exemplarily shows the focal length multiples that the camera in the electronic device 100 can provide, and is not limited to the focal length multiples contained in Table 1.
- the camera in the electronic device 100 can also provide users with other focal length multiple options, such as 7×, 8×, etc., which is not limited in this embodiment of the present application.
- the corresponding relationship between each focal length multiple and the viewing angle in the electronic device 100 may not be limited to the corresponding relationship shown in Table 1.
- the field angle corresponding to the focal length multiple of 1× may be 170°
- the field angle corresponding to the focal length multiple of 2× may be 160°, which is not limited in this embodiment of the present application.
- this corresponding relationship is fixed when the electronic device 100 is produced and leaves the factory. That is to say, when the electronic device 100 is shooting, the electronic device 100 can obtain the size and range of the field of view at that time according to the front/rear information and the focal length multiple of the camera used for shooting.
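- since the focal-multiple-to-FOV relationship is fixed at the factory, obtaining the field of view reduces to a table lookup keyed by front/rear information and the focal length multiple. The table contents below are placeholders except for the 170°/160° examples mentioned above:

```python
# Placeholder stand-in for Table 1; only the 1x and 2x entries echo the
# example values in the text, the rest are invented for illustration.
FOV_DEG_BY_MULTIPLE = {1: 170, 2: 160, 3: 90, 4: 60, 5: 45, 6: 30}

def field_of_view_deg(is_front_camera, focal_multiple):
    """Factory-fixed FOV for a camera configuration.

    The sign convention (+ for the front camera side, - for the rear)
    mirrors angles written in the text such as +180 deg and -90 deg.
    """
    fov = FOV_DEG_BY_MULTIPLE[focal_multiple]
    return fov if is_front_camera else -fov
```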
- FIG. 7A exemplarily shows a recording interface where the electronic device 100 splits the screen up and down and simultaneously uses the front and rear cameras to perform dual-view video recording.
- the recording interface 90 shown in FIG. 7A may include multiple controls, such as a capture control 904 , a filter control 905 and some other controls.
- the electronic device 100 can switch the recording screen in response to the user's touch operation on these controls. For details, refer to the above-mentioned related description of FIG. 5 , which will not be repeated here.
- in the embodiment of the present application, the front camera of the electronic device 100 only provides a fixed focal length multiple of 1×, that is, the field of view of the front camera is fixed at +180°; the rear camera of the electronic device 100 can provide six different focal length multiples from 1× to 6×, and the user can switch the focal length multiple of the rear camera by touching the focus adjustment control 902A, so as to adjust the field of view range of the picture in the viewfinder frame 902.
- the user can also use the camera switching control 901B to switch the camera that collects images in the upper viewfinder frame 901 between the front camera and the rear camera, and use the camera switching control 902B to switch the camera that collects images in the lower viewfinder frame 902 between the front camera and the rear camera.
- the user may click the camera switching control 901B to change the camera corresponding to the viewfinder frame 901 from the front camera 193-1 to the rear camera 193-3.
- In some embodiments, the focal length of the front camera also supports switching, and the recording interface then also includes a focus control for adjusting it; for example, the control 911A in the recording interface 91 shown in FIG. 7B can be used to switch the focal length of the front camera.
- the electronic device 100 is using a front camera with a fixed focal length of 1 ⁇ and a rear camera with a focal length of 3 ⁇ to perform dual-view video recording in a split-screen mode.
- the picture presented by the upper viewfinder frame 901 (hereinafter referred to as picture 1) is the picture taken by the front camera (such as 193-1), which is the user's own face;
- The picture presented by the lower viewfinder frame 902 (hereinafter referred to as picture 2) is the picture captured by the rear camera (e.g., 193-3), which is a landscape image in front of the electronic device 100.
- the field of view of the front camera is +180° at this time, and the field of view of the rear camera is -90°.
- FIG. 7B exemplarily shows a recording interface where the electronic device 100 simultaneously uses the front and rear cameras to perform dual-view video recording in the picture-in-picture mode.
- the recording interface 91 shown in FIG. 7B may include a plurality of controls, and the electronic device 100 may switch the recording screen in response to the user's touch operation on these controls.
- the electronic device 100 is using a front camera with a focal length of 6 ⁇ and a rear camera with a focal length of 2 ⁇ to perform dual-view video recording in picture-in-picture mode.
- The picture presented by the viewfinder frame 911 (hereinafter referred to as the main picture) is the picture taken by the rear camera (such as 193-4) with a focal length multiple of 2×, which is the scenery in front of the electronic device 100; the picture presented by the viewfinder frame 912 (hereinafter referred to as the sub-picture) is the picture taken by the front camera (for example, 193-2) with a focal length multiple of 6×, which is the user's own face. It can be known from the above Table 1 that, at this time, the viewing angle corresponding to the main picture is -120°, and the viewing angle corresponding to the sub-picture is +30°.
- It should be noted that when the electronic device 100 collects audio signals in the environment, it still collects audio signals from all directions (i.e., over the full 360° of space). However, in order to make the recorded audio match the viewing angle ranges presented to the user by the two pictures, after collecting the audio signals transmitted from all angles, the electronic device 100 can filter the received audio signals by angle in combination with the respective viewing angles of the two pictures, to obtain audio signals enhanced in the directions of the two viewing angles respectively.
- FIG. 8A is a schematic diagram of a scenario in which an electronic device filters an audio signal in combination with a viewing angle of a screen according to an embodiment of the present application.
- FIG. 8A presents a top view of the electronic device 100 when it is placed upright, and the electronic device can be regarded as a point P.
- OPO' is the plane where the electronic device is located
- QQ' is the normal of the plane.
- the left side of OPO' represents the side where the front camera of the electronic device is located
- the right side of OPO' represents the side where the rear camera of the electronic device 100 is located.
- The shaded part on the left side of OPO' indicates the field of view range (0° to +180°) that the front camera of the electronic device 100 can obtain at a focal length of 1×, and the blank part on the right side of OPO' indicates the field of view range (-180° to 0°) that the rear camera of the electronic device 100 can obtain at a focal length of 1×.
- ∠OPO' (right side), ∠APA', ∠BPB' and ∠CPC' are the field of view angles corresponding to the rear camera of the electronic device 100 at 1×, 2×, 3× and 6× respectively.
- ∠OPO' (left side), ∠DPD', ∠EPE' and ∠FPF' are the field of view angles corresponding to the front camera of the electronic device 100 at 1×, 2×, 3× and 6× respectively; the value of each angle can be obtained by referring to the aforementioned Table 1, and will not be repeated here.
- At this time, the size and direction of the field of view of the front camera of the electronic device 100 are consistent with ∠OPO' (left side) in FIG. 8A: the size is +180°, and the boundary of the field of view is the rays where PO and PO' lie.
- The size and direction of the viewing angle of the rear camera of the electronic device 100 are consistent with ∠BPB' in FIG. 8A: the size is -90°, and the boundary of the viewing angle is the rays where PB and PB' lie. Therefore, when the user uses the recording interface shown in FIG. 7A to record, the electronic device 100 will, according to the field of view ∠OPO' (left side) of the above-mentioned picture 1, suppress the audio signals collected from angular directions other than ∠OPO' (left side) to obtain audio signal 1 in the same angular direction as ∠OPO' (left side); and, according to the field of view ∠BPB' of the above-mentioned picture 2, suppress the audio signals collected from angular directions other than ∠BPB' to obtain audio signal 2 in the same angular direction as ∠BPB'.
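- To make the suppression step concrete, the following sketch treats each picture's viewing angle as an angular sector centered on the corresponding camera's optical axis and keeps only sound directions inside that sector; the sector construction and the direction convention are assumptions for illustration, not the patent's implementation.

```python
def sector(front: bool, fov_deg: float) -> tuple[float, float]:
    """Angular sector [lo, hi] covered by a camera's field of view.

    Directions are in degrees: +90 is straight out of the front camera,
    -90 straight out of the rear camera (top view, as in FIG. 8A).
    """
    center = 90.0 if front else -90.0
    half = abs(fov_deg) / 2.0
    return (center - half, center + half)

def keep_direction(theta: float, front: bool, fov_deg: float) -> bool:
    """True if a sound arriving from direction theta lies inside the
    picture's viewing angle (kept/enhanced); False if it is suppressed."""
    lo, hi = sector(front, fov_deg)
    return lo <= theta <= hi

# Picture 1 of FIG. 7A (front, +180°): a source at +30° is kept;
# picture 2 (rear 3x, 90° wide): the same source is suppressed.
print(keep_direction(30, front=True, fov_deg=180))    # True
print(keep_direction(30, front=False, fov_deg=-90))   # False
```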
- Assume that the above-mentioned picture 1 contains sound source A while sound source A' lies outside picture 1, and that the above-mentioned picture 2 contains sound source B while sound source B' lies outside picture 2. Then the sound signal obtained through the viewing angle of picture 1 will make the listener feel that the sound is mainly made by sound source A, and the user can perceive that the sound has an orientation corresponding to the orientation of picture 1 on the screen; the sound signal obtained through the viewing angle of picture 2 will make the listener feel that the sound is mainly made by sound source B, and the user can perceive that the sound has an orientation corresponding to the orientation of picture 2 on the screen.
- When the recording interface changes, the angular directions of the two viewing angles in the recording interface may change accordingly, and the angular directions selected by the electronic device 100 when filtering the audio signals will change with them.
- For example, in the recording interface shown in FIG. 7B, the size and direction of the field of view of the front camera (that is, the camera corresponding to the sub-picture) of the electronic device 100 are consistent with ∠FPF' in FIG. 8A: the size is +30°, and the boundary of the field of view is the rays where PF and PF' lie.
- The size and direction of the viewing angle of the rear camera of the electronic device 100 are consistent with ∠APA' in FIG. 8A: the size is -120°, and the boundary of the viewing angle is the rays where PA and PA' lie. Therefore, when the user uses the recording interface shown in FIG. 7B to record, the electronic device 100 will, according to the field of view ∠FPF' of the above-mentioned sub-picture, suppress the audio signals collected from angular directions other than ∠FPF' to obtain audio signal 3 in the same angular direction as ∠FPF'; and, according to the field of view ∠APA' of the above-mentioned main picture, suppress the audio signals collected from angular directions other than ∠APA' to obtain audio signal 4 in the same angular direction as ∠APA' (not shown in FIG. 8A).
- Similarly, in the recording interface 92 shown in FIG. 7C, when the electronic device 100 is using a front camera with a focal length of 3× and a front camera with a focal length of 6× for dual-view recording, the field angle corresponding to the 3× front camera in FIG. 8A is ∠EPE', and the field angle corresponding to the 6× front camera in FIG. 8A is ∠FPF'.
- Likewise, the field angle corresponding to the 1× rear camera in FIG. 8A is ∠OPO' (right side), and the field angle corresponding to the 2× rear camera in FIG. 8A is ∠APA'.
- In addition to the above examples, the electronic device 100 can also use other combinations of cameras and focal lengths to perform dual-view video recording. Accordingly, the angles and directions selected when it filters the audio signals will change, and the two audio signals obtained by filtering will also differ; these cases are not listed here one by one.
- In some embodiments, the electronic device 100 can select a specific filtering method (such as the CVX beam training method) to filter the audio in two angular directions, left and right, based on the field of view; this will be described next with reference to FIG. 8B.
- FIG. 8B is a schematic diagram of another scene where an electronic device filters an audio signal through a CVX beam training method in combination with a field of view angle of a picture provided by an embodiment of the present application.
- FIG. 8B presents a top view of the electronic device 100 when it is placed upright, and the electronic device can be regarded as a point P.
- For the specific meanings of the angles, reference may be made to the foregoing description of FIG. 8A, which will not be repeated here.
- the electronic device 100 is using the recording interface 90 shown in FIG. 7A to perform dual-view recording.
- the viewing angle of picture 1 is ⁇ OPO’ (left side) in FIG. 8B, and the boundary is the ray where PO and PO’ are located.
- The field of view of picture 2 is ∠BPB', and the boundary of the field of view is the rays where PB and PB' lie. It can be seen from FIG. 8B that the normal QQ' of the plane where the electronic device 100 is located is the angle bisector of ∠OPO' (left side) and ∠BPB'; the normal QQ' divides ∠OPO' (left side) into ∠OPQ' and ∠O'PQ', and divides ∠BPB' into ∠BPQ and ∠B'PQ. It is stipulated here that, for the electronic device 100, the upper side of the normal QQ' is the left side of the electronic device 100, and the lower side of the normal QQ' is the right side of the electronic device 100.
- In this case, the electronic device 100 can, according to the field of view ∠OPO' (left side) of picture 1 above, suppress the audio signals collected from angular directions other than ∠OPQ' and ∠O'PQ' respectively, to obtain the left-channel audio signal 11 in the same angular direction as ∠OPQ' and the right-channel audio signal 12 in the same angular direction as ∠O'PQ'; and, according to the field of view ∠BPB' of picture 2, suppress the audio signals collected from angular directions other than ∠BPQ and ∠B'PQ respectively, to obtain the left-channel audio signal 21 in the same angular direction as ∠BPQ and the right-channel audio signal 22 in the same angular direction as ∠B'PQ.
- the output audio can create a more three-dimensional sense of hearing for the user.
- the method for filtering audio signals shown in FIG. 8B is also applicable to other recording interfaces.
- four audio signals with differences between left and right channels can be obtained.
- the audio filtering scenarios performed by the electronic device 100 under each recording field interface will not be described one by one here.
- It should be noted that the audio signal illustrations in FIG. 8A and FIG. 8B are only used to distinguish the audio signals, and do not represent the actual waveforms of the audio signals when they propagate in space.
- FIG. 9 is a flowchart of an audio processing method provided by an embodiment of the present application. This method combines the fields of view of the two pictures in the dual-view video, takes the fields of view as the input of the DSB algorithm, filters the audio based on the DSB algorithm, and then remixes the audio, so that the obtained audio can have a stereoscopic sense synchronized with the pictures.
- the method provided in the embodiment of this application may include:
- S101: The electronic device turns on the dual-view video recording mode.
- For example, the electronic device may detect a touch operation (such as a click operation on the icon 4031) acting on the icon 4031 of the camera as shown in (A) in FIG. 4, and start the camera application program in response to the operation.
- Then, in response to a user operation, the dual-view video recording mode is turned on.
- the user operation may be a touch operation (such as a click operation) on the dual-view recording mode option 601 shown in (B) of FIG. 5 .
- the user operation may also be other types of user operations such as voice instructions.
- the electronic device 100 may select the "dual-view recording mode" by default after starting the camera application.
- S102: The electronic device displays a corresponding recording interface according to adjustments made by the user to the recording interface.
- Before starting to record, the electronic device can detect the user's settings of the interface style in the dual-view recording mode. Adjustments to the recording interface include but are not limited to the following:
- The electronic device may detect a touch operation (such as a click operation on the control 709) acting on the split-screen option control 709, and switch the split-screen style of the recording interface in response to the operation.
- The electronic device may detect a pinch-to-zoom operation acting on the viewfinder frame 802 as shown in (D) in FIG. 6, and zoom in on the viewfinder frame 802 in response to the operation; it may also detect a sliding operation on the separation line 706 shown in (C) in FIG. 5, and in response zoom out the viewfinder frame 701 while enlarging the viewfinder frame 702.
- The electronic device can detect a sliding operation on the focus control 802A shown in (D) in FIG. 6, and in response adjust the focal length of the corresponding camera, for example, from 2× to 3×.
- the above-mentioned adjustment of the focal length may be an adjustment of the focal length of the rear camera, or an adjustment of the front camera.
- the adjustment of the focal length may also be to adjust the focal lengths of the front camera and the rear camera at the same time.
- The electronic device may detect a click operation on the control 901B as shown in FIG. 7A, and switch the camera corresponding to the viewfinder frame 901 from the front camera to the rear camera in response to the operation. For the switched user interface, reference may be made to the recording interface 92 shown in FIG. 7C.
- The electronic device may detect a click operation on the screen switching control 708, and exchange the framing pictures of the two viewfinder frames in response to the operation. The switching of the framing pictures of the two viewfinder frames is actually an exchange of the cameras corresponding to the two viewfinder frames; therefore, the front/rear information and focal length information of the cameras corresponding to the two viewfinder frames are switched accordingly.
- S103: The electronic device collects audio.
- When the electronic device detects a user operation indicating to start recording a video, such as a click operation on the control 704 shown in (C) in FIG. 5, an audio collection device (such as a microphone) in the electronic device collects the audio signals in the environment.
- the electronic device may have M microphones, M>1, and M is a positive integer.
- the M microphones can collect audio signals in the environment at the same time to obtain M channels of audio signals.
- The audio signals are collected, that is, the collected sound is used as the input sound source of the electronic device. The collection range of the sound source can be determined according to the performance of the microphones: it can be an omnidirectional 360° spatial environment, or something else, such as directional spatial sound, which is not limited in this application.
- the above-mentioned M channels of audio signals may also be referred to as real-time environmental sounds.
- S104: The electronic device records images.
- When the electronic device detects a user operation indicating to start recording a video, such as a click operation on the control 704 shown in (C) in FIG. 5, the electronic device starts to use two cameras simultaneously to collect and shoot images.
- the electronic device may have N cameras, where N ⁇ 2, and N is a positive integer.
- the N cameras may be a combination of a front camera and a rear camera.
- The N cameras may also be any combination of wide-angle cameras, ultra-wide-angle cameras, telephoto cameras, or front cameras with any focal lengths.
- the present application does not limit the camera combination method of the N cameras.
- During recording, the electronic device will use two viewfinder frames on the screen to present the two channels of images collected by the two cameras, according to the user's camera selection in S102 (such as the selection of the front/rear camera and the selection of the camera focal length).
- The display screen of the electronic device can display the two channels of images from the two cameras through splicing (refer to the top-bottom split screen described above) or picture-in-picture, so that the two channels of images can be presented to the user simultaneously.
- steps S105-S107 in the embodiment of the present application will be described with reference to the recording interface 90 shown in FIG. 7A.
- S105: The electronic device acquires the fields of view of the pictures.
- In the recording interface 90 shown in FIG. 7A, the electronic device is using a front camera with a fixed focal length of 1× and a rear camera with a focal length of 3× to perform dual-view video recording in the top-bottom split-screen mode.
- the picture presented by the upper viewfinder frame 901 (hereinafter referred to as the first picture) is the picture taken by the front camera, which is the user's own face;
- the picture presented by the lower viewfinder frame 902 (hereinafter referred to as the second picture) is a picture captured by the rear camera, which is a landscape image in front of the electronic device 100 .
- S106: The electronic device calculates the picture weights.
- the calculation method of the picture weight of the two display areas under the upper and lower split-screen interface will be described in detail in combination with the recording interface shown in FIG. 7A.
- Assume that the length of the display screen of the electronic device is d0 and that its width is (d1+d2), where the width of the display area of the first picture is d1 and the width of the display area of the second picture is d2. The picture weights are determined from these widths, where w1 is the picture weight of the first picture and w2 is the picture weight of the second picture.
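- Since the weight formula itself is not reproduced above, the sketch below assumes the natural choice for a top-bottom split screen: each picture's weight is its display-area width as a fraction of the total width (the two areas share the same screen length d0, so the width ratio equals the area ratio). This is an assumption for illustration, not necessarily the exact formula of the embodiment.

```python
def split_screen_weights(d1: float, d2: float) -> tuple[float, float]:
    """Assumed picture weights for a top-bottom split screen:
    proportional to each display area's width d1/d2 (both areas share
    the same screen length d0, so width ratio equals area ratio)."""
    total = d1 + d2
    return d1 / total, d2 / total

w1, w2 = split_screen_weights(4.0, 4.0)  # equal areas, as in FIG. 5 (C)
print(w1, w2)  # 0.5 0.5 -> loudness of the two pictures is about equal
```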
- S107: The electronic device filters the audio based on the DSB algorithm.
- The electronic device will use the field-of-view information of the pictures obtained in S105 to filter the audio collected by the audio collection device in S103, to obtain the audio signals corresponding to the two pictures.
- This process may be implemented by using an algorithm such as a blind source separation algorithm or a beamforming algorithm, which is not limited in this embodiment of the present application.
- As mentioned above, the M microphones will collect M channels of audio signals. Based on the FFT algorithm, the electronic device can convert the M channels of audio signals from time-domain signals to frequency-domain signals, and then filter the M channels of audio signals according to the front/rear information and focal lengths (i.e., the field-of-view information) of the two cameras in the dual-view video. The calculation formula used for filtering is: y(ω) = Σ_{i=1}^{M} w_i(ω)·x_i(ω), where y(ω) is the filtered beam.
- x i ( ⁇ ) represents the audio signal collected by the i (i ⁇ M) microphone in the electronic device
- w i ( ⁇ ) can pass the DSB algorithm and the CVX wave training method etc., which represents the weight vector of the beamformer for the i-th microphone when the frequency of the audio signal is ⁇ . It should be understood that no matter what algorithm implements audio filtering, w i ( ⁇ ) is a necessary parameter in the algorithm that is strongly related to the filtering direction.
- w i ( ⁇ ) is obtained based on the DSB algorithm.
- the input of the DSB algorithm includes the distance between the i-th microphone and other (M-1) microphones, the front/rear information of the camera, and the focal length. Therefore, when wi ( ⁇ ) is used for filtering, the audio signal collected by the i-th microphone can be enhanced to a certain extent in a specific direction, which is roughly determined by the front/rear information and focal length of the camera.
- the range and direction of the corresponding field of view, and the range and direction of the field of view determine the picture content presented by the viewfinder. In this way, the sense of orientation of the picture can be synchronized with the sense of orientation of the auditory sense.
- In step S105, the field-of-view information of the two pictures has been obtained: the viewing angle of the first picture is +180°, and the viewing angle of the second picture is -90°. Taking this field-of-view information as input, the electronic device can calculate, through the above-mentioned DSB algorithm, the weight vectors required for filtering for the first picture and the second picture, and filter the collected audio according to these two weight vectors to obtain the beams y1(ω) and y2(ω) corresponding to the first picture and the second picture respectively.
- beam y1( ⁇ ) is audio signal 1 in FIG. 8A
- beam y2( ⁇ ) is audio signal 2 in FIG. 8A . It should be noted that neither the beam y1( ⁇ ) nor the beam y2( ⁇ ) obtained here distinguishes the left channel from the right channel.
- S108: The electronic device remixes the sound sources.
- After obtaining the beams and picture weights corresponding to the two pictures, the electronic device will combine the picture weights of the two pictures to mix the two beams.
- For example, the electronic device can mix the above-mentioned beam y1(ω) and beam y2(ω) into two channels: outl(ω) = outr(ω) = w1·y1(ω) + w2·y2(ω). Although outl(ω) and outr(ω) are nominally distinguished as the left-channel audio and the right-channel audio in the formula, these two audio data are actually the same, so the actual sense of hearing is the same during playback.
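- A minimal sketch of the S108 remix under the formula above, followed by the IFFT of S110 that returns the signals to the time domain; the beams here are random stand-ins, and equal picture weights are assumed.

```python
import numpy as np

def remix_mono(y1: np.ndarray, y2: np.ndarray,
               w1: float, w2: float) -> tuple[np.ndarray, np.ndarray]:
    """S108 remix: outl(w) = outr(w) = w1*y1(w) + w2*y2(w).
    Both channels are identical in the FIG. 9 flow."""
    out = w1 * y1 + w2 * y2
    return out, out

# Stand-in beams for one 1024-point frame (see the DSB sketch above).
y1 = np.fft.rfft(np.random.randn(1024))
y2 = np.fft.rfft(np.random.randn(1024))
outl_w, outr_w = remix_mono(y1, y2, w1=0.5, w2=0.5)

# S110: back to the time domain for saving/playback (IFFT).
outl_t = np.fft.irfft(outl_w, n=1024)
outr_t = np.fft.irfft(outr_w, n=1024)
```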
- S109: The electronic device determines whether the interface changes.
- The electronic device detects in real time whether the recording interface changes, and may change the recording interface in response to a detected user operation. It should be understood that when the recording interface changes, the picture weights, front/rear information, and focal length information of the two pictures in the dual-view recording mode may all change.
- When the recording interface of the electronic device changes, if the electronic device has not terminated or ended the dual-view recording mode at that time, the electronic device will re-execute steps S103 to S104, update in time the parameters involved in steps S105-S108 according to the changed recording interface, and filter and remix the audio according to the viewing angles and areas of the two pictures in the updated recording interface.
- For the ways in which the above-mentioned recording interface changes, reference may be made to the changes shown in the relevant user interfaces of FIG. 5 and FIG. 6, and to the scenarios 1 to 6 of setting the user interface style in step S102, which will not be repeated here.
- S110: The electronic device saves the processed audio.
- the electronic device may stop or close the dual-view recording mode in response to the user's operation.
- the electronic device detects a user operation indicating to stop recording video, such as another click operation on the control 704 shown in (C) in FIG. 5 , and the electronic device stops collecting and processing audio.
- the user operation may also be other types of user operations such as voice instructions.
- Based on the IFFT algorithm, the electronic device can convert the audio signals outl(ω) and outr(ω) obtained in step S108 into time-domain signals outl(t) and outr(t), and save them together with the recorded video in the local memory.
- outl(t) and outr(t) can be output through two speakers on the electronic device respectively.
- Since the two audio signals are essentially the same, the audio heard by the user's left ear and right ear will hardly differ.
- In addition, if the sizes of the two display areas are almost the same or equal (as shown in (C) in FIG. 5), the user will feel that the loudness of the sound in the two pictures is approximately equal when listening to the audio.
- In other embodiments, the electronic device may first save the recorded video file and the initial audio in the memory. Afterwards, even if the recording interface changes, the electronic device can first save the initial audio recorded under each interface. It should be understood that the audio obtained at this time has not undergone the processing of steps S105-S108. After the entire recording process ends, the electronic device will combine the above video file to obtain the moments when the recording interface changed and the field-of-view information of each recording interface, perform the processing shown in steps S105 to S108 on the above initial audio, and finally obtain the target audio for output; optionally, the target audio and the above video file are synthesized into a recorded file and saved for subsequent playback by the user. Optionally, after the electronic device saves the target audio, the above-mentioned initial audio can be deleted to save storage space, or both can be saved for subsequent use by the user.
- It should be noted that the audio processing method provided by this embodiment is also applicable to the recording interfaces shown in FIG. 5, FIG. 6, FIG. 7B and FIG. 7C, and to other recording interfaces, where the recorded audio is processed in the same way. For example, the recording interface 91 is a dual-view recording interface in picture-in-picture mode; the electronic device can likewise obtain the front/rear information, focal length information and picture weights of its two pictures, and use the method of this embodiment to filter and remix the audio recorded under the recording interface 91. No more examples are given here.
- FIG. 10 is a flowchart of another audio processing method provided by the embodiment of the present application.
- This method can virtualize the orientation of the audio in the picture-in-picture dual-view recording mode, and can further enhance the stereoscopic effect of the audio in conjunction with the sense of orientation between the two pictures. First, the sound source azimuth virtualization in this method is briefly explained.
- the larger image among the two images can be called the main image, and the smaller image can be called the sub image.
- In the picture-in-picture mode, the position of the sub-picture relative to the main picture visually has a left-right sense of deviation, for example, the sub-picture is located on the left side or the right side of the main picture. In this case, the azimuth virtualization technology in this embodiment can be used. It should be noted that the azimuth virtualization technology is only applicable to recording interfaces where the relative positions of the two pictures have such a left-right deviation. That is to say, when recording with the top-bottom split-screen recording interface in the dual-view recording mode, if the audio is processed using the method of this embodiment, the final audio obtained is essentially no different from the audio obtained by the processing in the aforementioned FIG. 9. Therefore, the method flow of this embodiment will be described in conjunction with the recording interface 91 shown in FIG. 7B.
- the method provided in the embodiment of this application may include:
- S201: The electronic device starts the dual-view video recording mode.
- S202: The electronic device displays a corresponding recording interface according to adjustments made by the user to the recording interface.
- S203: The electronic device collects audio.
- S204: The electronic device records images.
- For steps S201-S204, reference may be made to the description of steps S101-S104 in the above embodiment corresponding to FIG. 9, which will not be repeated here.
- S205: The electronic device acquires the fields of view of the pictures.
- the user is using a front camera with a focal length of 6 ⁇ and a rear camera with a focal length of 2 ⁇ to perform dual-view video recording in picture-in-picture mode.
- The picture presented by the viewfinder frame 911 (hereinafter referred to as the main picture) is the picture taken by the rear camera with a focal length multiple of 2×, which is the scenery in front of the electronic device 100; the picture presented by the viewfinder frame 912 (hereinafter referred to as the sub-picture) is the picture taken by the front camera with a focal length multiple of 6×, which is the user's own face. It can be known from the above Table 1 that, at this time, the viewing angle corresponding to the main picture is -120°, and the viewing angle corresponding to the sub-picture is +30°.
- S206: The electronic device calculates the picture weights.
- Under the picture-in-picture interface, the electronic device can calculate the picture weights wm and ws of the two pictures from their display areas, where ws is the picture weight of the sub-picture, wm is the picture weight of the main picture, and ε is a correction coefficient, a fixed value set when the electronic device leaves the factory, whose value range is [1, (Dl×Dw)/(dl×dw)]; in this way, the picture weight of the viewfinder frame with the smaller area can be prevented from being too small when the areas of the two pictures differ greatly.
- S207: The electronic device calculates the orientation information of the sub-picture.
- the recording interface 91 shown in FIG. 7B is taken as an example for description.
- the center point O of the main picture is taken as the origin, and a plane Cartesian coordinate system is established on the plane where the mobile phone screen is located.
- Let the horizontal and vertical coordinates of the center point F of the sub-picture be (a, b). It should be noted that a and b are signed values: the abscissa of a point on the left side of the Y axis is negative and that of a point on the right side is positive; the ordinate of a point on the lower side of the X axis is negative and that of a point on the upper side is positive.
- The length of the main picture is Dl and its width is Dw, in the same units as the coordinate axes.
- The orientation of the sub-picture relative to the main picture can then be represented by an azimuth angle z and an elevation angle e, both calculated from the coordinates (a, b) and the dimensions of the main picture.
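- Because the exact z/e formulas are not reproduced above, the sketch below assumes a simple linear mapping from the normalized offset of the sub-picture center F = (a, b) to an azimuth and an elevation; the parameters max_az and max_el are hypothetical values introduced only for illustration.

```python
def sub_picture_orientation(a: float, b: float, Dl: float, Dw: float,
                            max_az: float = 80.0,
                            max_el: float = 45.0) -> tuple[float, float]:
    """Assumed mapping from the sub-picture center F = (a, b) to an
    azimuth z and elevation e, relative to the main-picture center O.

    a, b: signed coordinates of F (left/down negative, right/up positive).
    Dl, Dw: length and width of the main picture (same units as a, b).
    max_az, max_el: assumed angles when F sits at the picture edge.
    """
    z = (a / (Dl / 2.0)) * max_az   # left of center -> negative azimuth
    e = (b / (Dw / 2.0)) * max_el   # below center  -> negative elevation
    return z, e

# Sub-picture centered in the upper-left area of the main picture:
print(sub_picture_orientation(a=-3.0, b=2.0, Dl=12.0, Dw=8.0))  # (-40.0, 22.5)
```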
- S208: The electronic device filters the audio based on the DSB algorithm.
- In step S208, for the specific manner in which the electronic device collects the audio and converts it to the frequency domain, reference may be made to the related description of step S107 in FIG. 9 above, which will not be repeated here.
- After collecting the audio signals in the environment, the electronic device will filter the audio signals according to the field-of-view information of the two pictures respectively, to obtain the audio signals corresponding to the two pictures.
- The recording interface 91 shown in FIG. 7B is again taken as an example for illustration: the viewing angle corresponding to the main picture is -120°, and the viewing angle corresponding to the sub-picture is +30°.
- Taking this field-of-view information as input, the electronic device can calculate, through the above-mentioned DSB algorithm, the weight vectors required for filtering for the sub-picture and the main picture, and filter the collected audio according to these two weight vectors to obtain the beams ys(ω) and ym(ω) corresponding to the sub-picture and the main picture respectively.
- The beam ys(ω) is the audio signal 3 in the same angular direction as ∠FPF' in FIG. 8A; the beam ym(ω) is the audio signal 4 in the same angular direction as ∠APA' (not shown in FIG. 8A). It should be noted that neither the beam ys(ω) nor the beam ym(ω) obtained here distinguishes the left channel from the right channel.
- S209: The electronic device virtualizes the orientation of the sound source of the sub-picture.
- The sound source orientation virtualization can be realized by adjusting the mixing ratio or by HRTF (head-related transfer function) filtering.
- the embodiment of the present application exemplarily provides the process of virtualizing the azimuth of the audio of the sub-picture in this embodiment by using the HRTF filtering method.
- The database required for the HRTF filtering method may be selected from, for example, the CIPIC HRTF database of the University of California, Davis, or the HRTF database of Peking University.
- the database can also be calculated by HRTF modeling. This embodiment of the present application does not limit it.
- This embodiment adopts the open-source CIPIC database, and selects the CIPIC_HRIR data used for convolution according to the azimuth angle z and the elevation angle e.
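- The sketch below shows the convolution step itself, assuming the left/right HRIRs for the selected (z, e) have already been looked up (the CIPIC index-selection rule is not reproduced here); the toy HRIRs in the example are stand-ins, not CIPIC data.

```python
import numpy as np

def virtualize_sub_picture(ys_t: np.ndarray, hrir_l: np.ndarray,
                           hrir_r: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Render the sub-picture beam at a virtual direction by convolving
    it with the left/right HRIRs chosen for (azimuth z, elevation e).

    ys_t: time-domain sub-picture beam ys(t).
    hrir_l, hrir_r: head-related impulse responses (e.g. from CIPIC).
    Returns (ysl(t), ysr(t)), biased toward the left/right ear.
    """
    ysl = np.convolve(ys_t, hrir_l, mode="full")[: len(ys_t)]
    ysr = np.convolve(ys_t, hrir_r, mode="full")[: len(ys_t)]
    return ysl, ysr

# Stand-ins: a 1 kHz tone as the beam and toy 64-tap HRIRs.
fs = 48000
t = np.arange(fs) / fs
ys = np.sin(2 * np.pi * 1000 * t)
hl = np.zeros(64); hl[0] = 1.0    # toy HRIR: earlier, louder at left ear
hr = np.zeros(64); hr[8] = 0.6    # toy HRIR: delayed, quieter at right ear
ysl, ysr = virtualize_sub_picture(ys, hl, hr)
```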
- In steps S201-S209, the processing of the main-picture audio by the electronic device is the same as that in steps S101-S107 of FIG. 9; after remixing, the output of the main-picture audio on the left channel is actually the same as its output on the right channel.
- In contrast, this embodiment virtualizes the orientation of the audio signal of the sub-picture: the ysl(t) obtained in step S209 is an audio signal biased toward the left channel, and the ysr(t) is an audio signal biased toward the right channel; after remixing, the output of the sub-picture audio on the left channel is different from its output on the right channel.
- S210: The electronic device remixes the sound sources.
- After obtaining the beams and picture weights corresponding to the two pictures, the electronic device will combine the picture weights of the two pictures to mix the two beams.
- outl(t) and outr(t) can be respectively output through two speakers on the electronic device.
- Since in the recording interface 91 shown in FIG. 7B the sub-picture is located on the left side of the main picture, when outl(ω) and outr(ω) are played, the user will feel that the sound of the sub-picture comes from the user's left side. In addition, since the area of the sub-picture is smaller than that of the main picture, the user will feel that the sound in the main picture is louder than the sound in the sub-picture when listening to the audio.
- S211: The electronic device determines whether the interface changes.
- S212: The electronic device saves the processed audio.
- For steps S211-S212, reference may be made to the description of steps S109-S110 in the above embodiment corresponding to FIG. 9, which will not be repeated here.
- It should be noted that the audio processing method provided by this embodiment is also applicable to the recording interfaces shown in FIG. 5, FIG. 6, FIG. 7A and FIG. 7C, and to other recording interfaces. For any of these interfaces, the electronic device can likewise obtain the front/rear information, focal length information and picture weights of the two pictures, and use the audio processing method provided by this embodiment to filter and fuse the audio recorded under that interface. No more examples are given here.
- In particular, if in step S211 the electronic device detects that the recording interface has changed to a top-bottom split-screen interface, the relative positions of the two display areas no longer differ left-to-right, and the azimuth virtualization of step S209 is not needed. That is to say, when using the method shown in FIG. 10 to process the audio collected under a top-bottom split-screen recording interface, the specific processing flow can be changed to the processing flow shown in FIG. 9.
- FIG. 11 is a flowchart of another audio processing method provided by the embodiment of the present application.
- This method filters the collected audio signals based on the CVX beam training method, and can use the field-of-view information of the two pictures to select the filtering directions of the audio signals more specifically, so that the beams obtained for each picture can differ between the left channel and the right channel, further enhancing the stereoscopic effect of the audio.
- the method provided in the embodiment of this application may include:
- S301: The electronic device starts the dual-view recording mode.
- S302: The electronic device displays a corresponding recording interface according to the user's adjustments to the recording interface.
- S303: The electronic device collects audio.
- S304: The electronic device records images.
- S305: The electronic device acquires the fields of view of the pictures.
- In the recording interface 90 shown in FIG. 7A, the electronic device is using a front camera with a fixed focal length of 1× and a rear camera with a focal length of 3× to perform dual-view video recording in the top-bottom split-screen mode.
- the picture presented by the upper viewfinder frame 901 (hereinafter referred to as the first picture) is the picture taken by the front camera, which is the user's own face;
- the picture presented by the lower viewfinder frame 902 (hereinafter referred to as the second picture) is a picture captured by the rear camera, which is a landscape image in front of the electronic device 100 . It can be known from the above Table 1 that the viewing angle of the first picture is +180° at this time, and the viewing angle of the second picture is -90°.
- Different from the foregoing embodiments, in this method two beams with a left-right channel difference can be obtained for each picture. After the electronic device obtains the fields of view of the first picture and the second picture, the fields of view will be divided.
- As shown in FIG. 8B, ∠OPO' (on the left side) is the viewing angle of the first picture, and ∠BPB' is the viewing angle of the second picture.
- The normal QQ' of the plane where the electronic device 100 is located is the angle bisector of ∠OPO' (left side) and ∠BPB'. The normal QQ' divides ∠OPO' (left side) into ∠OPQ' (hereinafter referred to as left field of view 1) and ∠O'PQ' (hereinafter referred to as right field of view 1), and divides ∠BPB' into ∠BPQ (hereinafter referred to as left field of view 2) and ∠B'PQ (hereinafter referred to as right field of view 2). That is, the viewing angle of the first picture is divided into left field of view 1 and right field of view 1, and the viewing angle of the second picture is divided into left field of view 2 and right field of view 2.
- S306: The electronic device calculates the picture weights.
- For step S306, reference may be made to the description of step S106 in the embodiment corresponding to FIG. 9 above: the picture weights w1 and w2 of the two pictures are calculated in the same way, where w1 is the picture weight of the first picture and w2 is the picture weight of the second picture.
- S307: The electronic device filters the audio based on the CVX beam training method.
- the beamforming algorithm is taken as an example to further illustrate the process of sound source separation.
- As in step S107 of FIG. 9, the calculation formula used for filtering is: y(ω) = Σ_{i=1}^{M} w_i(ω)·x_i(ω).
- x i ( ⁇ ) represents the audio signal collected by the i (i ⁇ M) microphone in the electronic device
- w i ( ⁇ ) is obtained by the CVX beam training method, which represents The weight vector of the beamformer for the i-th microphone at the frequency of the audio signal ⁇ . It should be understood that no matter what algorithm implements audio filtering, w i ( ⁇ ) is a necessary parameter in the algorithm that is strongly related to the filtering direction.
- w i ( ⁇ ) is obtained based on the CVX beam training method.
- its input includes the distance between the i-th microphone and other (M ⁇ 1) microphones, and the angle of field of view (ie, the filtering direction).
- the input filtering direction can be changed flexibly.
- In step S305, the field-of-view information of the two pictures has been calculated. The field of view of each picture can be divided into left and right parts, which are used as inputs of the method to obtain two different weight vectors w_il(ω) and w_ir(ω); with them, two beams corresponding to the picture can be obtained, and these two beams differ between the left channel and the right channel.
- After collecting the audio signals in the environment, the electronic device will filter the audio signals according to the field-of-view information of the two pictures respectively, to obtain the audio signals corresponding to the two pictures.
- the recording interface 90 shown in FIG. 7A is used for description. It can be seen from the foregoing description of step S305 that the recording interface 90 shown in FIG. 7A is in a top-bottom split-screen mode.
- The viewing angles of the two pictures in the recording interface 90 are divided into left field of view 1 and right field of view 1, and left field of view 2 and right field of view 2.
- Specifically, the electronic device can use left field of view 1 and right field of view 1 as inputs to calculate the weight vectors required for filtering for the first picture, and filter the collected audio according to these two weight vectors to obtain the left-channel beam and the right-channel beam corresponding to the first picture.
- Similarly, the electronic device can use left field of view 2 and right field of view 2 as inputs to calculate the weight vectors required for filtering for the second picture, and filter the collected audio according to these two weight vectors to obtain the left-channel beam and the right-channel beam corresponding to the second picture.
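- The following sketch approximates the per-channel filtering: it splits a picture's viewing angle about the device normal QQ' and steers one beam at the center of each half. Note that it uses simple delay-and-sum steering as a stand-in for the CVX-trained weights w_il(ω) and w_ir(ω); a real CVX design would instead solve a convex optimization over the same inputs. The mic layout, frame size, and the left/right orientation labels are assumptions.

```python
import numpy as np

C = 343.0  # speed of sound, m/s

def halve_viewing_angle(front: bool, fov_deg: float) -> tuple[float, float]:
    """Split a picture's viewing angle about the device normal QQ' and
    return the center directions of the left and right halves.
    +90 = front normal, -90 = rear normal (left/right labels assumed)."""
    center = 90.0 if front else -90.0
    q = abs(fov_deg) / 4.0            # center of each half-sector
    return center + q, center - q     # (left half, right half)

def steer(mic_pos: np.ndarray, deg: float, freqs: np.ndarray) -> np.ndarray:
    """Delay-and-sum weights toward one direction, a simplified
    stand-in for the CVX-trained weights w_il(w), w_ir(w)."""
    th = np.deg2rad(deg)
    tau = mic_pos @ np.array([np.cos(th), np.sin(th)]) / C
    return np.exp(-1j * np.outer(tau, 2 * np.pi * freqs)) / len(mic_pos)

fs, n = 48000, 1024
mics = np.array([[0.0, 0.0], [0.02, 0.0], [0.04, 0.0]])  # assumed layout
X = np.fft.rfft(np.random.randn(3, n), axis=1)           # stand-in spectra
f = np.fft.rfftfreq(n, 1 / fs)

# Picture 2 of FIG. 7A: rear 3x, viewing angle 90° wide.
left_dir, right_dir = halve_viewing_angle(front=False, fov_deg=-90)
y2_left = np.sum(steer(mics, left_dir, f) * X, axis=0)    # left-channel beam
y2_right = np.sum(steer(mics, right_dir, f) * X, axis=0)  # right-channel beam
```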
- S308: The electronic device remixes the sound sources.
- After obtaining the beams and picture weights corresponding to the two pictures, the electronic device will combine the picture weights of the two pictures to mix the beams.
- outl(t) and outr(t) can be output through two speakers on the electronic device respectively.
- When the audio signals are filtered, the left and right directions are distinguished; therefore, the difference between the audio heard by the user's left ear and right ear may be more obvious.
- If the areas of the two display areas are almost the same or equal (as shown in (C) in FIG. 5), the user will feel that the loudness of the sound in the two pictures is approximately equal when listening to the audio. It is not difficult to see that, in this embodiment, when outl(ω) and outr(ω) are output, the user can better perceive the difference between the left and right audio channels, which makes the sound more stereoscopic.
- S309: The electronic device determines whether the interface changes.
- S310: The electronic device saves the processed audio.
- For steps S309-S310, reference may be made to the description of steps S109-S110 in the embodiment corresponding to FIG. 9 above, which will not be repeated here.
- In the picture-in-picture dual-view recording mode of the recording interface 91 shown in FIG. 7B, the electronic device can also obtain the front/rear information, focal length information and picture weights of the two pictures under the recording interface (for details, refer to the description of steps S205-S207 in FIG. 10), and then filter and fuse the audio in the manner of steps S305-S308. No more examples are given here.
- FIG. 12 is a flow chart of another audio processing method provided by the embodiment of the present application.
- This method can virtualize the orientation of the audio in the picture-in-picture dual-view recording mode, and can further enhance the stereoscopic effect of the audio in conjunction with the sense of orientation between the two pictures.
- the method provided in the embodiment of this application may include:
- S401: The electronic device starts the dual-view video recording mode.
- S402: The electronic device displays a corresponding recording interface according to the user's adjustments to the recording interface.
- S403: The electronic device collects audio.
- S404: The electronic device records images.
- For steps S401-S404, reference may be made to the description of steps S101-S104 in the above embodiment corresponding to FIG. 9, which will not be repeated here.
- S405: The electronic device acquires the fields of view of the pictures.
- the user is using a front camera with a focal length of 6 ⁇ and a rear camera with a focal length of 2 ⁇ to perform dual-view video recording in picture-in-picture mode.
- The picture presented by the viewfinder frame 911 (hereinafter referred to as the main picture) is the picture taken by the rear camera with a focal length multiple of 2×, which is the scenery in front of the electronic device 100; the picture presented by the viewfinder frame 912 (hereinafter referred to as the sub-picture) is the picture taken by the front camera with a focal length multiple of 6×, which is the user's own face. It can be known from the above Table 1 that, at this time, the viewing angle corresponding to the main picture is -120°, and the viewing angle corresponding to the sub-picture is +30°.
- Different from the foregoing embodiments, in this method two beams with a left-right channel difference can be obtained for the main picture. After the electronic device obtains the fields of view of the main picture and the sub-picture, the field of view of the main picture will be divided.
- As shown in FIG. 8A, ∠FPF' is the field of view of the above-mentioned sub-picture (hereinafter referred to as field angle 3), and ∠APA' is the field of view of the above-mentioned main picture (hereinafter referred to as field angle 4).
- The normal QQ' of the plane where the electronic device 100 is located is the angle bisector of ∠APA', and the normal QQ' divides ∠APA' into ∠APQ (hereinafter referred to as left field of view 4) and ∠A'PQ (hereinafter referred to as right field of view 4).
- S406: The electronic device calculates the picture weights.
- S407: The electronic device calculates the orientation information of the sub-picture.
- For steps S406-S407, reference may be made to the description of steps S206-S207 in the above embodiment corresponding to FIG. 10, which will not be repeated here.
- The electronic device can calculate the picture weights wm and ws of the two pictures from their display areas, where ws is the picture weight of the sub-picture, wm is the picture weight of the main picture, and ε is a correction coefficient, a fixed value set when the electronic device leaves the factory, whose value range is [1, (Dl×Dw)/(dl×dw)]; in this way, the picture weight of the viewfinder frame with the smaller area can be prevented from being too small when the areas of the two pictures differ greatly.
- The electronic device can then calculate the azimuth angle z and the elevation angle e of the sub-picture relative to the main picture.
- S408: The electronic device filters the audio based on the CVX beam training method.
- In step S408, for the specific manner in which the electronic device collects the audio and converts it to the frequency domain, reference may be made to the related description of step S107 in FIG. 9 above, which will not be repeated here.
- After collecting the audio signals in the environment, the electronic device will filter the audio signals according to the field-of-view information of the two pictures respectively, to obtain the audio signals corresponding to the two pictures.
- It can be seen from step S405 that, in the recording interface 91 shown in FIG. 7B, the viewing angle of the sub-picture is field angle 3, and field angle 4 of the main picture is divided into left field of view 4 and right field of view 4.
- The electronic device uses field angle 3 as input to calculate the weight vector required for filtering the sub-picture, and filters the collected audio according to this weight vector to obtain the beam corresponding to the sub-picture.
- The electronic device can use left field of view 4 and right field of view 4 as inputs to calculate the weight vectors required for filtering the main picture, and filter the collected audio according to these two weight vectors to obtain the left-channel beam and the right-channel beam corresponding to the main picture.
- S409: The electronic device virtualizes the orientation of the sound source of the sub-picture.
- As in step S209 of FIG. 10, this embodiment adopts the open-source CIPIC database, and selects the CIPIC_HRIR data used for convolution according to the azimuth angle z and the elevation angle e.
- S410: The electronic device remixes the sound sources.
- After obtaining the beams and picture weights corresponding to the two pictures, the electronic device will combine the picture weights of the two pictures to mix the beams.
- outl(t) and outr(t) can be respectively output through two speakers on the electronic device.
- Since in the recording interface 91 shown in FIG. 7B the sub-picture is located on the left side of the main picture, when outl(ω) and outr(ω) are played, the user will feel that the sound of the sub-picture comes from the user's left side. Moreover, when the audio signals are filtered, the left and right directions are distinguished; therefore, the difference between the audio heard by the user's left ear and right ear may be more obvious, and the stereoscopic effect of the audio is stronger. In addition, since the area of the sub-picture is smaller than that of the main picture, the user will feel that the sound in the main picture is louder than the sound in the sub-picture when listening to the audio.
- S411: The electronic device determines whether the interface changes.
- S412: The electronic device saves the processed audio.
- For steps S411-S412, reference may be made to the description of steps S109-S110 in the above embodiment corresponding to FIG. 9, which will not be repeated here.
- It should be noted that the audio processing method provided by this embodiment is also applicable to the recording interfaces shown in FIG. 5, FIG. 6, FIG. 7A and FIG. 7C, and to other recording interfaces. For any of these interfaces, the electronic device can likewise obtain the front/rear information, focal length information and picture weights of the two pictures, and use the audio processing method provided by this embodiment to filter and fuse the audio recorded under that interface. No more examples are given here.
- In particular, if in step S411 the electronic device detects that the recording interface has changed and the changed recording interface is no longer a picture-in-picture recording interface, for example, it has changed to a top-bottom split-screen dual-view recording interface, then since the relative positions of the two display areas in the top-bottom split-screen interface do not differ left-to-right, the electronic device does not need to perform the azimuth virtualization of step S409 during subsequent audio processing. That is to say, when using the method shown in FIG. 12 to process the audio collected under a top-bottom split-screen recording interface, the specific processing flow can be changed to the processing flow shown in FIG. 11.
- In a possible implementation, the processor in the electronic device may execute the audio processing method shown in FIG. 9/FIG. 10/FIG. 11/FIG. 12 when the audio and video recording starts. That is, after responding to the user operation on the control 704 in the user interface shown in FIG. 5, the processor in the electronic device executes the audio processing method shown in FIG. 9/FIG. 10/FIG. 11/FIG. 12 by default.
- In another possible implementation, the processor in the electronic device may also execute the audio processing method shown in FIG. 9/FIG. 10/FIG. 11/FIG. 12 when it finishes recording the audio and video and stores the recorded audio and video signals into the memory; this can reduce processor occupancy during the audio recording process and improve the fluency of audio recording.
- the audio processing method shown in FIG. 9/FIG. 10/FIG. 11/FIG. 12 is only executed on the audio signal when the recorded audio signal needs to be saved, thereby saving processor resources.
- In addition, the electronic device can perform smoothing processing based on user operations such as adjusting the focal length of the camera and zooming the area of the viewfinder frame (including adjusting the position of the dividing line between the two viewfinder frames in the top-bottom split-screen mode, and zooming the sub-viewfinder frame in the picture-in-picture mode).
- When the user switches the camera corresponding to a viewfinder frame, the electronic device often requires a certain processing time. For example, when the user switches the camera corresponding to a viewfinder frame from the front camera to the rear camera, or switches the pictures between the two viewfinder frames, the user often feels that the picture changes abruptly.
- the embodiment of the present application provides a method for smoothly switching audio.
- the application scene of this method is not limited to the dual-view camera mode, and may also be a single-view camera (ordinary camera) mode.
- the method may include:
- the electronic device switches the camera in the viewfinder frame from the historical camera to the target camera.
- The specific scenarios of switching from the historical camera to the target camera include but are not limited to: switching the camera corresponding to a viewfinder frame from the front camera to the rear camera, or switching the pictures between the two viewfinder frames.
- For example, the electronic device may detect a click operation on the control 901B as shown in FIG. 7A, and switch the camera corresponding to the viewfinder frame 901 from the front camera to the rear camera in response to the operation. For the switched user interface, reference may be made to the recording interface 92 shown in FIG. 7C.
- Alternatively, the electronic device may detect a click operation on the screen switching control 708, and exchange the framing pictures of the two viewfinder frames in response to the operation.
- the electronic device acquires a historical audio signal and a target audio signal.
- The historical audio signal ya(ω) is the audio signal obtained by the electronic device by filtering the audio based on the picture (field of view) of the historical camera. Specifically, it is the audio signal obtained by filtering based on the picture at the moment before the electronic device detects the camera-switching operation, for example, the audio signal at the moment when the user clicks a camera switching control (such as 911B, 912B, or the screen switching control 708) or double-clicks the sub-viewfinder frame 802.
- the target audio signal is an audio signal yb( ⁇ ) obtained by filtering the audio based on the picture (field of view) of the target camera by the electronic device.
- For example, assume the camera corresponding to the viewfinder frame 901 in FIG. 7A is switched from the front camera to the rear camera, and the switched user interface is the recording interface 92 shown in FIG. 7C.
- the historical signal ya( ⁇ ) is the audio signal 1 shown in FIG. 8A
- the target signal yb( ⁇ ) is the audio signal 3 shown in FIG. 8A.
- the electronic device dynamically adjusts the mixing ratio of the historical audio signal and the target audio signal according to the duration of switching from the historical camera to the target camera.
- Let α represent the proportion of the target signal yb(ω) in the mix; the proportion of the historical signal ya(ω) is then (1-α). α is adjusted dynamically, increasing frame by frame over the duration of the switch, as described below.
- T is the time required for the electronic device to switch from the historical camera to the target camera, in ms; its specific value is determined by the performance of the electronic device, for example, the time it takes to switch from the front camera to the rear camera.
- T1 is the frame length of the audio processing by the electronic device, which means that when the electronic device collects or processes the audio signal, it processes The frame count of the audio signal; for different electronic equipment, T and T 1 are related to the performance of the electronic equipment, and different electronic equipment may have different T and T 1 , but for fixed electronic equipment, T and T 1 are fixed values.
- t is the frame count, and the value range of t is [0, T/T 1 -1]. After the electronic device triggers the action of switching the historical camera to the target camera, the first frame of t is recorded as 0, and then accumulated until T/T 1 -1.
- The electronic device remixes the historical audio signal and the target audio signal according to the mixing ratio.
- During the switch from the historical camera to the target camera, the electronic device remixes the historical audio signal and the target audio signal using the mixing ratio computed for each frame, obtaining the audio used for subsequent operations.
- The remixing can be expressed as:
- yc(ω) = β × ya(ω) + (1-β) × yb(ω)
- where yc(ω) is the remixed audio signal.
- As the switch proceeds, the proportion of the target signal yb(ω) in the mix grows larger and larger, while that of the historical signal ya(ω) grows smaller and smaller. Once the camera switch is complete, the proportion of the target signal yb(ω) is 1 and that of the historical signal ya(ω) is 0.
- In this way, the audio is switched smoothly in step with the change of the picture, so that the user perceives the direction of the sound as changing gradually along with the picture.
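- As a concrete illustration of this crossfade, the following minimal Python sketch blends the two filtered signals frame by frame. It is illustrative only, not the device's actual implementation: the switch duration T, the frame length T1, the two pre-filtered signals, and the function name are all assumptions of this sketch.

import numpy as np

def crossfade_switch(ya_frames, yb_frames, T_ms=500.0, T1_ms=10.0):
    # ya_frames, yb_frames: arrays of shape (num_frames, frame_len) holding
    # the audio filtered with the historical and target fields of view.
    # T_ms: assumed camera-switch duration; T1_ms: assumed audio frame length.
    n = int(T_ms / T1_ms)                          # frames the switch spans
    out = []
    for t, (ya, yb) in enumerate(zip(ya_frames, yb_frames)):
        beta = (n - t) / n if t < n else 0.0       # beta = (T/T1 - t)/(T/T1)
        out.append(beta * ya + (1.0 - beta) * yb)  # yc = beta*ya + (1-beta)*yb
    return np.asarray(out)

- For example, with T = 500 ms and T1 = 10 ms the blend runs over 50 frames, after which only the target signal remains.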
- The embodiment of the present application also provides an electronic device. The electronic device includes one or more processors and a memory;
- the memory is coupled with the one or more processors and is used to store computer program code; the computer program code includes computer instructions, and the one or more processors invoke the computer instructions to cause the electronic device to execute the methods shown in the foregoing embodiments.
- As used in the above embodiments, the term "when" may be interpreted to mean "if", "after", "in response to determining..." or "in response to detecting...".
- Similarly, the phrases "when determining..." or "if (a stated condition or event) is detected" may be interpreted to mean "if it is determined...", "in response to determining...", "on detecting (a stated condition or event)" or "in response to detecting (a stated condition or event)".
- The above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof.
- When implemented using software, they may be implemented in whole or in part in the form of a computer program product.
- The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are generated in whole or in part.
- The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus.
- The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, radio, microwave) means.
- The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center integrating one or more available media.
- The available media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., DVD), or semiconductor media (e.g., solid-state drive), etc.
- A person of ordinary skill in the art can understand that all or part of the processes in the above method embodiments can be completed by a computer program instructing the related hardware.
- The program can be stored in a computer-readable storage medium.
- When the program is executed, it may include the processes of the foregoing method embodiments.
- The aforementioned storage medium includes various media that can store program code, such as ROM, random access memory (RAM), magnetic disks, or optical disks.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Social Psychology (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Computer Networks & Wireless Communication (AREA)
- Databases & Information Systems (AREA)
- Studio Devices (AREA)
Abstract
一种音频的处理方法及电子设备。在双景录像模式下,该电子设备可以根据两个显示区域中画面的焦距、两个显示区域的相对位置以及两个显示区域的面积大小对采集的音频进行过滤和方位虚拟和重混,使画面和声音的音画表现力得到同步,使用户对听觉和视觉上有同步的立体感受。
Description
本申请要求于2021年06月16日提交中国专利局、申请号为202110667735.X、申请名称为“音频的处理方法及电子设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
本申请涉及终端技术领域,尤其涉及音频的处理方法及电子设备。
如今视频的拍摄和制作已经成为我们日常生活和娱乐中不可或缺的一环。在视频的拍摄以及播放场景中,用户对视频画面的观感以及视频音频的听感的要求日渐提高。现如今,在视觉方面,立体视觉技术已经趋于成熟,可以令视频的画面具有立体感。但是,视频的听感却无法配合视觉的观感去营造出同步的空间感受,影响了用户的体验感。
因此,如何配合视频画面的立体感为用户营造音频的立体感,是亟待解决的问题。
发明内容
本申请的目的在于提供一种音频的处理方法、图形用户界面(graphic user interface,GUI)及电子设备。电子设备可根据视频中图像界面信息对视频的音频进行渲染,使视频的音画表现力得到同步。在视频的图像界面发生变化时,视频的音频也会相应的根据图像界面进行调整。这样,视频的画面和声音可以同步地为用户带来立体感受,提供更好的用户体验感。
上述目标和其他目标将通过独立权利要求中的特征来达成。进一步的实现方式在从属权利要求、说明书和附图中体现。
第一方面,提供一种音频的处理方法,所述方法包括:显示第一界面,所述第一界面包括第一控件;检测到对所述第一控件的第一操作;响应于所述第一操作,在第一时刻T1开始拍摄,显示第二界面,所述第二界面包括第一显示区域和第二显示区域;经过第一时长t1,在第二时刻T2,电子设备在第一显示区域显示第一摄像头实时采集的第一画面,在第二显示区域显示第二摄像头实时采集的第二画面;在所述第二时刻T2,麦克风采集第一声音,所述第一声音为在所述第一时刻所述电子设备所处的实时环境的声音;检测到对第三控件的第二操作;响应于所述第二操作,停止拍摄,保存第一视频,所述第一视频包括所述第一画面和所述第二画面;显示第三界面,所述第三界面包括第三控件;检测到对所述第三控件的第三操作,播放所述第一视频;在所述第一视频的所述第一时长处,播放所述第一画面、所述第二画面和第二声音,所述第二声音是根据第一画面和第二画面的画面权重对所述第一声音进行处理后得到的。
实施第一方面提供的方法,可以在双景录像视频的播放过程中,使得用户所听声音和所见画面之间具备同步的立体感。
结合第一方面,在一种可能的实现方式中,所述电子设备以上下分屏的形式显示所述 第一显示区域和所述第二显示区域,所述第一显示区域的面积为第一面积,所述第二显示区域的面积为第二面积;所述第一画面的画面权重为所述第一面积与总面积的比值,所述第二画面的画面权重为所述第二面积与所述总面积的比值,所述总面积为所述第一面积和所述第二面积的面积之和。
结合第一方面,在一种可能的实现方式中,所述第一显示区域以悬浮窗的形式显示于所述第二显示区域上;所述第一显示区域的面积为第一面积,所述电子设备的显示屏的面积为第三面积;所述第一画面的画面权重为所述第一面积与所述第三面积的比值,所述第二画面的画面权重为整数1与所述第一画面的权重的差值。
结合第一方面,在一种可能的实现方式中,所述第一声音包括第一子声音和第二子声音,所述第一子声音为所述第一画面的声音,所述第二子声音为所述第二画面的声音,所述根据第一画面和第二画面的画面权重对所述第一声音进行处理包括:根据第一画面和第二画面的画面权重对所述第一子声音和第二子声音进行混音;在所述第一画面的画面权重大于所述第二画面的权重的情况下,采用第一混音比例,使所述第一子声音的响度大于所述第二子声音的响度;在所述第一画面的画面权重小于所述第二画面的权重的情况下,采用第二混音比例,使所述第一子声音的响度小于所述第二子声音的响度;在所述第一画面的画面权重等于所述第二画面的权重的情况下,采用第三混音比例,使所述第一子声音的响度等于所述第二子声音的响度。
在本实现方式中,用于所述第一音频包含的两个声音可以配合两个显示区域之间的大小,调整混音时的比例,为用户营造出面积大的显示区域中传出来的声音也更大的,面积小的显示区域中传出来的声音也更小的听觉感受。
结合第一方面,在一种可能的实现方式中,所述检测到对第三控件的第二操作之前,所述方法还包括:所述电子设备保存所述第一声音;根据第一画面和第二画面的画面权重对所述第一声音进行处理包括:所述电子设备根据第一画面和第二画面的画面权重对所述第一声音进行处理,得到所述第二声音;所述电子设备保存所述第二声音,并将所述第一声音删除。
在本实现方式中,先对未处理的音频进行保存,在保存后在进行处理,可以在录制音频过程中减少处理器的占用,提高音频录制过程的流畅度。
结合第一方面,在一种可能的实现方式中,所述第一声音包括第一子声音和第二子声音,所述第一子声音为所述第一画面的声音,所述第二子声音为所述第二画面的声音,所述根据第一画面和第二画面的画面权重对所述第一声音进行处理包括:所述电子设备根据所述第一摄像头的第一视场角和所述第二摄像头的第二视场角,分别对所述第一声音进行滤波,得到所述第一子声音和所述第二子声音;所述电子设备根据所述第一画面和所述第二画面的画面权重,调整所述第一子声音和所述第二子声音的响度之后,将所述第一子声音和第二子声音进行混音,得到所述第二声音。
可理解的,在对双景录像模式录制的视频文件进行播放时,所述电子设备将在显示屏中显示两个画面,这两个画面在录制时是来自两个不同的摄像头的。在观看该视频文件时,两个画面的大小和视角的方向和范围可能是不一样的。比如说,所述第一画面中,呈现的是所述电子设备的前置摄像头拍摄的用户的人脸图像;所述第二画面中,呈现的是所述电子设备的后置摄像头拍摄的风景图像。此外,用户可以通过调节所述前置摄像头和所述后置摄像头的焦距倍数,来改变所述人脸图像和所述风景图像在所述第一画面和所述第二画面中所呈现的视角,当焦距倍数变小时,图像在画面中的大小就会按比例缩小,用户在视觉上会感觉自己离画面中事物的距离更远了,但是视角范围变大;当焦距倍数变大时,则图像在画面中的大小就会按比例放大,用户在视觉上会感觉自己离画面中事物的距离更近了,但是视角范围变小。
声音不仅有大小之分,声音也具备方向性,且这种方向性是可以被人类感知的。因此,在本申请实施例中,为了配合两个画面的面积大小和两个画面所呈现给用户的视角范围带给用户的观感,本方法中,通过结合两个画面的视角范围,电子设备对在所述第一画面视角方向上采集的音频进行增强,得到一个和所述第一画面的视角方向对应的声音;并对在所述第二画面的视角方向上采集的音频进行增强,得到一个和所述第二画面的视角方向对应的声音,再根据所述第一显示区域和所述第二显示区域的面积,调整这两个声音融合的比例,得到与所述第一画面对应的第一子声音以及与所述第二画面对应的第二子声音。将所述第一子声音和所述第二子声音混合后,得到最后用于输出的音频(即所述第二声音)。
结合第一方面,在一种可能的实现方式中,所述第一声音包括第一子声音和第二子声音,所述第一子声音为所述第一画面的声音,所述第二子声音为所述第二画面的声音,所述根据第一画面和第二画面的画面权重对所述第一声音进行处理包括:所述电子设备根据所述第一摄像头的第一视场角和所述第二摄像头的第二视场角,分别对所述第一声音进行滤波,得到所述第一子声音和所述第二子声音;所述电子设备获取所述第一显示区域相对于所述第二显示区域的第一方位信息;所述电子设备根据所述第一方位信息对所述第一子声音进行方位虚拟,得到第一左方位声音和第一右方位声音;所述电子设备根据所述第一画面和所述第二画面的画面权重,调整所述第一左方位声音、第一右方位声音以及所述第二子声音的响度之后,将所述第一子声音和第二子声音进行混音,得到所述第二声音。
在实际拍摄时,第一显示区域的画面应为所述电子设备的正前方或者正后方。但是由于在画中画模式下,所述第一显示区域包含在所述第二区域中,且所述第一显示区域的方位是可调整的。因此,第一显示区域在第二显示区域中的位置在视觉上可能为偏左和偏右。在本实现方式中,所述电子设备将对所述第一方位信息对所述第一方位声音进行方位虚拟,使用户对所述第一子声音的感知方向能和所述第一显示区域的方位相契合。
结合第一方面,在一种可能的实现方式中,所述第一声音包括第一子声音和第二子声音,所述第一子声音为所述第一画面的声音,所述第二子声音为所述第二画面的声音,所述根据第一画面和第二画面的画面权重对所述第一声音进行处理包括:所述电子设备根据所述第一摄像头的第一视场角和所述第二摄像头的第二视场角,分别对所述第一声音进行滤波,得到所述第一子声音和所述第二子声音;所述第一子声音包括第一左声道声音和第一右声道声音,所述第一左声道声音由所述电子设备根据所述第一视场角的左半角对所述第一声音进行滤波得到;所述第一右声道声音由所述电子设备根据所述第一视场角的右半角对所述第一声音进行滤波得到;所述第二方向声音包括第二左声道声音和第二右声道声音,所述第二左声道声音由所述电子设备根据所述第二视场角的左半角对所述第一声音进行滤波得到;所述第二右声道声音由所述电子设备根据所述第二视场角的右半角对所述第 一声音进行滤波得到。
所述电子设备根据所述第一画面和所述第二画面的画面权重,调整所述第一左声道声音、所述第一右声道声音、所述第二左声道声音以及所述第二右声道声音的响度之后,将所述第一子声音和第二子声音进行混音,得到所述第二声音。
在本实现方式中,所述电子设备在结合画面的视场所述初始音频进行增强的过程中,可以将视场角按左右两个方向进行区分,对每个画面均得到区分左声道和右声道的两个声音,能使最后的得到的用于输出的音频更具立体感。
结合第一方面,在一种可能的实现方式中,所述第一声音包括第一子声音和第二子声音,所述第一子声音为所述第一画面的声音,所述第二子声音为所述第二画面的声音,所述检测到对第三控件的第二操作之前,所述方法还包括:在第三时刻T3,响应于切换摄像头的操作,所述电子设备将所述第一显示区域显示的画面由所述第一摄像头拍摄的画面切换为第三摄像头拍摄的画面;在第四时刻T4,所述电子设备在第一显示区域显示所述第三摄像头拍摄的第三画面,所述第四时刻T4在所述第三时刻T3之后;所述电子设备根据所述第三摄像头的第三视场角和所述第一摄像头的第一视场角,分别对所述第一声音进行滤波,得到历史声音和目标声音;在所述第三时刻T3和所述第四时刻T4的时间内,所述电子设备根据将所述第三时刻T3和所述第四时刻T4之间的时间间隔,动态调整所述历史声音和所述目标声音的混音比例,并基于所述混音比例将所述历史声音和所述目标声音混音,得到所述第一子声音。
所述电子设备切换所述第一显示区域对应的摄像头时,该第一显示区域中画面的视场角也随之改变,电子设备基于该画面对音频进行滤波所得的音频信号也会改变。但是,电子设备进行镜头切换时往往需要一定的处理时间,而电子设备基于镜头切换对音频的调整在极短的时间内就能完成。这样,画面的观感和音频的听感可能会有不平衡性。因此,在本实现方式中,所述电子设备可以在切换摄像头的过程中,动态调整前后两个画面所得的声音在第三音频中的比例,使声音方向的转变更为缓慢地发生,使所述音频的切换能够平滑进行。
第二方面,本申请实施例提供一种电子设备,所述电子设备包括:一个或多个处理器和存储器;所述存储器与所述一个或多个处理器耦合,所述存储器用于存储计算机程序代码,所述计算机程序代码包括计算机指令,所述一个或多个处理器调用所述计算机指令以使得所述电子设备执行第一方面或第一方面的任意可能的实现方式中的方法。
第三方面,提供一种芯片系统,所述芯片系统应用于电子设备,所述芯片系统包括一个或多个处理器,所述处理器用于调用计算机指令以使得所述电子设备执行如第一方面中任一可能的实现方式,或如第二方面中任一可能的实现方式。
第四方面,一种包含指令的计算机程序产品,其特征在于,当上述计算机程序产品在电子设备上运行时,使得上述电子设备执行如第一方面中任一可能的实现方式,或如第二方面中任一可能的实现方式。
第五方面,提供一种计算机可读存储介质,包括指令,其特征在于,当上述指令在电子设备上运行时,使得上述电子设备执行如第一方面中任一可能的实现方式,或如第二方面中任一可能的实现方式。
图1A为本申请实施例提供的一种摄像头和视场角之间对应关系的示意图;
图1B为本申请实施例提供的一种电子设备前/后置摄像头的视场角范围的示意图;
图2A为本申请实施例提供的一种双景录像下的一种录制界面图;
图2B为本申请实施例提供的一种电子设备播放双景录像模式下所录音频的示意图;
图3为本申请实施例提供的电子设备100的结构示意图;
图4为本申请实施例提供的一些人机交互界面示意图;
图5为本申请实施例提供的一些人机交互界面示意图;
图6为本申请实施例提供的一些人机交互界面示意图;
图7A为本申请实施例提供的一种录制界面的示意图;
图7B为本申请实施例提供的另一种录制界面的示意图;
图7C为本申请实施例提供的又一种录制界面的示意图;
图8A为本申请实施例提供的一种电子设备结合画面的视场角对音频信号进行过滤的一种场景示意图;
图8B为本申请实施例提供的另一种电子设备结合画面的视场角对音频信号进行过滤的一种场景示意图;
图9为本申请实施例提供的一种音频的处理方法的流程图;
图10为本申请实施例提供的另一种音频的处理方法的流程图;
图11为本申请实施例提供的又一种音频的处理方法的流程图;
图12为本申请实施例提供的又一种音频的处理方法的流程图;
图13为本申请实施例提供的一种平滑切换音频的方法的流程图。
本申请以下实施例中所使用的术语只是为了描述特定实施例的目的,而并非旨在作为对本申请的限制。如在本申请的说明书和所附权利要求书中所使用的那样,单数表达形式“一个”、“一种”、“所述”、“上述”、“该”和“这一”旨在也包括复数表达形式,除非其上下文中明确地有相反指示。还应当理解,本申请中使用的术语“和/或”是指并包含一个或多个所列出项目的任何或所有可能组合。
由于本申请实施例涉及神经网络的应用,为了便于理解,下面先对本申请实施例涉及的相关术语及神经网络等相关概念进行介绍。
(1)双景录像模式
双景录像模式是指,电子设备中的多个摄像头,例如前置摄像头和后置摄像头,可以同时录制两路视频。在双景录像模式下,在录像预览或者录像过程中或者在对已录制视频的播放过程中,显示屏可以在同一界面上同时显示来自这两个摄像头的两幅图像。这两幅图像可以在同一界面上拼接显示,或者以画中画的方式显示。双景录像包括但不限于以下几种常用的录像模式:
1.上下分屏
即将设备的显示屏分为上下两个显示界面,上下两个显示界面不交叠。上显示界面和 下显示界面的面积可以相同,也可以不同。
2.画中画
即将设备的显示屏分为一大一小的两个显示界面,较小的显示界面包含于较大的显示界面中。其中,较大的显示区域一般铺满设备的屏幕,较小显示区域中的图像可以覆盖于较大的显示区域中的图像之上。在一些情况中,较小的显示区域还支持进行缩放,其在设备屏幕中的位置还可以变换。关于该显示方式,后续实施例会详细说明。
另外,在双景录像模式下,这两个摄像头的两幅图像拍摄的多幅图像可以保存为图库(又可称为相册)中的多个视频,或者这多个视频拼接成的合成视频。
其中,“录像”也可以称为“录制视频”。在本申请以下实施例中,“录像”和“录制视频”表示相同的含义。
“双景录像模式”只是本申请实施例所使用的一些名称,其代表的含义在本申请实施例中已经记载,其名称并不能对本实施例构成任何限制。
(2)焦距和视场角
焦距是从镜头的中心点到焦平面上所形成的清晰影像之间的距离,光学系统中衡量光的聚集或发散的度量方式。焦距的大小决定着视角的大小,焦距数值小,视场角大,所观察到的范围也大;焦距数值大,视场角小,观察范围小。根据焦距能否调节,可分为定焦镜头和变焦镜头两大类。当对同一距离远的同一个被摄目标拍摄时,镜头焦距长的所成的像大,镜头焦距短的所成的像小。在光学仪器中,以光学仪器的镜头为顶点,以被测目标的物象可通过镜头的最大范围的两条边缘构成的夹角,称为视场角。视场角的大小决定了光学仪器的视野范围,视场角越大,视野就越大,光学倍率就越小。通俗地说,目标物体超过这个角就不会被收在镜头里。焦距与视场角成反比,即焦距越大,视场角越小,反之视场角越大。以图1A为例,摄像头在拍摄的过程中可以对相机的焦距进行调节,其提供了1×(图中未给出)、2×、3×、4×、5×以及6×六个档位的焦距档位。不难理解,当焦距为1×的时候,摄像头所能拍摄的视场角为最大的,即摄像头正前方的180°。当对焦距进行调节后为2×后,如图1A所示,其视场角已经变为84°了;如果继续将焦距调整为6×,如图1A所示,其视场角则只剩下正前方30°了。
(3)视场角范围大小和正负
图1B为本申请实施例提供的一种电子设备前/后置摄像头的视场角范围的示意图。如图1B所示,为方便读者理解,图1B所呈现的为电子设备直立放置时的俯视图,电子设备可以视作点P。则OPO’为电子设备所在的平面。其中OPO’左侧表示电子设备前置摄像头所在的一侧,OPO’右侧表示电子设备100后置摄像头所在的一侧。在图1B中,OPO’左侧的阴影部分表示电子设备前置摄像头所能获取到的最大的视场角范围,OPO’右侧的空白部分表示电子设备后置摄像头所能获取到的最大视场角范围。图1B中示出的视场角示意图仅为了方便说明,在实际的场景中也为表现为其他形式。
在本申请以下的实施例中,以电子设备所在的平面OPO’为界限,规定:OPO’左侧的视场角为正,OPO’右侧的视场角为负,以此将空间中0°至于360°的视角分为0°至+180°以及-180°至0°两个象限。也就是说,在本申请以下实施例中,电子设备的前置摄像头的视场角均为正值,后置摄像头的视场角均为负值。在图1B中,各点与点P的连线表示在 某焦距下,摄像头视场角的边界。例如,假设电子设备的后置摄像头在焦距为3×时视场角为90°,则在图1B中,即为∠BPB’=-90°,射线BP和射线B’P即为后置摄像头在3×焦距下视场角的边界。假设电子设备的前置摄像头在焦距为6×时视场角为30°,则在图1B中,即为∠FPF’=+30°,射线FP和射线F’P即为前置摄像头在6×焦距下视场角的边界。
应理解,在电子设备出厂时,电子设备相机所提供的焦距与其对应的视场角之间的对应关系是固定的。也就是说,在用户对摄像头的焦距进行选择后,电子设备即能获取该摄像头在该焦距下的对应的视场角的角度值,该角度值可以反映视场角的大小和方向。
(4)声源分离
即跟踪目标语音并抑制或者消除干扰语音。在复杂的声学环境下,麦克风采集的语音信号包含目标语音信号以及其他的干扰信号。例如,在日常生活中演讲者通过麦克风说话时,除了目标说话人的语音信号之外,还经常会伴随者其他说话人的语音,比如在室外、街道等场景,干扰人的信号会严重影响目标人语音的识别性能,这个时候就需要通过声源分离来跟踪目标语音并抑制或者消除干扰语音。
(5)凸优化(convex,CVX)波束训练
CVX是MATLAB的一个工具箱,是用于构建和解决约束凸规划(DCP)的建模系统。CVX波束训练可以利用MATLAB选择不同的阵列形式,利用凸优化的方法进行波束形成。
(6)延时求和波束形成算法(delay and sum beamforming,DSB)
用于对不同麦克风信号之间的相对延迟进行补偿,然后叠加延时后的信号形成一个单一的输出,使波束指向某一空间方向。
(7)画面权重
双景录像模式下,在既定的界面形式下,可以计算出单个取景框与电子设备显示屏之间的面积之比,作为该单个取景框在该界面形式下的画面权重。
在一些实施例中,双景录像模式下两个取景框拼接后形成的区域布满显示屏的显示区域,此时两个取景框的画面权重之和的值为1。例如图5中(C)所示,通过计算可以得出,取景框701的画面权重为1248/2340,取景框702的画面权重为1092/2340,二者占比之和为1。同理,在一些实施例中,当双景录像模式下两个取景框拼接后形成的区域未布满显示屏的显示区域时,此时两个取景框的画面权重之和可能小于1。
不限于上述计算画面权重比例的方式,本申请实施例也可以采用其他的方式计算两个取景框的画面权重,只要该给方式计算得到的两个画面权重能够表征两个取景框的面积的大小关系即可。例如,在一些实施例中,可以将单个取景框的面积与两个取景框面积之和的比例作为该单个取景框的画面权重,这样可以确保两个取景框的画面权重比例之和为1,也更便于计算。
例如,在一些实施例中,两个取景框的画面权重的计算方式还可以为:
w1=α×S1/(S1+S2)
w2=1-w1
其中,w1表示两个取景框中面积较小的取景框的画面权重,S1为该取景框的面积;w2表示两个取景框中面积较大的取景框的画面权重,S2为该取景框的面积。α为校正系数,是在电子设备出厂时已经设定好的定值,其取值范围为[1,(S1+S2)/S1]。这样,可以防 止两个取景框面积差距太大而导致面积较小的取景框的画面权重的值过小。
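As a concrete illustration of this weighting scheme, the sketch below (in Python; the function name and default α are assumptions of this sketch) computes w1 and w2 from the two viewfinder areas, clamping the correction coefficient α to its stated range [1, (S1+S2)/S1]:

def picture_weights(s1, s2, alpha=1.0):
    # s1: area of the smaller viewfinder; s2: area of the larger one.
    # alpha is the factory-set correction coefficient; clamping it to
    # [1, (s1+s2)/s1] keeps the smaller window's weight from vanishing.
    alpha = max(1.0, min(alpha, (s1 + s2) / s1))
    w1 = alpha * s1 / (s1 + s2)
    return w1, 1.0 - w1

With alpha = 1 and the split-screen example above, picture_weights(1092, 1248) returns roughly (0.467, 0.533), i.e. the 1092/2340 and 1248/2340 weights quoted earlier.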
(8)声源方位虚拟技术
人有两个耳朵,却能定位来自三维空间的声音,这得力于人耳对声音信号的分析系统。因为声音信号从音源到达人耳(鼓膜前)的过程中可能会有反射、叠加等过程,因此,从空间任意一点传到人耳的信号都可以用一个滤波系统来描述,音源经滤波器处理得到的就是两耳鼓膜前的声音信号。这个传输系统是一个黑盒子,我们不必关心声音是如何传递到双耳的,而只需关心音源和双耳信号的差别。如果我们得到这组描述空间信息的滤波器(也可称为传递函数),就能还原来自空间这个方位的声音信号(如通过双声道耳机)。如果我们有空间所有方位到双耳的滤波器组,就能得到一个滤波矩阵,从而还原来自整个空间方位的声音信号,这就是声源方位虚拟技术的作用。
(9)头部反应传送函数(head-response transfer function,HRTF)
HRTF是一种声音定位的处理技术,可以看成是一个特定位置的声音传输到左右耳的频率响应。由于声音会从耳廓、或肩膀反射到人耳内部,于是当我们用两个音箱模拟声音定位时,可以利用特定的运算方式,来计算不同方向或位置声音所产生的大小和音调等,进而制造出立体空间声音定位的效果。
(10)CIPIC_HRIR数据
HRTF是一个特定位置的声音传输到左右耳的频率响应,HRTF对应的时域响应叫HRIR。要使音源具备HRTF的特性,只需要将音源与HRIR数据做卷积运算即可。CIPIC_HRIR数据就是美国加利福尼亚大学戴维斯分校CIPIC HRTF数据库提供的一组HRIR数据。
(11)卷积
卷积是分析数学中一种重要的运算。假设f(x),g(x)是R1上的两个可积函数,作积分:
∫f(t)g(x-t)dt(积分遍及整个R1)
可以证明,关于几乎所有的实数x,上述积分是存在的。这样,随着x的不同取值,这个积分就定义了一个新函数h(x),称为函数f与g的卷积,记为h(x)=(f*g)(x)。
(12)声音重混
把多种来源的声音,整合至一个立体音轨或单音音轨中。常见的混音算法有直接加和法以及平均调整权重法等。
(13)快速傅立叶变换算法(fast fourier transformation,FFT)
FFT是离散傅立叶变换的快速算法,可以将一个信号由时域变换到频域。
(14)快速傅里叶逆变换算法(inverse fast fourier transform,IFFT)
IFFT是与FFT对应的快速傅立叶逆变换的算法,可以将一个信号由频域变换到时域。
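Since the methods below filter the audio in the frequency domain, a minimal numpy sketch of this time/frequency round trip may help; the frame length here is arbitrary:

import numpy as np

frame = np.random.randn(1024)                    # one frame of a mic signal
spectrum = np.fft.rfft(frame)                    # FFT: time -> frequency
restored = np.fft.irfft(spectrum, n=frame.size)  # IFFT: frequency -> time
assert np.allclose(frame, restored)              # the round trip is lossless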
如今视频的拍摄和制作已经成为我们日常生活和娱乐中不可或缺的一环。各种各样的摄影设备以及拍摄模式称为各电子厂商的研发热点。“双景录像”是一种新兴的录像方式,在录制时,设备可以同时调用两个镜头进行视频录制,同时呈现例如特写和全景、前置摄像和后置摄像这样这两种不同的视野下的图景,做出不同的画面组合,在视觉上形成的巨大反差。
图2A示例性的给出了电子设备在双景录像下的一种拍摄界面。如图2A所示,电子 设备200同时调用了前置摄像头和后置摄像头进行拍摄。此时,拍摄界面20中被分隔线203分为显示区域201和显示区域202两个显示区域,其中:
显示区域201显示的为电子设备200的前置摄像头摄取到的图像,其图像为正在录像的用户的人脸。显示区域202显示的为电子设备200的后置摄像头摄取到的图像,其图像为用户前方的风景图。在双景录像模式下,两种具备明显视觉差异的画面可以被同时呈现在一个画面中。
应理解,在电子设备拍摄视频时,电子设备除了会通过摄像头等图像采集设备录取场景的画面信息之外,还会通过如麦克风等音频采集设备录取环境中的音频信息。
在一种实施方式中,双景录像的音频信息可以按照普通录音录像的模式录制和输出音频。图2B为本申请实施例提供的一种电子设备播放双景录像模式下所录音频的示意图。如图2B所示,电子设备将麦克风采集到的空间中全方位的音频信号进行必要的渲染和过滤操作,例如改变音色、去噪等,最后将通过扬声器播放。
图2B中,电子设备仅仅对音频进行了简单的渲染和过滤,可能会让音频的听感变得更加清晰一些。但是双景录像视频的录制和播放涉及两个画面,画面与画面之间存在大小和方位的关系,且这种关系在录制的过程中是可以发生变化的。因此,图2B所提供的这种单一的音频录播方式不能很好地契合双景录像在画面上给用户的视觉差异。这样,用户在听觉上不能充分感知双景录像模式下不同画面之间的差异性,用户体验较差。
基于上述问题,本申请实施例提供了一种音频的处理方法和电子设备,该电子设备的双景录像模式包含多个样式的分屏录像方式。在分屏录像时,当用户更改分屏录像的方式、对两个屏幕的大小进行缩放或对两个屏幕的画面进行切换时,该电子设备可以根据图像界面信息(例如两个显示区域录制时的焦距、两个显示区域的相对位置和面积大小等)对采集的音频进行相应的处理,使用户在听觉和视觉上有同步的立体感。
首先,介绍本申请实施例提供的电子设备。
该电子设备可以是手机、平板电脑、可穿戴设备、车载设备、增强现实(augmented reality,AR)/虚拟现实(virtual reality,VR)设备、笔记本电脑、超级移动个人计算机(ultra-mobile personal computer,UMPC)、上网本、个人数字助理(personal digital assistant,PDA)或专门的照相机(例如单反相机、卡片式相机)等,本申请对该电子设备的具体类型不作任何限制。
图3示例性示出了该电子设备的结构。
如图3所示,电子设备100可具有多个摄像头193,例如前置摄像头、广角摄像头、超广角摄像头、长焦摄像头等。此外,电子设备100还可包括处理器110,外部存储器接口120,内部存储器121,通用串行总线(universal serial bus,USB)接口130,充电管理模块140,电源管理模块141,电池142,天线1,天线2,移动通信模块150,无线通信模块160,音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,传感器模块180,按键190,马达191,指示器192,显示屏194,以及用户标识模块(subscriber identification module,SIM)卡接口195等。
其中传感器模块180可以包括压力传感器180A,陀螺仪传感器180B,气压传感器 180C,磁传感器180D,加速度传感器180E,距离传感器180F,接近光传感器180G,指纹传感器180H,温度传感器180J,触摸传感器180K,环境光传感器180L,骨传导传感器180M等。
可以理解的是,本申请实施例示意的结构并不构成对电子设备100的具体限定。在本申请另一些实施例中,电子设备100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
处理器110可以包括一个或多个处理单元,例如:处理器110可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processingunit,GPU),图像信号处理器(image signal processor,ISP),控制器,存储器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。
在一些实施例中,控制器或GPU等处理器110,可以用于在双景录像场景下,将两个摄像头193同时采集到的多帧图像,通过拼接或局部叠加等方式合成显示于取景框中的预览图像,以便电子设备100可以同时显示这两个摄像头193采集到的图像。
在另一些实施例中,控制器或GPU等处理器110,还可以用于在双景录像拍摄场景下,对每个摄像头193采集到的图像进行防抖处理后,再将多个摄像头193对应的防抖处理后的图像进行合成。
其中,控制器可以是电子设备100的神经中枢和指挥中心。控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。
处理器110中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器110中的存储器为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了系统的效率。
在一些实施例中,处理器110可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,集成电路内置音频(inter-integrated circuitsound,I2S)接口,脉冲编码调制(pulse code modulation,PCM)接口,通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purposeinput/output,GPIO)接口,用户标识模块(subscriber identity module,SIM)接口,和/或通用串行总线(universal serial bus,USB)接口等。
I2C接口是一种双向同步串行总线,包括一根串行数据线(serial data line,SDA)和一根串行时钟线(derail clock line,SCL)。在一些实施例中,处理器110可以包含多组I2C总线。处理器110可以通过不同的I2C总线接口分别耦合触摸传感器180K,充电器,闪光灯,摄像头193等。例如:处理器110可以通过I2C接口耦合触摸传感器180K,使处理器110与触摸传感器180K通过I2C总线接口通信,实现电子设备100的触摸功能。
I2S接口可以用于音频通信。在一些实施例中,处理器110可以包含多组I2S总线。 处理器110可以通过I2S总线与音频模块170耦合,实现处理器110与音频模块170之间的通信。在一些实施例中,音频模块170可以通过I2S接口向无线通信模块160传递音频信号,实现通过蓝牙耳机接听电话的功能。
PCM接口也可以用于音频通信,将模拟信号抽样,量化和编码。在一些实施例中,音频模块170与无线通信模块160可以通过PCM总线接口耦合。在一些实施例中,音频模块170也可以通过PCM接口向无线通信模块160传递音频信号,实现通过蓝牙耳机接听电话的功能。所述I2S接口和所述PCM接口都可以用于音频通信。
UART接口是一种通用串行数据总线,用于异步通信。该总线可以为双向通信总线。它将要传输的数据在串行通信与并行通信之间转换。在一些实施例中,UART接口通常被用于连接处理器110与无线通信模块160。例如:处理器110通过UART接口与无线通信模块160中的蓝牙模块通信,实现蓝牙功能。在一些实施例中,音频模块170可以通过UART接口向无线通信模块160传递音频信号,实现通过蓝牙耳机播放音乐的功能。
MIPI接口可以被用于连接处理器110与显示屏194,摄像头193等外围器件。MIPI接口包括摄像头串行接口(camera serial interface,CSI),显示屏串行接口(display serial interface,DSI)等。在一些实施例中,处理器110和摄像头193通过CSI接口通信,实现电子设备100的拍摄功能。处理器110和显示屏194通过DSI接口通信,实现电子设备100的显示功能。
GPIO接口可以通过软件配置。GPIO接口可以被配置为控制信号,也可被配置为数据信号。在一些实施例中,GPIO接口可以用于连接处理器110与摄像头193,显示屏194,无线通信模块160,音频模块170,传感器模块180等。GPIO接口还可以被配置为I2C接口,I2S接口,UART接口,MIPI接口等。
USB接口130是符合USB标准规范的接口,具体可以是Mini USB接口,Micro USB接口,USB Type C接口等。USB接口130可以用于连接充电器为电子设备100充电,也可以用于电子设备100与外围设备之间传输数据。也可以用于连接耳机,通过耳机播放音频。该接口还可以用于连接其他电子设备,例如AR设备等。
可以理解的是,本申请实施例示意的各模块间的接口连接关系,只是示意性说明,并不构成对电子设备100的结构限定。在本申请另一些实施例中,电子设备100也可以采用上述实施例中不同的接口连接方式,或多种接口连接方式的组合。
充电管理模块140用于从充电器接收充电输入。其中,充电器可以是无线充电器,也可以是有线充电器。在一些有线充电的实施例中,充电管理模块140可以通过USB接口130接收有线充电器的充电输入。在一些无线充电的实施例中,充电管理模块140可以通过电子设备100的无线充电线圈接收无线充电输入。充电管理模块140为电池142充电的同时,还可以通过电源管理模块141为电子设备供电。
电源管理模块141用于连接电池142,充电管理模块140与处理器110。电源管理模块141接收电池142和/或充电管理模块140的输入,为处理器110,内部存储器121,外部存储器,显示屏194,摄像头193,和无线通信模块160等供电。电源管理模块141还可以用于监测电池容量,电池循环次数,电池健康状态(漏电,阻抗)等参数。在其他一些实施例中,电源管理模块141也可以设置于处理器110中。在另一些实施例中,电源管理 模块141和充电管理模块140也可以设置于同一个器件中。
电子设备100的无线通信功能可以通过天线1,天线2,移动通信模块150,无线通信模块160,调制解调处理器以及基带处理器等实现。
天线1和天线2用于发射和接收电磁波信号。电子设备100中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线1复用为无线局域网的分集天线。在另外一些实施例中,天线可以和调谐开关结合使用。
移动通信模块150可以提供应用在电子设备100上的包括2G/3G/4G/5G等无线通信的解决方案。移动通信模块150可以包括至少一个滤波器,开关,功率放大器,低噪声放大器(low noise amplifier,LNA)等。移动通信模块150可以由天线1接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块150还可以对经调制解调处理器调制后的信号放大,经天线1转为电磁波辐射出去。
无线通信模块160可以提供应用在电子设备100上的包括无线局域网(wirelesslocal area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络),蓝牙(bluetooth,BT),全球导航卫星系统(global navigation satellite system,GNSS),调频(frequency modulation,FM),近距离无线通信技术(near field communication,NFC),红外技术(infrared,IR)等无线通信的解决方案。无线通信模块160可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块160经由天线2接收电磁波,将电磁波信号调频以及滤波处理,将处理后的信号发送到处理器110。无线通信模块160还可以从处理器110接收待发送的信号,对其进行调频,放大,经天线2转为电磁波辐射出去。
在一些实施例中,电子设备100的天线1和移动通信模块150耦合,天线2和无线通信模块160耦合,使得电子设备100可以通过无线通信技术与网络以及其他设备通信。无线通信技术可以包括全球移动通讯系统(global system for mobile communications,GSM),通用分组无线服务(general packet radio service,GPRS),码分多址接入(codedivision multiple access,CDMA),宽带码分多址(wideband code division multipleaccess,WCDMA),时分码分多址(time-division code division multiple access,TD-SCDMA),长期演进(long term evolution,LTE),BT,GNSS,WLAN,NFC,FM,和/或IR技术等。GNSS可以包括全球卫星定位系统(global positioning system,GPS),全球导航卫星系统(global navigation satellite system,GLONASS),北斗卫星导航系统(beidounavigation satellite system,BDS),准天顶卫星系统(quasi-zenith satellitesystem,QZSS)和/或星基增强系统(satellite based augmentation systems,SBAS)。
电子设备100通过GPU,显示屏194,以及应用处理器等实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。
显示屏194用于显示图像,视频等。显示屏194包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD),有机发光二极管(organic light-emitting diode,OLED),有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organic light emitting diode的,AMOLED),柔性发光二极管(flex light-emitting diode,FLED),Miniled,MicroLed,Micro-oLed,量子点发光二极管(quantum dot light emitting diodes, QLED)等。在一些实施例中,电子设备100可以包括1个或N个显示屏194,N为大于1的正整数。
电子设备100可以通过ISP,摄像头193,视频编解码器,GPU,显示屏194以及应用处理器等实现拍摄功能。
ISP用于处理摄像头193反馈的数据。例如,拍照时,打开快门,光线通过镜头被传递到摄像头感光元件上,光信号转换为电信号,摄像头感光元件将所述电信号传递给ISP处理,转化为肉眼可见的图像。ISP还可以对图像的噪点,亮度,肤色进行算法优化。ISP还可以对拍摄场景的曝光,色温等参数优化。在一些实施例中,ISP可以设置在摄像头193中。
摄像头193用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP加工处理。DSP将数字图像信号转换成标准的RGB,YUV等格式的图像信号。在一些实施例中,电子设备100可以包括1个或N个摄像头193,N为大于1的正整数。
数字信号处理器用于处理数字信号,除了可以处理数字图像信号,还可以处理其他数字信号。例如,当电子设备100在频点选择时,数字信号处理器用于对频点能量进行傅里叶变换等。
视频编解码器用于对数字视频压缩或解压缩。电子设备100可以支持一种或多种视频编解码器。这样,电子设备100可以播放或录制多种编码格式的视频,例如:动态图像专家组(moving picture experts group,MPEG)1,MPEG2,MPEG3,MPEG4等。
NPU为神经网络(neural-network,NN)计算处理器,通过借鉴生物神经网络结构,例如借鉴人脑神经元之间传递模式,对输入信息快速处理,还可以不断的自学习。通过NPU可以实现电子设备100的智能认知等应用,例如:图像识别,人脸识别,语音识别,文本理解等。通过NPU还可以实现本申请实施例提供的决策模型。
外部存储器接口120可以用于连接外部存储卡,例如Micro SD卡,实现扩展电子设备100的存储能力。外部存储卡通过外部存储器接口120与处理器110通信,实现数据存储功能。例如将音乐,视频等文件保存在外部存储卡中。
内部存储器121可以用于存储计算机可执行程序代码,可执行程序代码包括指令。处理器110通过运行存储在内部存储器121的指令,从而执行电子设备100的各种功能应用以及数据处理。内部存储器121可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统,至少一个功能所需的应用程序(比如声音播放功能,图像播放功能等)等。存储数据区可存储电子设备100使用过程中所创建的数据(比如音频数据,电话本等)等。此外,内部存储器121可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件,闪存器件,通用闪存存储器(universal flash storage,UFS)等。
电子设备100可以通过音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,以及应用处理器等实现音频功能。例如音乐播放,录音等。音频模块170用于将数字音频信息转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信 号。音频模块170还可以用于对音频信号编码和解码。在一些实施例中,音频模块170可以设置于处理器110中,或将音频模块170的部分功能模块设置于处理器110中。
扬声器170A,也称“喇叭”,用于将音频电信号转换为声音信号。电子设备100可以通过扬声器170A收听音乐,收听视频中的声音或收听免提通话。本申请实施例中,扬声器170A的数量可以为一个,也可以为两个或者超过两个。在本申请实施例提供的音频处理方法中,当电子设备100的扬声器170A的数量超过两个时,可以支持播放双声道的音频。此外,当电子设备100的扬声器170A的数量为两个时(这里将两个扬声器分别称为170A-1和170A-2),扬声器170A-1和170A-2可以分别设置于电子设备100的上方和下方。应注意,这里提到的“上方”和“下方”是基于电子设备被正立放置时的“上方”和“下方”。
受话器170B,也称“听筒”,用于将音频电信号转换成声音信号。当电子设备100接听电话或语音信息时,可以通过将受话器170B靠近人耳接听语音。
麦克风170C,也称“话筒”,“传声器”,用于将声音信号转换为电信号。当拨打电话或发送语音信息时,用户可以通过人嘴靠近麦克风170C发声,将声音信号输入到麦克风170C。电子设备100可以设置至少一个麦克风170C。在另一些实施例中,电子设备100可以设置两个麦克风170C,除了采集声音信号,还可以实现降噪功能。在另一些实施例中,电子设备100还可以设置三个,四个或更多麦克风170C,实现采集声音信号,降噪,还可以识别声音来源,实现定向录音功能等。
耳机接口170D用于连接有线耳机。耳机接口170D可以是USB接口130,也可以是3.5mm的开放移动电子设备平台(open mobile terminal platform,OMTP)标准接口,美国蜂窝电信工业协会(cellular telecommunications industry association of the USA,CTIA)标准接口。
压力传感器180A用于感受压力信号,可以将压力信号转换成电信号。在一些实施例中,压力传感器180A可以设置于显示屏194。压力传感器180A的种类很多,如电阻式压力传感器,电感式压力传感器,电容式压力传感器等。电容式压力传感器可以是包括至少两个具有导电材料的平行板。当有力作用于压力传感器180A,电极之间的电容改变。电子设备100根据电容的变化确定压力的强度。当有触摸操作作用于显示屏194,电子设备100根据压力传感器180A检测触摸操作强度。电子设备100也可以根据压力传感器180A的检测信号计算触摸的位置。
陀螺仪传感器180B可以用于确定电子设备100的运动姿态。在一些实施例中,可以通过陀螺仪传感器180B确定电子设备100围绕三个轴(即,x,y和z轴)的角速度。陀螺仪传感器180B可以用于拍摄防抖。示例性的,当按下快门,陀螺仪传感器180B检测电子设备100抖动的角度,根据角度计算出镜头模组需要补偿的距离,让镜头通过反向运动抵消电子设备100的抖动,实现防抖。陀螺仪传感器180B还可以用于导航,体感游戏场景。
气压传感器180C用于测量气压。在一些实施例中,电子设备100通过气压传感器180C测得的气压值计算海拔高度,辅助定位和导航。
磁传感器180D包括霍尔传感器。电子设备100可以利用磁传感器180D检测翻盖皮套的开合。在一些实施例中,当电子设备100是翻盖机时,电子设备100可以根据磁传感 器180D检测翻盖的开合。进而根据检测到的皮套的开合状态或翻盖的开合状态,设置翻盖自动解锁等特性。
加速度传感器180E可检测电子设备100在各个方向上(一般为三轴)加速度的大小。当电子设备100静止时可检测出重力的大小及方向。还可以用于识别电子设备姿态,应用于横竖屏切换,计步器等应用。
距离传感器180F,用于测量距离。电子设备100可以通过红外或激光测量距离。在一些实施例中,拍摄场景,电子设备100可以利用距离传感器180F测距以实现快速对焦。
接近光传感器180G可以包括例如发光二极管(LED)和光检测器,例如光电二极管。发光二极管可以是红外发光二极管。电子设备100通过发光二极管向外发射红外光。电子设备100使用光电二极管检测来自附近物体的红外反射光。当检测到充分的反射光时,可以确定电子设备100附近有物体。当检测到不充分的反射光时,电子设备100可以确定电子设备100附近没有物体。电子设备100可以利用接近光传感器180G检测用户手持电子设备100贴近耳朵通话,以便自动熄灭屏幕达到省电的目的。接近光传感器180G也可用于皮套模式,口袋模式自动解锁与锁屏。
环境光传感器180L用于感知环境光亮度。电子设备100可以根据感知的环境光亮度自适应调节显示屏194亮度。环境光传感器180L也可用于拍照时自动调节白平衡。环境光传感器180L还可以与接近光传感器180G配合,检测电子设备100是否在口袋里,以防误触。
指纹传感器180H用于采集指纹。电子设备100可以利用采集的指纹特性实现指纹解锁,访问应用锁,指纹拍照,指纹接听来电等。
温度传感器180J用于检测温度。在一些实施例中,电子设备100利用温度传感器180J检测的温度,执行温度处理策略。例如,当温度传感器180J上报的温度超过阈值,电子设备100执行降低位于温度传感器180J附近的处理器的性能,以便降低功耗实施热保护。在另一些实施例中,当温度低于另一阈值时,电子设备100对电池142加热,以避免低温导致电子设备100异常关机。在其他一些实施例中,当温度低于又一阈值时,电子设备100对电池142的输出电压执行升压,以避免低温导致的异常关机。
触摸传感器180K,也称“触控面板”。触摸传感器180K可以设置于显示屏194,由触摸传感器180K与显示屏194组成触摸屏,也称“触控屏”。触摸传感器180K用于检测作用于其上或附近的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏194提供与触摸操作相关的视觉输出。在另一些实施例中,触摸传感器180K也可以设置于电子设备100的表面,与显示屏194所处的位置不同。
骨传导传感器180M可以获取振动信号。在一些实施例中,骨传导传感器180M可以获取人体声部振动骨块的振动信号。骨传导传感器180M也可以接触人体脉搏,接收血压跳动信号。在一些实施例中,骨传导传感器180M也可以设置于耳机中,结合成骨传导耳机。音频模块170可以基于骨传导传感器180M获取的声部振动骨块的振动信号,解析出语音信号,实现语音功能。应用处理器可以基于骨传导传感器180M获取的血压跳动信号解析心率信息,实现心率检测功能。
按键190包括开机键,音量键等。按键190可以是机械按键。也可以是触摸式按键。 电子设备100可以接收按键输入,产生与电子设备100的用户设置以及功能控制有关的键信号输入。
马达191可以产生振动提示。马达191可以用于来电振动提示,也可以用于触摸振动反馈。例如,作用于不同应用(例如拍照,音频播放等)的触摸操作,可以对应不同的振动反馈效果。作用于显示屏194不同区域的触摸操作,马达191也可对应不同的振动反馈效果。不同的应用场景(例如:时间提醒,接收信息,闹钟,游戏等)也可以对应不同的振动反馈效果。触摸振动反馈效果还可以支持自定义。
指示器192可以是指示灯,可以用于指示充电状态,电量变化,也可以用于指示消息,未接来电,通知等。
SIM卡接口195用于连接SIM卡。SIM卡可以通过插入SIM卡接口195,或从SIM卡接口195拔出,实现和电子设备100的接触和分离。电子设备100可以支持一个或多个SIM卡接口。SIM卡接口195可以支持Nano SIM卡,Micro SIM卡,SIM卡等。同一个SIM卡接口195可以同时插入多张卡。多张卡的类型可以相同,也可以不同。SIM卡接口195也可以兼容不同类型的SIM卡。SIM卡接口195也可以兼容外部存储卡。电子设备100通过SIM卡和网络交互,实现通话以及数据通信等功能。在一些实施例中,电子设备100采用eSIM,即:嵌入式SIM卡。eSIM卡可以嵌在电子设备100中,不能和电子设备100分离。
电子设备100可以通过ISP,摄像头193,视频编解码器,GPU,显示屏194以及应用处理器等实现拍摄功能。
ISP用于处理摄像头193反馈的数据。例如,拍照时,打开快门,光线通过镜头被传递到摄像头感光元件上,光信号转换为电信号,摄像头感光元件将电信号传递给ISP处理,转化为肉眼可见的图像。ISP还可以对图像的噪点,亮度,肤色进行算法优化。ISP还可以对拍摄场景的曝光,色温等参数优化。不限于集成于处理器110中,ISP也可以设置在摄像头193中。
在本申请实施例中,摄像头193的数量可以为M个,M≥2,M为正整数。电子设备100在双景录像中开启的摄像头的数量可以为N,N≤M,N为正整数。
摄像头193包括镜头和感光元件(又可称为图像传感器),用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号,如标准的RGB,YUV等格式的图像信号。
摄像头193的硬件配置以及物理位置可以不同,因此,不同摄像头采集到的图像的大小、范围、内容或清晰度等可能不同。
摄像头193的出图尺寸可以不同,也可以相同。摄像头的出图尺寸是指该摄像头采集到的图像的长度与宽度。该图像的长度和宽度均可以用像素数来衡量。摄像头的出图尺寸也可以被叫做图像大小、图像尺寸、像素尺寸或图像分辨率。常见的摄像头的出图比例可包括:4:3、16:9或3:2等等。出图比例是指摄像头所采集图像在长度上和宽度上的像素数的大致比例。
摄像头193可以对应同一焦段,也可以对应不同的焦段。该焦段可以包括但不限于:焦长小于预设值1(例如20mm)的第一焦段;焦长大于或者等于预设值1,且小于或者等于预设值2(例如50mm)的第二焦段;焦长大于预设值2的第三焦段。对应于第一焦段的摄像头可以被称为超广角摄像头,对应第二焦段的摄像头可以被称为广角摄像头,对应于第三焦段的摄像头可以被称为长焦摄像头。摄像头对应的焦段越大,该摄像头的视场角(field of view,FOV)越小。视场角是指光学系统所能够成像的角度范围。
摄像头193可以设置于电子设备的两面。和电子设备的显示屏194位于同一平面的摄像头可以被称为前置摄像头,位于电子设备的后盖所在平面的摄像头可以被称为后置摄像头。前置摄像头可用于采集面对显示屏194的拍摄者自己的图像,后置摄像头可用于采集拍摄者所面对的拍摄对象(如人物、风景等)的图像。
在一些实施例中,摄像头193可以用于采集深度数据。例如,摄像头193可以具有(time of flight,TOF)3D感测模块或结构光(structured light)3D感测模块,用于获取深度信息。用于采集深度数据的摄像头可以为前置摄像头,也可为后置摄像头。
视频编解码器用于对数字图像压缩或解压缩。电子设备100可以支持一种或多种图像编解码器。这样,电子设备100可以打开或保存多种编码格式的图片或视频。
电子设备100可以通过GPU,显示屏194,以及应用处理器等实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。
显示屏194用于显示图像,视频等。显示屏194包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD),有机发光二极管(organic light-emittingdiode,OLED),有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrixorganic light emitting diode的,AMOLED),柔性发光二极管(flex light-emittingdiode,FLED),Miniled,MicroLed,Micro-oLed,量子点发光二极管(quantum dot lightemitting diodes,QLED)等。在一些实施例中,电子设备100可以包括一个或多个显示屏194。
在一些实施例中,在双景录像模式下,显示屏194可以通过拼接或画中画等方式对来自两个摄像头193双路图像进行显示,以使得来自该两个摄像头193的双路图像可以同时呈现给用户。
在一些实施例中,在双景录像模式下,处理器110(例如控制器或GPU)可以对来自两个摄像头193的多帧图像进行合成。例如,将来自两个摄像头193的两路视频流合并为一路视频流,处理器110中的视频编码器可以对合成的一路视频流数据进行编码,从而生成一个视频文件。这样,该视频文件中的每一帧图像可以包含来自两个摄像头193的两个图像。在播放该视频文件的某一帧图像时,显示屏194可以显示来自两个摄像头193的双路图像,以为用户展示同一时刻或同一场景下,不同范围、不同清晰度或不同细节信息的两个图像画面。
在一些实施例中,在双景录像模式下,处理器110可以分别对来自不同摄像头193的图像帧进行关联,以便在播放已拍摄的图片或视频时,显示屏194可以将相关联的图像帧同时显示在取景框中。该种情况下,不同摄像头193同时录制的视频可以分别存储为不同的视频,不同摄像头193同时录制的图片可以分别存储为不同的图片。
在一些实施例中,在双景录像模式下,两个摄像头193可以采用相同的帧率分别采集图像,即两个摄像头193在相同时间内采集到的图像帧的数量相同。来自不同摄像头193的视频可以分别存储为不同的视频文件,该不同视频文件之间相互关联。该视频文件中按照采集图像帧的先后顺序来存储图像帧,该不同视频文件中包括相同数量的图像帧。在播放已录制的视频时,显示屏194可以根据预设的或用户指示的布局方式,按照相关联的视频文件中包括的图像帧的先后顺序进行显示,从而将不同视频文件中同一顺序对应的多帧图像显示在同一界面上。
在一些实施例中,在双景录像模式下,两个摄像头193可以采用相同的帧率分别采集图像,即两个摄像头193在相同时间内采集到的图像帧的数量相同。处理器110可以分别为来自不同摄像头193的每一帧图像打上时间戳,以便在播放已录制的视频时,显示屏194可以根据时间戳,同时将来自两个摄像头193的多帧图像显示在同一界面上。
为使用方便,电子设备通常在用户的手持模式下进行拍摄,而用户手持模式下通常会使得拍摄获得的画面发生抖动。在一些实施例中,在双景录像模式下,处理器110可以分别对不同摄像头193采集到图像帧分别进行防抖处理。而后,显示屏194再根据防抖处理后的图像进行显示。
下面介绍本申请实施例提供的用户界面。
首先,介绍开启双景录像模式所涉及的用户界面。
如图4中的(A)所示,图4中(A)示例性示出了电子设备100上的用于应用程序菜单的示例性用户界面40。如图4中(A)所示,电子设备100可以配置有多个摄像头193,这多个摄像头193可包括前置摄像头和后置摄像头。其中,前置摄像头也可以是多个,例如前置摄像头193-1、前置摄像头193-2。如图4中(A)所示,前置摄像头193-1、前置摄像头193-2中可设置于电子设备100的顶端,170A为位于电子设备100顶部的扬声器。可以知道,在一些实施例中,如图4中(B)所示,电子设备100的背面也可以配置有后置摄像头193,以及照明器197。后置摄像头193也可以是多个,例如后置的摄像头193-3、193-4、193-5以及193-5。
如图4中的(A)所示,主屏幕界面40包括日历小工具(widget)401、天气小工具402、应用程序图标403、状态栏404以及导航栏405。其中:
日历小工具401可用于指示当前时间,例如日期、星期几、时分信息等。
天气小工具402可用于指示天气类型,例如多云转晴、小雨等,还可以用于指示气温等信息,还可以用于指示地点。
应用程序图标403可以包含例如微信(Wechat)的图标、推特(Twitter)的图标、脸书(Facebook)的图标、新浪微博(Sina Weibo)的图标、腾讯QQ(Tencent QQ)的图标、YouTube的图标、图库(Gallery)的图标和相机(camera)的图标4031等,还可以包含其他应用的图标,本申请实施例对此不作限定。任一个应用的图标可用于响应用户的操作,例如触摸操作,使得电子设备100启动图标对应的应用。
状态栏404中可以包括运营商的名称(例如中国移动)、时间、WI-FI图标、信号强度和当前剩余电量。
导航栏405可以包括:返回按键4051、主界面(home screen)按键4052、呼出任务历史按键4053等系统导航键。其中,主屏幕界面40为电子设备100在任何一个用户界面检测到作用于主界面按键4052的用户操作后显示的界面。当检测到用户点击返回按键4051时,电子设备100可显示当前用户界面的上一个用户界面。当检测到用户点击主界面按键4052时,电子设备100可显示主屏幕界面40。当检测到用户点击呼出任务历史按键4053时,电子设备100可显示第一用户最近打开过的任务。各导航键的命名还可以为其他,比如,4051可以叫Back Button,4052可以叫Home button,4053可以叫Menu Button,本申请对此不做限制。导航栏405中的各导航键不限于虚拟按键,也可以实现为物理按键。
可以理解的是,图4中的(A)仅仅示例性示出了电子设备100上的用户界面,不应构成对本申请实施例的限定。
图5中的(A)至(B)示例性地给出了电子设备100响应于检测到的用户操作进入“双景录像模式”的过程。
示例性地,电子设备可以检测到作用于如图4中(A)所示的相机的图标4031的触控操作(如在图标4031上的点击操作),并响应于该操作启动相机应用程序,电子设备100可显示用户界面50。“相机”是智能手机、平板电脑等电子设备上的一款图像拍摄的应用程序,本申请对该应用程序的名称不做限制。也即是说,用户可以点击图标4031来打开“相机”的用户界面50。不限于此,用户还可以在其他应用程序中打开用户界面50,例如用户在社交类应用程序中点击拍摄控件来打开用户界面50,此类应用程序可支持用户向他人分享所拍摄的照片等。
图5中的(A)示例性示出了智能手机等电子设备上的“相机”应用程序的一个用户界面50。用户界面50还可以包含缩略图控件501、拍摄控件502、摄像头切换控件503、取景框505、调焦控件506A、设置控件506B和闪光灯开关506C。其中:
缩略图控件501,用于供用户查看已拍摄的图片和视频。
拍摄控件502,用于响应于用户的操作,使得电子设备100拍摄图片或者视频。在本实施例以及本申请的其他实施例中,使得电子设备100开始拍摄视频的时刻可以称为T1时刻。
摄像头切换控件503,用于将采集图像的摄像头在前置摄像头和后置摄像头之间切换。
取景框505,用于对所采集图片进行实时预览显示,其中分界线5051即为取景框505的下边界,电子设备100屏幕上边界即为取景框505的上边界。
调焦控件506A,用于对摄像头进行调焦。
在本申请实施例以及后续实施例中,摄像头调焦方式不限于通过触摸调焦控件实现,用户也可以通过作用于取景框的双指缩放操作实现。变焦倍数随着双指缩放手势的变化而变化。当该双指缩放手势为双指放大手势时,该手势的幅度越大,该对应摄像头的变焦倍数越大。当该双指缩放手势为双指缩小手势时,该手势的幅度越大,该对应摄像头的变焦倍数越小。
设置控件506B,用于设置采集图像时的各类参数。
闪光灯开关506C,用于开启/关闭闪光灯。
功能栏504,包括夜景拍摄控件504A、人像拍摄控件504B、拍照控件504C、短视频拍摄控件504D、录像控件504E以及更多拍摄选项控件504F。控件504A-504E中,任一个拍摄选项控件可用于响应用户的操作,例如触摸操作,使得电子设备100启动图标对应的拍摄模式。
更多拍摄选项控件504F,可以响应于该用户操作,即作用在更多拍摄选项控件504F上的触摸操作,电子设备100可显示用户界面60。相机应用界面60还可以包含多种用于选择拍摄模式的控件,例如慢动作模式控件,全景拍摄控件、黑白模式控件以及双景录像控件601等,还可以包含其他拍摄模式的控件,本申请实施例对此不作限定。
如图5中(B)所示,双景录像控件601,响应于该用户操作,即作用在双景录像控件601上的触摸操作,相机进入双景录像模式。在一些实施例中,双景录像控件601可以包含于界面50中,也可以包含于相机应用的其他用户界面中,本申请实施例不做限定。
在一些实施例中,电子设备100可以在启动“相机”后默认自动进入“双景录像模式”。在另一些实施例中,电子设备100启动“相机”后,若未进入“双景录像模式”,则可以响应于检测到的用户操作进入“双景录像模式”。不限于此,电子设备100还可以通过其他方式进入“双景录像模式”,例如电子设备100还可以根据用户的语音指令进入“双景录像模式”等,本申请实施例对此不作限制。
图5中的(C)至(D)示例性地给出了上下分屏模式下改变录制界面形式的场景。
进入双景录像模式后,电子设备100可以使用两个摄像头采集图像,并在显示屏中显示预览界面。如图5中的(C)所示的用户界面70,刚开始进入双景录像模式时,电子设备100将默认自动选择上下分屏的录像方式。在一些实施例中,进入双景录像模式时,默认的分屏方式也可以是其他方式,例如画中画模式,本申请实施例不做限定。用户界面70还可以包括:
上取景框701,用于对第一摄像头所采集图像进行实时预览显示。其中,其中分隔线706即为上取景框701的下边界,电子设备100屏幕上边界即为取景框505的上边界。取景框701中可包括:摄像头切换控件701B,用于将采集图像的摄像头在前置摄像头和后置摄像头之间切换。示例性地,用户可以点击摄像头切换控件701B,将取景框701对应的摄像头由前置摄像头193-1更改为后置的摄像头193-3。
在一些实施中,前置摄像头对应的取景框中也可以不包含调焦控件701A。也就是说,在本实施例以及后续实施例中,电子设备100进行前置取景时,前置摄像画面可以不支持调焦,前置摄像时焦距固定为广角、长焦或者其他焦距;前置摄像画面也可以像后置摄像一样支持调焦,并且在界面中包含用于调焦的调焦控件。
下取景框702,用于对第二摄像头所采集图像进行实时预览显示。其中,其中分隔线706即为下取景框701的上边界,电子设备100屏幕下边界即为取景框505的下边界。取景框701中可包括:调焦控件702A,用于对第二摄像头进行调焦。摄像头切换控件702B,用于将采集图像的摄像头在前置摄像头和后置摄像头之间切换。
缩略图控件703,用于供用户查看已拍摄的图片和视频。
拍摄控件704,用于响应于用户的操作,使得电子设备100拍摄视频。在本实施例以 及本申请的其他实施例中,使得电子设备100开始拍摄视频的时刻可以称为第一时刻T1。在本实施例以及本申请的其他实施例中,电子设备100在双景录像模式下拍摄视频的某一时刻可以称为第二时刻T2。其中,第一时刻T1和第二时刻T2之间的时长可以称为第一时长t1。当第一时长t1为0时,T1时刻即为T2时刻。
滤镜控件705,用于设置拍摄图像时的滤镜。
闪光灯开关706,用于开启/关闭闪光灯。
分隔线706,用于分割上取景框701和下取景框702。
应理解,如图5中(C)所示,电子设备的屏幕被上下分为上取景框701和下取景框702,上取景框701对应显示来自前置的摄像头193-1的图像,下取景框702对应显示来自后置摄像头193-3的图像。上取景框701中的图像为面对电子设备100的显示屏的拍摄者自己的图像,下取景框702中的图像为拍摄者所面对的拍摄对象(如人物、风景等)的图像。
当电子设备100采用如图5中(C)所示的用户界面录制音频后,在该音频播放时,由于上取景框701和下取景框702的面积一样大,用户能感觉到,上取景框701中被拍摄对象的周围环境的声音(也就是电子设备前置摄像头那侧,以下称为声音1)和下取景框702被拍摄对象的周围环境的声音(也就是电子设备后置摄像头那侧,以下称为声音2)的响度一样大。可选的,还可以使,上取景框701中被拍摄对象的周围环境的声音(也就是声音1)在电子设备的上部发出,下取景框702被拍摄对象的周围环境的声音(也就是声音2)在电子设备的下部发出。但是实际上电子设备的扬声器(例如顶部扬声器或/和底部扬声器或/和背部扬声器)均会播放出上取景框701中被拍摄对象的周围环境的声音,也均会播放出下取景框702被拍摄对象的周围环境的声音,也就是电子设备的扬声器均会播放声音1和声音2。且这样,增强立体声的感觉,让用户感觉上面的取景框的声音在上面发出,下面取景框的声音在下面发出,增强体验感和趣味性。
在一些实施例中,电子设备100可响应于用户对分隔线706的触摸操作,例如滑动操作,对上取景框701和下取景框702的面积进行调节。示例性的,如图5中(C)所示,响应于该用户操作,即作用在分隔线709上的上滑动操作,上取景框701的面积减小,下取景框702的面积变大,最终呈现效果可参考如图5中(D)所示的用户界面71。当电子设备100采用如图5中(D)所示的用户界面录制音频后,且由于上取景框701叫和下取景框702的面积更小,则在该音频播放时,与图5中(C)中的声音不同在于,声音1的响度小声音2的响度;可选的,还可以给用户的感觉仍然是声音1在电子设备的上部发出,声音2在电子设备的下部发出,具体描述参见图5中(C)的描述,不再赘述。
当然,用户还可以通过作用在分隔线709上的下滑动操作,使上取景框701的面积变大,下取景框702的面积变小,在这种界面下所录制的音频在播放时,相比于在对分隔线709进行滑动之前的界面下录制的音频,声音1的响度变大,声音2的响度则变小。
画面调换控件708,用于调换上取景框701和下取景框702所采集的图像。
示例性的,如图5中(D)所示,响应于该用户操作,即作用在画面调换控件718上的触摸操作,电子设备100将上取景框711中的画面和下取景框712中的画面进行交换,最终呈现效果可参考图6中(A)所示的用户界面72。图6中(A)的声音效果参见图5 中(D)的描述,在此不再赘述。不限于图5中(C)至(D)所示的通过控件708将两个画面进行调换,用户还可以通过其他操作对两个取景框的图像进行调换。例如用户可以通过作用在上取景框711中的下滑操作,或者作用在下取景框712中的上滑操作来实现两个取景框的图像的调换。
分屏选项控件709,用于响应于用户的操作,使电子设备100对双景录像模式的分屏方式进行切换。
应理解,上下分屏模式下,用户还能通过调焦、切换前/后置镜头以及改变分屏方式来对录制界面进行改变,具体可参考后续实施例的描述。
接下来结合图6中(A)至(F)对电子设备100进入“画中画”模式以及在“画中画”模式下调整录制界面样式的方式进行说明。需要注意的是,图6中(A)至(F)示出的实施例中,电子设备100的姿态未发生改变。也就是说,用户在调整电子设备的录制界面样式时,并未移动该电子设备100,图6中(A)至(F)中录制界面的变换与电子设备的姿态变化无关。
图6中(A)至(C)示例性的展示了用户切换双景录像模式下分屏方式的场景。
如图6中(A)所示,响应于该用户操作,即作用在分屏选项控件709上的触摸操作,电子设备100显示分屏选项框710。如图6中(B)所示,分屏选项框710可以包含多种分屏选项控件,例如上下分屏控件701A(规则分屏),上下分屏控件701B(不规则分屏)、画中画控件710C(小取景框形状为方形)和画中画控件710D(小取景框形状为圆形)。在一些实施例中,分屏选项框710还可以包含其他分屏选项的控件,本申请实施例对此不作限定。任一个分屏选项控件可用于响应于用户的操作,例如触摸操作,使得电子设备100启动该控件对应的分屏录像模式。
分屏方式的切换不限于图6中(B)所示的通过分屏选项框进行切换,用户还可以通过其他方式进行分屏方式的切换,本申请实施例不做限定。例如,电子设备100可以在响应用户对图6中(A)的分屏选项控件729上的触摸操作后,直接切换到与当前分屏方式不同的另一种分屏方式;当用户再次触摸图6中(A)的分屏选项控件729后,电子设备100再次切换到另一种分屏方式。
如图6中(B)所示,响应于该用户操作,即作用在画中画控件730C上的触摸操作,电子设备100启动画中画分屏方式的双景录像模式,并显示如图6中(C)所示的用户界面80。如图6中的(C)所示,用户界面80还可以包含摄像头切换控件801、取景框802以及例如拍摄控件、滤镜控件等其他控件(具体参考前述对图5中(C)的说明,这里不再赘述),其中:
主取景框801(也称作主画面区域),用于对第一摄像头所采集图像进行实时预览显示。其中,主取景框801中可包括:调焦控件801A,用于对第一摄像头进行调焦。用于将采集图像的摄像头在前置摄像头和后置摄像头之间切换,摄像头切换控件801B,用于将采集图像的摄像头在前置摄像头和后置摄像头之间切换。示例性地,用户可以点击摄像头切换控件801B,将取景框801对应的摄像头由前置摄像头193-1更改为后置的摄像头193-3。子取景框802(也称作子画面区域),用于对第二摄像头所采集图像进行实时预览 显示。其中,子取景框802中可包括:调焦控件802A,用于对第二摄像头进行调焦。摄像头切换控件802B,用于将采集图像的摄像头在前置摄像头和后置摄像头之间切换。示例性地,用户可以点击摄像头切换控件802B,将取景框802对应的摄像头由后置摄像头193-3更改为前置的摄像头193-1。为了便于说明,如图6中(C)所示,调焦控件802A以及摄像头切换控件802B呈现于主取景框801中。在一些实施例中,调焦控件802A以及摄像头切换控件802B呈现子取景框801中,具体可参考图6中(G)。在一些实施中,子取景框中也可以不包含调焦控件,具体可参考图6中(H)。
当电子设备100采用如图6中(C)所示的用户界面录制音频后,在该音频播放时,由于子取景框802在主取景框801的左上角,且面积较小,用户能感觉到,子取景框802被拍摄对象的周围环境的声音(也就是电子设备前置摄像头那侧,以下称为声音1)的响度相对于主取景框801被拍摄对象的周围环境的声音(也就是电子设备后置摄像头那侧,以下称为声音2)的响度较小。可选的,还可以使,主取景框801被拍摄对象的周围环境的声音(也就是电子设备后置摄像头那侧,以下称为声音2)在用户周围发出没有方向性,但子取景框802被拍摄对象的周围环境的声音(也就是电子设备前置摄像头那侧,以下称为声音1)在电子设备的左上角发出(用户以图6中(C)所示的方向握持手机)。但是实际上电子设备的扬声器(例如顶部扬声器或/和底部扬声器或/和背部扬声器)均会播放出子取景框802中被拍摄对象的周围环境的声音,也均会播放出主取景框801被拍摄对象的周围环境的声音,也就是电子设备的扬声器均会播放声音1和声音2。且这样,增强立体声的感觉,让用户感觉声音也呈现出画中画的效果,增强体验感和趣味性。在一些实施例中,用户还能感觉到上述声音1是从自己左侧传来的。
应理解,主取景框801和子取景框802的画面同样可通过作用在画面调换控件上的触摸操作进行交换,在交换后所得的界面下录制的音频在播放时也会相应改变(可参考前述对图5中(D)和图6中(A)的说明,这里不再赘述)。
应理解,如图6中(C)所示,电子设备的屏幕被分为主取景框801和子取景框802。主取景框801对应显示来自后置摄像头193-3的图像,子取景框802对应显示来自前置的摄像头193-1的图像。主取景框801中的图像为拍摄者所面对的拍摄对象(如人物、风景等)的图像,子取景框802中的图像为面对电子设备100的显示屏的拍摄者自己的图像。
在刚开始进入画中画模式的双景录像模式时,取景框802默认的面积和方位不限于图6中(C)所示的样式。在一些实施例中,取景框802的面积可以比图6中(C)所示出的更大或更小,其相对于取景框801的方位也可以为电子设备100的屏幕的右上角或者其他方位,本申请实施例对此不作限定。
图6中(C)至(F)示例性的展示了在“画中画”模式下调整录制界面样式的场景。
在一些实施例中,电子设备100可以响应于检测到的用户操作,对取景框802的面积和方位进行调节。
示例性的,参考图6中(C),电子设备100可以响应于作用在取景框802上的滑动操作,对取景框802在屏幕中的方位进行变化。取景框802变化后的方位可参考图6中(D)所示,取景框802的方位由电子设备100的屏幕的左上方变化为右下方。这样,声音1和 声音2的响度不变,但是感觉到声音1从电子设备的左上移动到了右下方向。
示例性的,参考图6中(D),响应于作用在取景框812上的双指放大操作,电子设备100将取景框802的面积放大,放大后的取景框可参考图6中(E)所示的取景框822。同理,电子设备100还可以响应于作用在取景框812上的双指缩小操作,将取景框802的面积缩小。
相比在图6中(D)所示的界面下录制的音频而言,当电子设备100采用如图6中(E)所示的用户界面录制音频在播放时,则声音1的响度将变大。此外,电子设备还可以响应于检测到的用户操作,对摄像头的焦距进行调整。如图6中(E)所示,响应于作用在调焦控件822A上的滑动操作,电子设备将增大该取景框822对应的摄像头的焦距,由2×调整至3×。调整后的用户界面可参考图6中(F)所示,取景框822中的图像所呈现的视野范围由图6中(E)(取景框822)变化为图6中(F)(取景框832),但取景框822中的图像相比图6中(E)所示的取景框832中的图像要大一些。
基于前述内容介绍的电子设备100以及前述用户界面相关的实施例,接下来介绍电子设备100在执行本申请实施例提供的音频的处理方法时,在不同的用户界面(即双景录像时的录制界面)下,对音频信号进行过滤的一些场景。
首先,先介绍本申请实施例中电子设备100提供的一些焦距倍数与视场角大小之间的对应关系,请参考下列表1。
表1
焦距 | 1× | 2× | 3× | 4× | 5× | 6× |
视场角 | 180° | 120° | 90° | 60° | 45° | 30° |
应理解,表1中的视场角的数值仅表示视场角的范围,并未体现视场角的方向性。具体的,结合表1和前述规定,当前置摄像头的焦距为1×时,其视场角为180°,当后置摄像头的焦距为1×时,其视场角为-180°,以此类推。
其次,表1仅示例性地示出了电子设备100中相机所能提供的焦距倍数,不限于表1中包含的焦距倍数,电子设备100中的相机还可以为用户提供其他更多的焦距倍数的选项,例如7×、8×等,本申请实施例不做限定。
此外,电子设备100中每个焦距倍数与视场角大小之间的对应关系可以不限于表1中示出的对应关系。例如在一些实施例中,1×的焦距倍数对应的视场角可以为170°,2×的焦距倍数对应的视场角可以为160°,本申请实施例对此不做限定。但是,需注意,无论焦距倍数和视场角大小之间的对应关系如何,这种对应关系是在电子设备100在生产出厂时就被固定下来的。也就是说,在电子设备100进行拍摄时,电子设备100可以根据拍摄使用的摄像头的前/后置信息和焦距倍数获取到此时的视场角的大小和范围。
图7A示例性示出了电子设备100在上下分屏且同时采用前、后置摄像头进行双景录像的录制界面。
参考图7A,图7A示出的录制界面90可以包括多个控件,例如拍摄控件904、滤镜控件905及一些其他的控件。电子设备100可以响应于用户对这些控件的触摸操作,对录 制画面进行切换,具体可以参照前述对图5的相关描述,这里不再赘述。
在图7A示出的录制界面90中,电子设备100的前置摄像头只提供1×的固定焦距倍数,即前置摄像头的视场角固定为+180°;电子设备的100的后置摄像头可提供1×至6×六个不同焦距倍数,用户可以通过触摸调焦控件902A切换后置摄像头的焦距倍数,以此来调整取景框902中的画面的视场角范围。用户还可以通过摄像头切换控件901B,用于将上取景框901采集图像的摄像头在前置摄像头和后置摄像头之间切换;同理,用户还可以通过摄像头切换控件902B,用于将下取景框902采集图像的摄像头在前置摄像头和后置摄像头之间切换。示例性地,用户可以点击摄像头切换控件901B,将取景框901对应的摄像头由前置摄像头193-1更改为后置的摄像头193-3。在一些实施例中,前置摄像头的焦距也支持切换,并且在录制界面中也包含用于对其进行调焦的调焦控件。具体可参考图7B所示出的录制界面91中的控件911A,即可用于对前置摄像头的焦距进行切换。
如图7A所示,电子设备100正采用焦距倍数为固定1×的前置摄像头和焦距倍数为3×的后摄像头在上下分屏模式下进行双景录像。此时,上取景框901呈现的画面(以下称为画面1)为前置摄像头(例如193-1)所拍摄的画面,其为用户自己的人脸;下取景框902呈现的画面(以下称为画面2)为后置摄像头(例如193-3)所拍摄的画面,其为电子设备100前方的风景图像。则由上述表1可知,此时前置摄像头的视场角为+180°,后置摄像头的视场角为-90°。
图7B示出了电子设备100在画中画分屏方式下且同时采用前、后置摄像头进行双景录像的录制界面。
参考图7B,图7B示出的录制界面91可以包括多个控件,电子设备100可以响应于用户对这些控件的触摸操作,对录制画面进行切换,具体可以参照前述对图5的相关描述,这里不再赘述。
在图7B所示出的录制界面91中,电子设备100正采用焦距倍数为6×的前置摄像头和焦距倍数为2×的后摄像头在画中画模式下进行双景录像。此时,取景框911呈现的画面(以下称为主画面)为焦距倍数为2×的后置摄像头(例如193-4)所拍摄的画面,其为电子设备100前方的风景;取景框912呈现的画面(以下称为子画面)为焦距倍数为6×的前置摄像头(例如193-2)所拍摄的画面,其为用户自己的人脸。则由上述表1可知,此时主画面对应的视场角为-90°,子画面对应的视场角为+30°。
本申请实施例中,电子设备100对环境中的音频信号进行采集时,仍然会对空间中全方位(及空间中360°)的音频信号进行采集。但为了使所录制的音频能配合两个画面呈现给用户的视角范围,在采集到全方位角度传递来的音频信号后,电子设备100可以结合两个画面各自的视场角,对收取到的音频信号进行同角度的过滤,得到分别在两个视场角的方向上被增强的音频信号。
参考图8A,图8A为本申请实施例提供的电子设备结合画面的视场角对音频信号进行过滤的一种场景示意图。
在图8A中,所呈现的为电子设备100被直立放置时的俯视图,电子设备可以视作点P。OPO’为电子设备所在的平面,QQ’为该平面的法线。其中OPO’左侧表示电子设备前 置摄像头所在的一侧,OPO’右侧表示电子设备100后置摄像头所在的一侧。不难看出,OPO’左侧的阴影部分表示电子设备100前置摄像头在焦距为1×下所能获取到的视场角范围(0°至+180°),OPO’右侧的空白部分表示电子设备100后置摄像头焦距为1×下所能获取到的视场角范围(-180°至0°)。∠OPO’(右侧)、∠APA’、∠BPB’和∠CPC’分别为电子设备100的后置摄像头在1×、2×、3×和6×时对应的视场角,∠OPO’(右侧)、∠DPD’、∠EPE’、∠FPF’分别为电子设备100的前置摄像头在1×、2×、3×和6×时对应的视场角,其每个角的数值可以参考前述表1得出,这里不再赘述。
结合前述对图7A说明可知,在录制界面为90的情况下,电子设备100的前置摄像头(即画面1对应的摄像头)的视场角的大小和方向与图8A中∠OPO’(左侧)一致,大小为+180°,视场角的边界为PO和PO’所在的射线。电子设备100的后置摄像头(即画面2对应的摄像头)的视场角的大小和方向与图8A中∠BPB’一致,大小为-90°,视场角的边界为PB和PB’所在的射线。因此,当用户采用如图7A所示的录制界面进行双景录像时,电子设备100将根据上述画面1的视场角∠OPO’(左侧),对∠OPO’(左侧)之外的角度方向上采集到的音频信号进行抑制,得到与∠OPO’(左侧)同角度方向的音频信号1;并根据上述画面2的视场角∠BPB’,对∠BPB’之外的角度方向上采集到的音频信号进行抑制,得到与∠BPB’同角度方向的音频信号2。举例说明,上述画面1中包含有声源A,画面1之外的场景中有声源A’,上述画面2中包含有声源B,画面1之外的场景中存在声源B’;在进行上述滤波过程后,通过画面1的视场角所得的声音信号在听觉会让人感觉到主要是声源A在发声,且能让用户感知到该声音‘有与画面1在屏幕上呈现的方位相对应的方位性的效果;通过画面2的视场角所得的声音信号在听觉会让人感觉到主要是声源B在发声,且能让用户感知到该声音有与画面2在屏幕上呈现的方位相对应的方位性的效果。
应理解,当用户对录制界面进行更改时,录制界面中的两个视场角的角度方向可能也会发生相应的改变,此时,电子设备100对音频信号进行过滤时所选取的角度方向也会相应的改变。
结合对图7B说明可知,在录制界面为91的情况下,电子设备100的前置摄像头(即子画面对应的摄像头)的视场角的大小和方向与图8A中∠FPF’(左侧)一致,大小为+30°,视场角的边界为PF和PF’所在的射线。电子设备100的后置摄像头(即主画面对应的摄像头)的视场角的大小和方向与图8A中∠APA’一致,大小为-120°,视场角的边界为PA和PA’所在的射线。因此,当用户采用如图8A所示的录制界面进行双景录像时,电子设备100将根据上述子画面的视场角∠FPF’,对∠FPF’之外的角度方向上采集到的音频信号进行抑制,得到与∠FPF’同角度方向的音频信号3;并根据上述主画面的视场角∠APA’,对∠APA’之外的角度方向上采集到的音频信号进行抑制,得到与∠APA’同角度方向的音频信号4(图8A中未画出)。
可理解的,根据双屏录像中两个取景摄像头在前/后置以及焦距倍数等不同情况进行组合,双景录像时的录制界面可以有多种,例如:
图7C所示出的录制界面92中,电子设备100正采用一个3×焦距的前置摄像头和一个6×焦距的前置的摄像头进行双景录像时,则该3×焦距的前置摄像头在图8A中对应的视场角为∠EPE’,该6×焦距的前置摄像头在图8A中对应的视场角为∠FPF’。
当电子设备100采用一个1×的后置摄像头和一个2×焦距的后置的摄像头进行双景录像时,则该1×焦距的后置摄像头在图8A中对应的视场角为∠OPO’(右侧),该2×焦距的后置摄像头在图8A中对应的视场角为∠BPB’。
当然,电子设备100还可以采用其他不同的摄像头和焦距组合进行双景录像。当电子设备100采用不同的摄像头和焦距组合录像时,其对音频信号进行过滤时所选取的角度方向会相应的改变,过滤所得到的两个音频信号也会不同,这里不再一一列举。
在一些实施例中,为能使音频具备更好的双声道立体感,在对音频信号进行过滤时,电子设备100可以选择特定的滤波方法(例如CVX波束训练方法),基于每个画面的视场角向左右两个角度方向进行滤波,接下来结合图8B进行说明。
图8B为本申请实施例提供的电子设备结合画面的视场角,通过CVX波束训练方法对音频信号进行过滤的另一种场景示意图。如图8B所示,电子设备100所呈现的为电子设备100被直立放置时的俯视图,电子设备可以视作点P。图8B中的平面OPO’、QQ’、∠OPO’(右侧)、∠APA’、∠BPB’和∠CPC’∠OPO’(右侧)、∠DPD’、∠EPE’、∠FPF’的具体含义可以参考前述对图7B中的说明,这里不再赘述。
在图8B中,电子设备100正图采用如图7A所示的录制界面90进行双景录像。结合前述对图7A的说明可知,在录制界面为90的情况下,画面1的视场角为图8B中的∠OPO’(左侧),边界为PO和PO’所在的射线。画面2的视场角为∠BPB’,视场角的边界为PB和PB’所在的射线。由图8B可知,电子设备100所在平面的法线QQ’为∠OPO’(左侧)以及∠BPB’的角平分线,则法线QQ’将∠OPO’(左侧)分为∠OPQ’以及∠O’PQ’,将∠BPB’分为∠BPQ以及∠B’PQ。这里规定,对于电子设备100而言,法线QQ’上方为电子设备100的左侧,法线QQ’下方为电子设备100的右侧。
在双景录像的过程中,电子设备100可以将根据上述画面1的视场角∠OPO’(左侧),分别对∠OPQ’以及∠O’PQ’之外的角度方向上采集到的音频信号进行抑制,得到与∠OPQ’同角度方向的左声道音频信号11,以及与∠O’PQ’同角度方向的右声道音频信号21;并根据画面2的视场角∠BPB’,分别对∠BPQ以及∠B’PQ之外的角度方向上采集到的音频信号进行抑制,得到与∠BPQ同角度方向的左声道音频信号21,以及与∠B’PQ同角度方向的右声道音频信号22。这样,在对上述得到的四个音频信号进行混音后输出时,输出的音频能为用户营造更为立体的听感。
当然,图8B所示出的对音频信号进行过滤的方法也适用于其他录制界面。对于既定的录制界面,只需根据每个录制界面中两个画面的视场角进行左右的划分,得到四个区分了左右侧的视场角,再根据四个视场角的角度方向分别进行音频过滤,即可得到四个具备左右声道差异的音频信号。这里不再对每个录制场界面下电子设备100对音频的过滤场景进行一一描述。
此外,图8A和图8B中所示出的音频信号的形状仅用于对音频信号进行区分,并不代表音频信号在空间中传播时实际的波形。
图9为本申请实施例提供的一种音频的处理方法的流程图。该方法结合双景录像中两 个画面的视场角,将视场角作为DSB算法的输入,基于DSB算法分别对音频进行过滤,再进行音频重混,得到的音频可以和画面具有同步的立体感。如图9所示,本申请实施例提供的方法可以包括:
S101、电子设备开启双景录像模式。
示例性地,电子设备可以检测到作用于如图4中(A)所示的相机的图标4031的触控操作(如在图标4031上的点击操作),并响应于该操作启动相机应用程序。
之后,电子设备检测到选择“双景录像模式”的用户操作后,开启双景录像模式。示例性地,该用户操作可以是图5中(B)所示的在双景录像模式选项601上的触控操作(例如点击操作)。该用户操作也可以是语音指令等其他类型的用户操作。
不限于用户选择,电子设备100可以在启动相机应用程序后默认选定“双景录像模式”。
S102、电子设备根据用户对录制界面的调整显示相应的录制界面。
在开始录像之前,电子设备可以检测到用户对双景录像模式下的界面样式的设置,参考上述图5、图6、图7A、图7B以及图7C中所示的相关用户界面,对录制界面的调整包括但不限于以下情况:
①切换双景录像模式的分屏方式
示例性地,电子设备可以检测到作用于如图6中(A)所示分屏选项控件709上的触摸操作(如在控件709上的点击操作),并响应于该操作对分屏方式进行切换。
②对取景框的面积进行缩放
示例性地,电子设备可以检测到作用于如图6中(D)所示取景框802上的双指放大操作,并响应于该操作对取景框的802进行放大;或者检测到作用于如图5中(C)所示的分隔线706上的滑动操作,并响应于该操作对取景框701进行缩小,同时对取景框702进行放大。
③摄像头的焦距的调整
示例性地,电子设备可以检测到作用于如图5中(D)(中调焦控件802A上的双指滑动操作,并响应于该操作增大该取景框802对应的摄像头的焦距,由2×调整至3×。
上述焦距的调整可以是对后置摄像头的焦距的调整,也可以是对前置摄像头的调整。当双景了录像时采用的两个摄像头分别为前置摄像头和后置摄像头时,该焦距的调整还可以是同时对前置摄像头和后置摄像的焦距进行调整。
④前/后置摄像头的切换
示例性地,电子设备可以检测到作用于如图7A中控件901B上的点击操作,并响应于该操作对图7A中取景框901对应的摄像头由前置摄像头切换为后置摄像头,切换后的用户界面可以参考图7C示出的录制界面92。
⑤两个取景框取景画面的对调
示例性地,电子设备可以检测到作用于如图5中(D)所示的画面调换控件708上的点击操作,并响应于该操作将取景框的701和取景框702的画面内容进行对调。
应理解,两个取景框取景画面的切换实际上是将两个取景框对应的摄像头进行对调。因此,在对调两个取景框对应的摄像头的前/后置和焦距信息也随之全部切换。
S103、电子设备采集音频。
电子设备检测到指示开始录制视频的用户操作,例如在图5中(C)所示的控件704上的点击操作,电子设备中的音频采集装置(例如麦克风)采集环境中的音频信号。
具体的,以麦克风为例,电子设备可具备M个麦克风,M>1,M为正整数。在双景录像模式下,这M个麦克风可以同时采集环境中的音频信号,得到M路的音频信号。
需要进行说明的,采集音频信号,也就是采集到的声音作为电子设备的输入音源,该音源的采集可以是根据麦克风的性能进行确定的,可选的,可以是全方位的360°空间环境的声音,也可以是其他的,例如定向的空间的声音,本申请不进行限定。在本实施例以及其它实施例中,上述M路音频信号也可以称为实时环境的声音。
需注意,在电子设备出厂时,电子设备上的麦克风在电子设备中的位置已经固定,在电子设备后续的使用过程中,麦克风在电子设备中的位置并不会发生变化。也就是说,当M>1时,电子设备中每个麦克风之间的相对位置是固定的,因此,每个麦克风与另往外(M-1)个麦克风的间距也是固定的。
S104、电子设备录制图像。
电子设备检测到指示开始录制视频的用户操作,例如在图5中(C)所示的控件704上的点击操作,电子设备开始同时采用两个摄像头进行图像的采集和拍摄。
具体的,电子设备可具有N个摄像头,N≥2,N为正整数。这N个摄像头可以为前置摄像头和后置摄像头的组合。这N个摄像头也可以为广角摄像头、超广角摄像头、长焦摄像头或前置摄像头中任意焦距的摄像头的组合。本申请对这N个摄像头的摄像头组合方式不作限制。在录制时,电子设备将根据用户在S102中对摄像头的选择(例如前/后置摄像头的选择和摄像头焦距的选择),在屏幕中采用两个取景框分别呈现两个摄像头所采集到的两路图像。
电子设备的显示屏可以通过拼接(可参考前述说明中的上下分屏)或画中画等方式对来自2个摄像头的两路图像进行显示,使来自该两个摄像头的两路图像可以同时呈现给用户。
接下来结合图7A所示出的录制界面90对本申请实施例中步骤S105-S107进行说明。
S105、电子设备获取画面的视场角。
参考图7A所示的录制界面90。电子设备正采用焦距倍数为固定1×的前置摄像头和焦距倍数为3×的后摄像头在上下分屏模式下进行双景录像。此时,上取景框901呈现的画面(以下称为第一画面)为前置摄像头所拍摄的画面,其为用户自己的人脸;下取景框902呈现的画面(以下称为第二画面)为后置摄像头所拍摄的画面,其为电子设备100前方的风景图像。
则由上述表1可知,该第一画面视场角大小为180°,该第二画面的视场角大小为90°。
S106、电子设备计算画面权重。
参考前述对“画面权重”的概念的相关说明,接下来结合图7A所示的录制界面对上下分屏界面下两个显示区域的画面权重的计算方式进行详细说明。在图7A所示的录制界面90中,电子设备显示屏的长为d0,宽为(d1+d2),且所述第一画面的显示区域的宽为d1,第二画面的显示区域的宽为d2。则电子设备可计算得到两个画面的画面权重w1以及w2分别为:
w1=d1/(d1+d2)
w2=d2/(d1+d2)
其中,w1为第一画面的画面权重,w2为第二画面的画面权重。
S107、电子设备基于DSB算法对音频进行过滤。
为了能够使双景录像的画面和音频具有同步的立体感,电子设备将利用在S105中所得的画面的视场角信息,对S103中音频采集装置采集的音频进行过滤,得到两个画面对应的波束。该过程可采用盲源分离算法或波束形成算法等算法实现,本申请实施例不做限定。
接下来以波束形成算法为例对音频过滤的过程进行进一步说明。
假设电子设备中设有M个麦克风,其中M大于1。在双景录像过程中,该M个麦克风将采集到M路音频信号。电子设备可以基于FFT算法,将该M路音频信号由时域信号转化为频域信号后,再根据双景录像中两个摄像头的前/后置信息以及焦距(即视场角信息),对该M路音频信号进行滤波,滤波所采用的计算公式为:
y(ω)=∑_{i=1}^{M} w_i(ω)·x_i(ω)
其中i=1,2,3……M,x_i(ω)表示电子设备中第i(i≤M)个麦克风采集到的音频信号,w_i(ω)可以通过DSB算法、CVX波束训练方法等方法得出,其表示第i个麦克风在音频信号的频率为ω时波束形成器的权值向量。应理解,无论何种算法实现对音频的过滤,w_i(ω)都是该算法中与滤波方向强相关的一个必要的参数。
本实施例基于DSB算法得到w_i(ω)。应理解,在通过DSB算法得到w_i(ω)时,DSB算法的输入包括该第i个麦克风与其他(M-1)个麦克风的间距、摄像头的前/后置信息以及焦距。因此,利用w_i(ω)进行滤波时,可以在一定程度上使第i个麦克风采集的音频信号在一个特定的方向得到增强,该特定的方向大致为该摄像头的前/后置信息以及焦距所对应的视场角的范围和方向,而视场角的范围和方向决定了取景框所呈现的画面内容。这样,可以将画面的方位感与听觉的方位感进行同步。
同样以图7A示出的录制界面90为例进行说明。在步骤S105中,已经计算出了两个画面的视场角信息,其中第一画面视场角为+180°,该第二画面的视场角为-90°。基于这两个视场角,电子设备可以通过上述DSB算法计算得到第一画面和第二画面进行滤波时所需的权值向量w1_i(ω)和w2_i(ω),并根据这两个权值向量对采集的音频进行过滤,得到第一画面和第二画面对应的波束y1(ω)、y2(ω),其可以表示为:
y1(ω)=∑_{i=1}^{M} w1_i(ω)·x_i(ω)
y2(ω)=∑_{i=1}^{M} w2_i(ω)·x_i(ω)
结合前述对图7A和图8A的相关描述可知,波束y1(ω)即为图8A中的音频信号1,波束y2(ω)即为图8A中的音频信号2。需注意,这里得出的波束y1(ω)和波束y2(ω)均没有进行左声道和右声道的区分。
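To make the DSB step concrete, the sketch below forms delay-and-sum weights for a uniform linear microphone array and applies them per frequency bin. The array geometry and the steering toward the centre of each picture's field of view are illustrative assumptions, not the device's exact algorithm:

import numpy as np

C = 343.0  # speed of sound in air, m/s

def dsb_weights(mic_positions, theta_deg, omega):
    # mic_positions: 1-D array of mic coordinates along the array axis (m).
    # theta_deg: steering direction, e.g. the centre of a picture's FOV.
    # omega: angular frequency (rad/s). Returns w_i(omega) for all mics.
    delays = mic_positions * np.sin(np.deg2rad(theta_deg)) / C
    return np.exp(-1j * omega * delays) / len(mic_positions)

def beamform(x_omega, weights):
    # y(omega) = sum_i conj(w_i(omega)) * x_i(omega) for one frequency bin.
    return np.sum(np.conj(weights) * x_omega)

# Two beams for the two pictures, e.g. steered to +90 deg and -45 deg:
# y1 = beamform(x_omega, dsb_weights(mics, +90.0, omega))
# y2 = beamform(x_omega, dsb_weights(mics, -45.0, omega))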
S108、电子设备对声源进行重混。
在得到两个画面相应的波束和画面权重后,电子设备将结合两个画面的画面权重对两个波束进行混音。
同样以图7A示出的录制界面90为例进行说明,为了让音频能通过双声道输出,电子设备可以将上述波束y1(ω)和波束y2(ω)进行双声道虚拟:
y1l(ω)=y1(ω)
y1r(ω)=y1(ω)
y2l(ω)=y2(ω)
y2r(ω)=y2(ω)
在重混后,最终电子设备在录制界面90下输出的音频的具体方式可表示为:
outl(ω)=y2l(ω)×w2+y1l(ω)×w1
outr(ω)=y2r(ω)×w2+y1r(ω)×w1
不难看出,在本实施例中,outl(ω),outr(ω)虽然可以在公式上区别为左声道的音频和右声道的音频,但这两个音频数据其实是一样的,在播放时实际上听感可能也是一样的。
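The remix itself is just a weighted sum per channel; a one-function sketch (the function name is an assumption of this sketch):

def remix_stereo(y1l, y1r, y2l, y2r, w1, w2):
    # Weighted remix of the two pictures' left/right beams into the
    # final left and right outputs, as in outl/outr above.
    return w2 * y2l + w1 * y1l, w2 * y2r + w1 * y1r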
S109、电子设备判断界面是否发生变化。
在双景录像过程中,电子设备实时检测录制界面是否发生变化。且电子设备可以响应于检测到的用户操作,对录制界面进行变化。应理解,当录制界面发生变化时,电子设备在双景录像模式下两个画面的画面权重、前/后置信息以及焦距信息都可能发生变化。
因此,当电子设备的录制界面变化时,若此时电子设备并未终止或者结束双景录像模式,则电子设备将重新执行步骤S103至S104,及时根据变化后的录制界面更新步骤S105-S108中的涉及到的一些参数,根据更新后的录制界面中的两个画面的视角和面积进行音频的过滤和重混。
上述录制界面的变化的方式可以参考上述图5、图6中的所示的相关用户界面所示出的变化方式,也可以参阅步骤S102中对用户界面样式的进行设置的①至⑥这六种场景,这里不再赘述。
S110、电子设备保存处理后的音频。
当用户录制完毕后,电子设备可以响应于用户的操作,停止或关闭双景录像模式。示例性地,电子设备检测到指示停止录制视频的用户操作,例如在图5中(C)所示的控件704上的又一次的点击操作,电子设备停止对音频的采集和处理。该用户操作也可以是语音指令等其他类型的用户操作。
之后,电子设备可以基于IFFT算法,将步骤S108中得到的音频信号outl(ω),outr(ω)转化为时域信号outl(t),outr(t),连同录制的视频一起保存在本地存储器中。
这里假设电子设备支持双声道输出,则在播放上述音频信号outl(t),outr(t)时,outl(t),outr(t)可以分别通过电子设备上的两个扬声器输出。但由于这两个音频信号其实本质上并无差别,用户左右耳收听到的音频可能差别很微小。此外,当两个显示区域的面积大小相差无几或者相等时(如图5中(C)所示),用户在收听该音频时会感觉两个画面中的声音响度大致相等。
在一些实施例中,电子设备在执行完步骤S103和步骤S104之后,可以先将录制所得的视频文件和初始音频保存在存储器中。之后即使录制界面变化,电子设备也可以先保存该界面下录制的初始音频。应理解,这时候得到的音频还没有经过步骤S105-步骤S108的处理。在整个录制过程结束后,电子设备再将结合上述视频文件,获取录制界面发生变化的时刻以及该录制界面的视场角信息,来对上述初始音频进行步骤S105-步骤S108所示的处理,得到最后用于输出的目标音频;可选的,将目标音频和上述视频文件进行合成得到录制文件保存下来,以供用户后续播放。可选的,电子设备在保存该目标音频之后,可以将上述初始音频删除,节约设备的存储空间,也可以均保存下来,以供用户后续使用。
应理解,不限于图7A示出的录制界面90,本实施例提供的音频处理的方法也同样适用于对图5、图6以及图7B和图7C中示出的录制界面以及其他录制界面下的所录制的音频进行处理。例如图7B中,录制界面91为画中画模式的双景录像模式,电子设备同样能获取到该录制界面91下两个画面的前/后置信息、焦距信息和画面权重,并使用本实施例提供的音频的处理方法对在录制界面91下录制的音频进行过滤和融合。这里不再一一举例说明。
图10为本申请实施例提供的另一种音频的处理方法的流程图。该方法在能在图9的基础上,对画中画模式的双景录像模式的音频进行方位虚拟,能配合两个画面之间的方位感的使音频立体感进一步增强。
在对本实施例的具体流程进行说明之前,先说明本方法中的声源方位虚拟进行简单的额说明。在画中画的双景录像模式下,可以将两个画面中面积较大的画面称为主画面,将面积较小的画面称为子画面。由于子画面在视觉上和主画面的具体位置在视觉上有左右上面的偏差感,例如子画面是位于主画面的偏左一侧还是偏右一侧。为了使这种左右的偏差感能够被同步到音频中,则可采用实施例中的方位虚拟技术实现。
不难理解,上下分屏的双景录像模式下,两个画面相对位置并不会产生这种左右方位的偏差感。因此,方位虚拟技术仅适用于两个画面在相对位置上由这种偏差感的录制界面。 也就是说,双景录像模式下采用上下分屏的录制界面进行录制时,若使用本实施例的方法对音频进行处理,其最终所得的音频与使用前述图9中的方法进行处理得到的音频在本质上无差异。因此,接下来将结合图7B所示出的录制界面91对本申请实施例的方法流程进行说明。
如图10所示,本申请实施例提供的方法可以包括:
S201、电子设备开启双景录像模式。
S202、电子设备根据用户对录制界面的调整显示相应的录制界面。
S203、电子设备采集音频。
S204、电子设备录制图像。
其中,所述步骤S201-步骤S204的具体实现方式可参见上述图9所对应实施例中对步骤S101-步骤S104的描述,这里将不再进行赘述。
S205、电子设备获取画面的视场角。
参考图7B所示的录制界面91。如图7B所示出的录制界面91,用户正采用焦距倍数为6×的前置摄像头和焦距倍数为2×的后摄像头在画中画模式下进行双景录像。此时,取景框911呈现的画面(以下称为主画面)为焦距倍数为2×的后置摄像头所拍摄的画面,其为电子设备100前方的风景;取景框912呈现的画面(以下称为子画面)为焦距倍数为6×的前置摄像头所拍摄的画面,其为用户自己的人脸。则由上述表1可知,此时主画面对应的视场角为-90°,子画面对应的视场角为+30°。
S206、电子设备计算画面权重。
参考前述对“画面权重”的概念的相关说明,在图7B所示的录制界面91中,主画面的显示区域长为Dl,宽为Dw,所述子画面的显示区域的长为d1,宽为dw。则电子设备可计算得到两个画面的画面权重wm以及ws分别为:
ws=α(d1×dw)/(Dl×Dw)
wm=1-ws
其中,“×”表示乘法运算,ws为子画面的画面权重,wm为主画面的画面权重,α为校正系数,是在电子设备出厂时已经设定好的定值,其取值范围为[1,(Dl×Dw)/(d1×dw)],这样,可以防止两个画面面积差距太大而导致面积较小的取景框的画面权重的值过小。
S207、电子设备计算子画面的方位信息。
接着以图7B所示的录制界面91为例进行说明。
以图7B中所示的画中画模式下的录制界面为例,以主画面中心点O为原点,在手机屏幕所在的平面上做平面直角坐标系。其中,子画面中心点F的横纵坐标值为(a,b)。应注意,a,b均为带有正负符号的数值。规定,Y轴左侧的点横坐标为负值,Y轴右侧的点横坐标为正值。X轴下侧的点纵坐标为负值,X轴上侧的点纵坐标为正值。
由上述说明可知,主画面的长为Dl,宽为Dw,单位与坐标轴单位统一。则子画面相对于主画面的方位可由方位角z和俯仰角e表示,其中:
z=a/(Dw/2)×90°
e=b/(Dl/2)×90°
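As a minimal sketch of this step (function and variable names are assumptions of this sketch), the azimuth and elevation follow directly from the sub-window's centre coordinates:

def sub_window_angles(a, b, Dl, Dw):
    # Azimuth z and elevation e of the sub-window centre F = (a, b),
    # relative to a main picture of length Dl and width Dw (same units),
    # following z = a/(Dw/2)*90 and e = b/(Dl/2)*90 above.
    z = a / (Dw / 2.0) * 90.0   # negative: left of centre; positive: right
    e = b / (Dl / 2.0) * 90.0   # negative: below centre; positive: above
    return z, e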
S208、电子设备基于DSB算法对音频进行过滤。
步骤S208中,电子设备采集音频以及对音频进行频域转化的具体方式可以参考前述对图9中步骤S107的相关描述,这里不再赘述。
在采集到环境中的音频信号后,电子设备将根据两个画面的视场角信息分别对该音频信号进行滤波,得到两个画面相对应的音频信号。
同样以图7B示出的录制界面91为例进行说明。参考前述对步骤S205的说明可知,录制界面91中,主画面对应的视场角为-90°,子画面对应的视场角为+30°。基于这两个视场角,电子设备可以通过上述DSB算法计算得到子画面和主画面进行滤波时所需的权值向量ws_i(ω)和wm_i(ω),并根据这两个权值向量对采集的音频进行过滤,得到子画面和主画面对应的波束ys(ω)、ym(ω),其可以表示为:
ys(ω)=∑_{i=1}^{M} ws_i(ω)·x_i(ω)
ym(ω)=∑_{i=1}^{M} wm_i(ω)·x_i(ω)
结合前述对图7B和图8A的相关描述可知,波束ys(ω)即为图8A中与∠FPF’同角度方向的音频信号3;波束ym(ω)即为图8A中与∠APA’同角度方向的音频信号4(图8A中未画出)。需注意,这里得出的波束ys(ω)和波束ym(ω)均没有进行左声道和右声道的区分。
S209、电子设备对子画面的声源进行方位虚拟。
在本申请实施例中,声源方位虚拟可以采用调整混音比例或HRTF滤波等方法实现。本申请实施例示例性的给出了以HRTF滤波的方法对本实施例中子画面的音频进行方位虚拟的过程。HRTF滤波方法所需的数据库可选择如美国加利福尼亚大学戴维斯分校CIPIC HRTF数据库,北大HRTF数据库等。另外,该数据库也可通过HRTF建模计算得到。本申请实施例对此不做限定。
本实施例采用开源的CIPIC数据库,并根据方位角z和俯仰角e选择用于卷积的数据,选择的方法为:
zimut=z/90×12+13
elevation=e/90×16+9
相应的,用于卷积的CIPIC_HRIR为:
data_l=hrir_l(zimut,elevation,:)
data_r=hrir_r(zimut,elevation,:)
在对子画面的音频进行方位虚拟时,只需结合上述CIPIC_HRIR数据对ys(ω)进行卷积,即可得到具备虚拟方位听感的音频输出ysl(ω)和ysr(ω),其中卷积用“*”表示:
ysl(ω)=ys(ω)*hrir_l(zimut,elevation,:)
ysr(ω)=ys(ω)*hrir_r(zimut,elevation,:)
主画面进行双声道虚拟:
yml(ω)=ym(ω)
ymr(ω)=ym(ω)
需注意,这里得出的yml(ω)、ymr(ω)与ym(ω)之间并无差异,也就是说,在本实施例中,在画中画的双景录像模式下,从步骤S201-步骤S209电子设备对于主画面的音频的处理过程和图9中步骤S101-步骤S107的处理方式无差别。在重混后,主画面对应的音频在左声道的输出和右声道的输出实际上是一样的。
但是本实施例与图9所示方法的不同之处在于,本实施例对子画面的音频信号进行方位虚拟,由于步骤S209对于子画面得到的ysl(t)为偏向左声道的音频信号,ysr(t)为偏向右声道的音频信号;在重混后,最终输出的音频在左声道的输出和右声道的输出是不一样的。
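Continuing the angle sketch above, the CIPIC index selection and HRIR convolution for the sub-picture might look as follows. The index formulas follow the text (they assume CIPIC's 1-based 25 x 50 grid), while loading the hrir_l/hrir_r arrays (CIPIC ships them as MATLAB data) is left out and assumed done elsewhere:

import numpy as np

def cipic_indices(z, e):
    # zimut = z/90*12 + 13, elevation = e/90*16 + 9, rounded to grid points.
    return int(round(z / 90.0 * 12 + 13)), int(round(e / 90.0 * 16 + 9))

def virtualize_sub_picture(ys, hrir_l, hrir_r, z, e):
    # Convolve the sub-picture beam ys with the left/right HRIRs so the
    # listener localises it at the sub-window's on-screen direction.
    # hrir_l, hrir_r: assumed arrays of shape (azimuths, elevations, taps).
    az, el = cipic_indices(z, e)
    ysl = np.convolve(ys, hrir_l[az, el, :])   # left-ear signal
    ysr = np.convolve(ys, hrir_r[az, el, :])   # right-ear signal
    return ysl, ysr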
S210、电子设备对声源进行重混。
在得到两个画面相应的波束和画面权重后,电子设备将结合两个画面的画面权重对两个波束进行混音。
在重混后,最终电子设备在录制界面90下输出的音频的具体方式可表示为:
outl(ω)=yml(ω)×wm+ysl(ω)×ws
outr(ω)=ymr(ω)×wm+ysr(ω)×ws
不难看出,在本实施例中,虽然在混音时,yml(ω)×wm和ymr(ω)×wm其实是一样的,但是ysl(ω)×ws和ysr(ω)×ws是有差别的。
在播放上述音频信号outl(t),outr(t)时,outl(t),outr(t)可以分别通过电子设备上的两个扬声器输出。在通过双声道输出分别outl(t),outr(t)时,由于图7B所示的录制界面91中,子画面是位于主画面的偏左一侧,则用户的双耳对outl(ω),outr(ω)进行收听时,用户会感觉到子画面中传出的声音是从用户的左侧传来的。此外,由于子画面的面积小于主画面的面积,用户在收听该音频时会感觉到主画面中声音的响度较子画面中的声音响度更大。
S211、电子设备判断界面是否发生变化。
S212、电子设备保存处理后的音频。
其中,所述步骤S211-步骤S212的具体实现方式可参见上述图9所对应实施例中对步骤S101-步骤S104的描述,这里将不再进行赘述。
应理解,不限于图7B示出的录制界面90,本实施例提供的音频处理的方法也同样适用于对图5、图6以及图7A和图7C中示出的录制界面以及其他录制界面下的所录制的音频进行处理。在上下分屏的双景录像模式下,电子设备同样能获取到该录制界面中两个画面的前/后置信息、焦距信息和画面权重,并使用本实施例提供的音频的处理方法对在录制界面91下录制的音频进行过滤和融合。这里不再一一举例说明。只不过,在上下分屏的双景录像模式时,由于上下分屏的录制界面中,两个显示区域的相对方位并无左右之分,则电子设备在的对音频的处理过程中,无需执行步骤S210。也就是说,在采用图10示出的方法对上下分屏的录制界面下采集的音频进行处理时,其具体的处理流程可以变为图9中示出的处理流程。
图11为本申请实施例提供的又一种音频的处理方法的流程图。该方法基于CVX波束训练的方法对采集的音频信号进行滤波,能够利用两个画面的视场角信息对音频信号的滤波方向进行更具体的选择,使每个画面得到的波束都能具有左声道和右声道的区别。能进一步增强音频的立体感。如图11所示,本申请实施例提供的方法可以包括:
S301、电子设备开启双景录像模式。
S302、电子设备根据用户对录制界面的调整显示相应的录制界面。
S303、电子设备采集音频。
S304、电子设备录制图像。
其中,所述步骤S301-步骤S304的具体实现方式可参见上述图9所对应实施例中对步骤S101-步骤S104的描述,这里将不再进行赘述。
S305、电子设备获取画面的视场角。
参考图7A所示的录制界面90。电子设备正采用焦距倍数为固定1×的前置摄像头和焦距倍数为3×的后摄像头在上下分屏模式下进行双景录像。此时,上取景框901呈现的画面(以下称为第一画面)为前置摄像头所拍摄的画面,其为用户自己的人脸;下取景框902呈现的画面(以下称为第二画面)为后置摄像头所拍摄的画面,其为电子设备100前方的风景图像。则由上述表1可知,此时该第一画面视场角为+180°,该第二画面的视场角为-90°。
为了在步骤307中对音频进行波束训练时,能够基于画面的视场角信息得到该与画面对应的具备左右声道区别的两个波束,在本实施例中,电子设备在获取到第一画面和第二画面的视场角后,将对该视场角进行划分。
结合前述对图8B的说明可知,在图8B中,∠OPO’(左侧)即为上述第一画面的视场角,∠BPB’即为上述第一画面的视场角。电子设备100所在平面的法线QQ’为∠OPO’(左侧)以及∠BPB’的角平分线,则法线QQ’将∠OPO’(左侧)分为∠OPQ’(以下称为左视场角1)以及∠O’PQ’(以下称为右视场角1),将∠BPB’分为∠BPQ(以下称为左视 场角2)以及∠B’PQ(以下称为右视场角2)。即,上述第一画面的视场角可划分为左视场角1和右视场角1;上述第二画面的视场角可划分为左视场角2和右视场角2。
S306、电子设备计算画面权重。
步骤S306的具体实现方式可参见上述图9所对应实施例中对步骤S106的描述,两个画面的画面权重w1以及w2分别为:
w1=d1/(d1+d2)
w2=d2/(d1+d2)
其中,w1为第一画面的画面权重,w2为第二画面的画面权重。
S307、电子设备基于CVX波束训练方法对音频进行过滤。
接下来以波束形成算法为例对声源分离的过程进行进一步说明。
由前述对图9中步骤S107的说明可知,滤波所采用的计算公式为:
y(ω)=∑_{i=1}^{M} w_i(ω)·x_i(ω)
其中i=1,2,3……M,x_i(ω)表示电子设备中第i(i≤M)个麦克风采集到的音频信号,w_i(ω)通过CVX波束训练方法得到,其表示第i个麦克风在音频信号的频率为ω时波束形成器的权值向量。应理解,无论何种算法实现对音频的过滤,w_i(ω)都是该算法中与滤波方向强相关的一个必要的参数。
本实施例基于CVX波束训练方法得到w_i(ω)。在本实施例中,CVX波束训练方法得到w_i(ω)时,其输入包括该第i个麦克风与其他(M-1)个麦克风的间距、视场角的角度(即滤波方向)。且不同于DSB算法,CVX波束训练方法在计算w_i(ω)的过程中,输入的滤波方向是可以灵活变化的。而在步骤S305中,已经计算出了两个画面的视场角信息。因此,在本申请实施例中,可以将某个画面的视场角按照左右进行划分后作为该方法的输入,得到两个不同的权值向量w_il(ω)和w_ir(ω),在后续通过波束形成算法进行滤波时,可得到与该画面对应的两个波束,这两个波束就具备了左声道和右声道上的差异。
在采集到环境中的音频信号后,电子设备将根据两个画面的视场角信息分别对该音频信号进行滤波,得到两个画面相对应的音频信号。
接下来以图7A所示的录制界面90进行说明。结合前述对步骤S305的说明可知,图7A所示的录制界面90为上下分屏模式。在录制界面90中两个画面的视场角被分为左视场角1和右视场角1、左视场角2和右视场角2。
电子设备可以将左视场角1和右视场角1作为输入,计算得到第一画面进行滤波时所需的权值向量w1l_i(ω)和w1r_i(ω),并据此对采集的音频进行过滤,得到第一画面对应的左声道波束y1l(ω)以及右声道波束y1r(ω)。同理,电子设备可以将左视场角2和右视场角2作为输入,计算得到第二画面进行滤波时所需的权值向量w2l_i(ω)和w2r_i(ω),并根据这两个权值向量对采集的音频进行过滤,得到第二画面对应的左声道波束y2l(ω)以及右声道波束y2r(ω),可以表示为:
y2l(ω)=∑_{i=1}^{M} w2l_i(ω)·x_i(ω)
y2r(ω)=∑_{i=1}^{M} w2r_i(ω)·x_i(ω)
S308、电子设备对声源进行重混。
在得到两个画面相应的波束和画面权重后,电子设备将结合两个画面的画面权重对两个波束进行混音。
同样以图7A示出的录制界面90为例进行说明,在重混后,最终电子设备在录制界面90下输出的音频的具体方式可表示为:
outl(ω)=y2l(ω)×w2+y1l(ω)×w1
outr(ω)=y2r(ω)×w2+y1r(ω)×w1
这里假设电子设备支持双声道输出,则在播放上述时域信号outl(t),outr(t)时,outl(t),outr(t)可以分别通过电子设备上的两个扬声器输出。在滤波时音频信号即区分了左右方向,因此,用户左右耳收听到的音频可能差别更为明显。此外,当两个显示区域的面积大小相 差无几或者相等时(在图5中(C)所示),用户在收听该音频时会感觉两个画面中的声音响度大致相等。不难看出,在本实施例中,outl(ω),outr(ω)用户输出时,用户可以更好的感知音频在左右声道上的差异,使声音更具立体感。
S309、电子设备判断界面是否发生变化。
S310、电子设备保存处理后的音频。
其中,所述步骤S309-步骤S310的具体实现方式可参见上述图9所对应实施例中对步骤S109-步骤S110的描述,这里将不再进行赘述。
应理解,不限于图7A示出的录制界面90,本实施例提供的音频处理的方法也同样适用于对图5、图6以及图7B和图7C中示出的录制界面以及其他录制界面下的音频。
例如,图7B中示出的录制界面91,在画中画的双景录制模式下,电子设备同样能获取到该录制界面下两个画面的前/后置信息、焦距信息和画面权重(具体可以参阅图10中对步骤S205-步骤S207的说明),再参考步骤S305-步骤S308的方式对音频进行过滤和融合。这里不再一一举例说明。
图12为本申请实施例提供的另一种音频的处理方法的流程图。该方法在能在图11的基础上,对画中画模式的双景录像模式的音频进行方位虚拟,能配合两个画面之间的方位感的使音频立体感进一步增强。
结合前述对图10的描述,双景录像模式下采用上下分屏的录制界面进行录制时,若使用本实施例的方法对音频进行处理,其具体的处理流程和所得音频与使用前述图11中所示方法的处理流程和所得音频在本质上无差异。因此,接下来结合图7B所示的录制界面91对本方法的具体步骤进行说明。
如图12所示,本申请实施例提供的方法可以包括:
S401、电子设备开启双景录像模式。
S402、电子设备根据用户对录制界面的调整显示相应的录制界面。
S403、电子设备采集音频。
S404、电子设备录制图像。
其中,所述步骤S401-步骤S404的具体实现方式可参见上述图9所对应实施例中对步骤S101-步骤S104的描述,这里将不再进行赘述。
S405、电子设备获取画面视场角。
参考图7B所示的录制界面91。如图7B所示出的录制界面91,用户正采用焦距倍数为6×的前置摄像头和焦距倍数为2×的后摄像头在画中画模式下进行双景录像。此时,取景框911呈现的画面(以下称为主画面)为焦距倍数为2×的后置摄像头所拍摄的画面,其为电子设备100前方的风景;取景框912呈现的画面(以下称为子画面)为焦距倍数为6×的前置摄像头所拍摄的画面,其为用户自己的人脸。则由上述表1可知,此时主画面对应的视场角为大小90°,子画面对应的视场角为大小为30°。
为了在步骤407中对音频进行RVX波束训练时,能够基于画面的视场角信息得到该与画面对应的具备左右声道区别的两个波束,在本实施例中,电子设备在获取到主画面的 视场角后,将对主画面的视场角进行划分。结合前述对图8B的说明可知,∠FPF’即为上述子画面的视场角(以下称,视场角3),∠APA’即为上述主画面的视场角(以下称为视场角4)。电子设备100所在平面的法线QQ’为∠APA’的角平分线,则法线QQ’将∠APA’分为∠APQ(以下称为左视场角4)以及∠A’PQ(以下称为右视场角4)。而由于子画面的面积相对较小,且后续需结合子画面和主画面的相对位置,对子画面经过RVX波束训练所得到音频信号再进行方位虚拟,因此,在对音频进行RVX波束训练时,子画面无需进行左右声道波束的区分,电子设备也就无需对子画面的视场角进行划分了。
S406、电子设备计算画面权重。
S407、电子设备计算子画面的方位信息。
其中,所述步骤S406-步骤S407的具体实现方式可参见上述图9所对应实施例中对步骤S206-步骤S207的描述,这里将不再进行赘述。
具体的,在步骤S406中,电子设备可计算得到两个画面的画面权重wm以及ws分别为:
ws=α×d1×dw/(Dl×Dw)
wm=1-ws
其中,ws为子画面的画面权重,wm为主画面的画面权重,α为校正系数,是在电子设备出厂时已经设定好的定值,其取值范围为[1,(Dl×Dw)/(d1×dw)],这样,可以防止两个画面面积差距太大而导致面积较小的取景框的画面权重的值过小。
在步骤S406中,电子设备可计算得出,子画面相对于主画面的方位角z和俯仰角e为:
z=a/(Dw/2)×90°
e=b/(Dl/2)×90°
S408、电子设备基于CVX波束训练方法对音频进行过滤。
步骤S408中,电子设备采集音频以及对音频进行频域转化的具体方式可以参考前述对图9中步骤S107的相关描述,这里不再赘述。
在采集到环境中的音频信号后,电子设备将根据两个画面的视场角信息分别对该音频信号进行滤波,得到两个画面相对应的音频信号。
同样以图7B示出的录制界面91为例进行说明。参考前述对步骤S405的说明可知,图7B所示录制界面91中,子画面的视场角为视场角3,主画面的视场角4被分为左视场角4和右视场角4。
结合前述对图7B和图8B的相关描述可知,波束ys(ω)即为图8B中与∠FPF’同角度方向的音频信号31(图8B中未画出);波束yml(ω)即为图8B中与∠APQ同角度方向的左声道音频信号41(图8B中未画出),波束ymr(ω)即为图8B中与∠A’PQ同角度方向的右声道音频信号42(图8B中未画出)。
S409、电子设备对子画面的声源进行方位虚拟。
可参考对图10中步骤S409的说明。本实施例采用开源的CIPIC数据库,并根据方位角z和俯仰角e选择用于卷积的数据,选择的方法为:
zimut=z/90×12+13
elevation=e/90×16+9
相应的,用于卷积的CIPIC_HRIR为:
data_l=hrir_l(zimut,elevation,:)
data_r=hrir_r(zimut,elevation,:)
在对子画面的音频进行方位虚拟时,只需结合上述CIPIC_HRIR数据对ys(ω)进行卷积,即可得到具备虚拟方位听感的音频输出ysl(ω)和ysr(ω),其中卷积用“*”表示:
ysl(ω)=ys(ω)*hrir_l(zimut,elevation,:)
ysr(ω)=ys(ω)*hrir_r(zimut,elevation,:)
主画面无需再进行方位虚拟,其双声道输出直接取步骤S408滤波所得的左声道波束yml(ω)和右声道波束ymr(ω)。
S410、电子设备对声源进行重混。
在得到两个画面相应的波束和画面权重后,电子设备将结合两个画面的画面权重对两 个波束进行混音。
在重混后,最终电子设备在录制界面90下输出的音频的具体方式可表示为:
outl(ω)=yml(ω)×wm+ysl(ω)×ws
outr(ω)=ymr(ω)×wm+ysr(ω)×ws
在播放上述音频信号outl(t),outr(t)时,outl(t),outr(t)可以分别通过电子设备上的两个扬声器输出。在通过双声道输出分别outl(t),outr(t)时,由于图7B所示的录制界面91中,子画面是位于主画面的偏左一侧,则用户的双耳对outl(ω),outr(ω)进行收听时,用户会感觉到子画面中传出的声音是从用户的左侧传来的。且在滤波时音频信号即区分了左右方向,因此,用户左右耳收听到的音频可能差别更为明显,音频的立体感更强。此外,由于子画面的面积小于主画面的面积,用户在收听该音频时会感觉到主画面中声音的响度较子画面中的声音响度更大。
S411、电子设备判断界面是否发生变化。
S412、电子设备保存处理后的音频。
其中,所述步骤S211-步骤S212的具体实现方式可参见上述图9所对应实施例中对步骤S101-步骤S104的描述,这里将不再进行赘述。
应理解,不限于图7B示出的录制界面90,本实施例提供的音频处理的方法也同样适用于对图5、图6以及图7A和图7C中示出的录制界面以及其他录制界面下的所录制的音频进行处理。在上下分屏的双景录像模式下,电子设备同样能获取到该录制界面中两个画面的前/后置信息、焦距信息和画面权重,并使用本实施例提供的音频的处理方法对在录制界面91下录制的音频进行过滤和融合。这里不再一一举例说明。只不过但是,若在步骤S411中,电子设备的检测到录制界面发生变化,且变化后的录制界面不再是画中画模式的录制界面,比如变化为在上下分屏的双景录像模式时,由于上下分屏的录制界面中,两个显示区域的相对方位并无左右之分,则电子设备在后续的对音频的处理过程中,无需执行步骤S410。也就是说,在采用图12示出的方法对上下分屏的录制界面下采集的音频进行处理时,其具体的处理流程可以变为图11中示出的处理流程。
可选的,电子设备中处理器可以是在开始录制音视频时,执行如图9/图10/图11/图12所示出的音频处理方法。即在图5示出的用户界面中响应于控件704的用户操作之后,电子设备中处理器默认执行图9/图10/图11/图12所示出的音频处理方法。
可选的,电子设备中处理器还可以是在录制音视频结束,将录制得到音视频信号存储进存储器时,执行图9/图10/图11/图12所示出的音频处理方法。电子设备中处理器在结束录制音视频,将录制得到音视频信号存储进存储器时执行图9/图10/图11/图12所示出的 音频处理方法,可以在录制音频过程中减少处理器的占用,提高音频录制过程的流畅度。这样,在需要保存录制得到音频信号时才对音频信号执行图9/图10/图11/图12所示出的音频处理方法,从而可以节省处理器资源。
参考图9、图10、图11以及图12示出的音频的处理方法,以及前述对用户界面实施例的说明可知,双景录像的过程中用户可以对录制界面进行调整,具体的调整方式可以参考前述对图9中步骤S102的说明。对于其中一些调整方式,电子设备可以基于用户的操作进行平滑处理,例如在调整摄像头的焦距、对取景框的面积进行缩放(包括上下分屏模式下调整两个取景框分隔线的位置,以及在画中画模式下调整子取景框的大小)或将画面位置进行拖动时,用户往往能感觉到画面是在缓慢地变化的。但是,当用户将某个取景框对应的摄像头进行切换时,电子设备往往需要一定的处理时间。例如,用户在将某个取景框对应的摄像头从前置摄像头切换为后置摄像头时,或者对调两个取景框之间的画面时,用户往往感觉到画面是突变的。
In the audio processing methods shown in FIG. 9, FIG. 10, FIG. 11, and FIG. 12, when the electronic device switches the camera corresponding to a viewfinder frame, the field of view of the picture in that frame changes accordingly, and so does the audio signal the electronic device obtains by filtering the audio based on that picture. However, because the lens switch generally takes the electronic device a certain processing time, while the corresponding adjustment of the audio can be completed in an extremely short time, the visual impression of the picture and the auditory impression of the audio may become unbalanced.
For such scenarios of adjusting the recording interface, an embodiment of this application provides a method for smoothly switching audio. The application scenario of this method is not limited to the dual-view recording mode; it may also be the single-view (ordinary) recording mode. As shown in FIG. 13, the method may include:
S501: The electronic device switches the camera of a viewfinder frame from a historical camera to a target camera.
Specific scenarios of switching from a historical camera to a target camera include, but are not limited to, switching the camera corresponding to the viewfinder frame from the front camera to the rear camera, and swapping the pictures between two viewfinder frames.
For example, the electronic device may detect a tap operation on control 901B in FIG. 7A and, in response to the operation, switch the camera corresponding to viewfinder frame 901 in FIG. 7A from the front camera to the rear camera; for the user interface after the switch, refer to the recording interface 92 shown in FIG. 7C.
For example, the electronic device may detect a tap operation on the picture swap control 708 shown in (D) of FIG. 5 and, in response to the operation, swap the picture contents of viewfinder frame 701 and viewfinder frame 702.
S502: The electronic device obtains a historical audio signal and a target audio signal.
The historical audio signal, denoted ya(ω) below, is the audio signal the electronic device obtains by filtering the audio based on the picture (field of view) of the historical camera. For example, it is the audio signal obtained by filtering based on that picture at the instant immediately before the electronic device detects the camera-switching operation, such as the moment the user taps a front/rear camera switch button (for example, 911B, 912B, or the picture swap control 708) or double-taps the sub viewfinder frame 802. The target audio signal yb(ω) is the audio signal the electronic device obtains by filtering the audio based on the picture (field of view) of the target camera.
For example, when the electronic device, in response to the user's tap operation on control 901B in FIG. 7A, switches the camera corresponding to viewfinder frame 901 in FIG. 7A from the front camera to the rear camera, and the user interface after the switch is the recording interface 92 shown in FIG. 7C, the historical signal ya(ω) is the audio signal 1 shown in FIG. 8A, and the target signal yb(ω) is the audio signal 3 shown in FIG. 8A.
S503: The electronic device dynamically adjusts the mixing ratio of the historical audio signal and the target audio signal according to the duration of the switch from the historical camera to the target camera.
Denote by β the proportion of the historical signal ya(ω) in the mix; the proportion of the target signal yb(ω) in the mix is then (1 − β). One method of dynamically adjusting β can be expressed as:
β = (T/T1 − t)/(T/T1)
where T is the duration, in ms, that the electronic device needs to switch from the historical camera to the target camera, its specific value being determined by the performance of the electronic device. Optionally, the value of T is simply the time taken for the camera switch, for example, from the front camera to the rear camera. T1 is the frame length with which the electronic device processes audio, that is, the length of one frame when the electronic device captures or processes the audio signal. T and T1 are related to the performance of the electronic device, and different electronic devices may have different T and T1, but for a given electronic device both T and T1 are fixed values. t is the frame count, with a value range of [0, T/T1 − 1]: after the electronic device triggers the switch from the historical camera to the target camera, t is recorded as 0 for the first frame and is then incremented frame by frame up to T/T1 − 1.
S504: The electronic device remixes the historical audio signal and the target audio signal according to the mixing ratio.
During the switch from the historical camera to the target camera, the electronic device remixes the historical audio signal and the target audio signal using the mixing ratio computed for each frame, obtaining the audio used for subsequent operations. The remixing can be expressed as:
yc(ω) = β×ya(ω) + (1 − β)×yb(ω)
where yc(ω) is the remixed audio signal.
It can be seen from this remixing formula that, during the camera switch, the proportion of the target signal yb(ω) in the mix grows larger and larger, while the proportion of the historical signal ya(ω) grows smaller and smaller. Once the camera switch is complete, the proportion of the target signal yb(ω) is 1 and that of the historical signal ya(ω) is 0. In this way, the audio switches smoothly in step with the change of the picture, and the user perceives the direction of the sound as changing gradually along with the switching of the picture.
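A minimal sketch of this frame-by-frame crossfade, combining steps S503 and S504 (the per-frame signals and the example values of T and T1 are placeholders of ours):

```python
import numpy as np

# Illustrative frame-by-frame crossfade of steps S503-S504. T is the
# camera-switch duration and T1 the audio frame length (both in ms,
# device-dependent); y_a_frames / y_b_frames hold the per-frame
# historical and target signals.

def crossfade_frames(y_a_frames, y_b_frames, T, T1):
    n = int(T / T1)                 # number of frames during the switch
    out = []
    for t in range(n):              # t: frame count, 0 .. T/T1 - 1
        beta = (T / T1 - t) / (T / T1)
        out.append(beta * y_a_frames[t] + (1 - beta) * y_b_frames[t])
    return out

frames = int(500 / 10)              # e.g., a 500 ms switch with 10 ms frames
y_a = [np.ones(160) for _ in range(frames)]    # placeholder historical frames
y_b = [np.zeros(160) for _ in range(frames)]   # placeholder target frames
mixed = crossfade_frames(y_a, y_b, T=500, T1=10)
```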
An embodiment of this application further provides an electronic device, the electronic device including one or more processors and a memory.
The memory is coupled to the one or more processors and is configured to store computer program code; the computer program code includes computer instructions, and the one or more processors invoke the computer instructions to cause the electronic device to perform the methods shown in the foregoing embodiments.
As used in the foregoing embodiments, depending on the context, the term "when" may be interpreted to mean "if", "after", "in response to determining", or "in response to detecting". Similarly, depending on the context, the phrase "when it is determined that" or "if (a stated condition or event) is detected" may be interpreted to mean "if it is determined that", "in response to determining", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible to the computer, or a data storage device, such as a server or data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid-state drive), or the like.
A person of ordinary skill in the art can understand that all or part of the procedures of the methods in the foregoing embodiments may be completed by a computer program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and when executed, may include the procedures of the foregoing method embodiments. The foregoing storage medium includes any medium that can store program code, such as a ROM, a random access memory (RAM), a magnetic disk, or an optical disc.
Claims (13)
- An audio processing method, wherein the method comprises: displaying a first interface, the first interface comprising a first control; detecting a first operation on the first control; in response to the first operation, starting shooting at a first moment and displaying a second interface, the second interface comprising a first display region and a second display region; at a second moment, displaying, by the electronic device, in the first display region a first picture captured in real time by a first camera, and displaying in the second display region a second picture captured in real time by a second camera; at the second moment, capturing, by a microphone, a first sound, the first sound being the sound of the real-time environment in which the electronic device is located at the second moment; detecting a second operation on a third control; and in response to the second operation, stopping shooting and saving a first video, the first video comprising the first picture and the second picture, wherein at the second moment of the first video, the first picture and the second picture correspond to a second sound, and the second sound is obtained by processing the first sound according to picture weights of the first picture and the second picture.
- The method according to claim 1, wherein the electronic device displays the first display region and the second display region in a top-bottom split-screen form; the area of the first display region is a first area, and the area of the second display region is a second area; the picture weight of the first picture is the ratio of the first area to a total area, the picture weight of the second picture is the ratio of the second area to the total area, and the total area is the sum of the first area and the second area.
- The method according to claim 1, wherein the first display region is displayed on the second display region in the form of a floating window; the area of the first display region is a first area, and the area of the display screen of the electronic device is a third area; the picture weight of the first picture is the ratio of the first area to the third area, and the picture weight of the second picture is the difference between the integer 1 and the weight of the first picture.
- The method according to any one of claims 1 to 3, wherein the first sound includes a first sub-sound and a second sub-sound, the first sub-sound being the sound of the first picture and the second sub-sound being the sound of the second picture, and the processing the first sound according to the picture weights of the first picture and the second picture comprises: mixing the first sub-sound and the second sub-sound according to the picture weights of the first picture and the second picture; when the picture weight of the first picture is greater than that of the second picture, using a first mixing ratio so that the loudness of the first sub-sound is greater than that of the second sub-sound; when the picture weight of the first picture is less than that of the second picture, using a second mixing ratio so that the loudness of the first sub-sound is less than that of the second sub-sound; and when the picture weight of the first picture is equal to that of the second picture, using a third mixing ratio so that the loudness of the first sub-sound is equal to that of the second sub-sound.
- The method according to any one of claims 1 to 4, wherein before the detecting of the second operation on the third control, the method further comprises: saving, by the electronic device, the first sound; the processing the first sound according to the picture weights of the first picture and the second picture comprises: processing, by the electronic device, the first sound according to the picture weights of the first picture and the second picture to obtain the second sound; and saving, by the electronic device, the second sound and deleting the first sound.
- The method according to any one of claims 1 to 5, wherein the first sound includes a first sub-sound and a second sub-sound, the first sub-sound being the sound of the first picture and the second sub-sound being the sound of the second picture, and the processing the first sound according to the picture weights of the first picture and the second picture comprises: filtering, by the electronic device, the first sound according to a first field of view of the first camera and according to a second field of view of the second camera, respectively, to obtain the first sub-sound and the second sub-sound; and adjusting, by the electronic device, the loudness of the first sub-sound and the second sub-sound according to the picture weights of the first picture and the second picture, and then mixing the first sub-sound and the second sub-sound to obtain the second sound.
- The method according to any one of claims 1 to 6, wherein the first sound includes a first sub-sound and a second sub-sound, the first sub-sound being the sound of the first picture and the second sub-sound being the sound of the second picture, and the processing the first sound according to the picture weights of the first picture and the second picture comprises: filtering, by the electronic device, the first sound according to a first field of view of the first camera and according to a second field of view of the second camera, respectively, to obtain the first sub-sound and the second sub-sound; obtaining, by the electronic device, first azimuth information of the first display region relative to the second display region; performing, by the electronic device, azimuth virtualization on the first sub-sound according to the first azimuth information, to obtain a first left azimuth sound and a first right azimuth sound; and adjusting, by the electronic device, the loudness of the first left azimuth sound, the first right azimuth sound, and the second sub-sound according to the picture weights of the first picture and the second picture, and then mixing the first sub-sound and the second sub-sound to obtain the second sound.
- The method according to any one of claims 1 to 7, wherein the first sound includes a first sub-sound and a second sub-sound, the first sub-sound being the sound of the first picture and the second sub-sound being the sound of the second picture, and the processing the first sound according to the picture weights of the first picture and the second picture comprises: filtering, by the electronic device, the first sound according to a first field of view of the first camera and according to a second field of view of the second camera, respectively, to obtain the first sub-sound and the second sub-sound; the first sub-sound includes a first left-channel sound and a first right-channel sound, the first left-channel sound being obtained by the electronic device filtering the first sound according to the left half-angle of the first field of view, and the first right-channel sound being obtained by the electronic device filtering the first sound according to the right half-angle of the first field of view; the second sub-sound includes a second left-channel sound and a second right-channel sound, the second left-channel sound being obtained by the electronic device filtering the first sound according to the left half-angle of the second field of view, and the second right-channel sound being obtained by the electronic device filtering the first sound according to the right half-angle of the second field of view; and adjusting, by the electronic device, the loudness of the first left-channel sound, the first right-channel sound, the second left-channel sound, and the second right-channel sound according to the picture weights of the first picture and the second picture, and then mixing the first sub-sound and the second sub-sound to obtain the second sound.
- The method according to any one of claims 1 to 8, wherein the first sound includes a first sub-sound and a second sub-sound, the first sub-sound being the sound of the first picture and the second sub-sound being the sound of the second picture, and before the detecting of the second operation on the third control, the method further comprises: at a third moment, in response to a camera-switching operation, switching, by the electronic device, the picture displayed in the first display region from the picture shot by the first camera to a picture shot by a third camera; at a fourth moment, displaying, by the electronic device, in the first display region a third picture shot by the third camera, the fourth moment being after the third moment; filtering, by the electronic device, the first sound according to a third field of view of the third camera and according to the first field of view of the first camera, respectively, to obtain a historical sound and a target sound; and within the time between the third moment and the fourth moment, dynamically adjusting, by the electronic device, the mixing ratio of the historical sound and the target sound according to the time interval between the third moment and the fourth moment, and mixing the historical sound and the target sound based on the mixing ratio to obtain the first sub-sound.
- An electronic device, wherein the electronic device comprises: one or more processors, a memory, and a display screen; the memory is coupled to the one or more processors and is configured to store computer program code, the computer program code comprising computer instructions, and the one or more processors invoke the computer instructions to cause the electronic device to perform the method according to any one of claims 1 to 9.
- A chip system, applied to an electronic device, wherein the chip system comprises one or more processors, and the processors are configured to invoke computer instructions to cause the electronic device to perform the method according to any one of claims 1 to 9.
- A computer program product comprising instructions, wherein when the computer program product runs on an electronic device, the electronic device is caused to perform the method according to any one of claims 1 to 9.
- A computer-readable storage medium comprising instructions, wherein when the instructions run on an electronic device, the electronic device is caused to perform the method according to any one of claims 1 to 9.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22797242.9A EP4142295A4 (en) | 2021-06-16 | 2022-04-22 | AUDIO PROCESSING METHOD AND ELECTRONIC DEVICE |
US17/925,373 US20240236596A1 (en) | 2021-06-16 | 2022-04-22 | Audio processing method and electronic device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110667735.XA CN113573120B (zh) | 2021-06-16 | 2021-06-16 | 音频的处理方法及电子设备、芯片系统及存储介质 |
CN202110667735.X | 2021-06-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022262416A1 true WO2022262416A1 (zh) | 2022-12-22 |
Family
ID=78162090
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/088335 WO2022262416A1 (zh) | 2021-06-16 | 2022-04-22 | 音频的处理方法及电子设备 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20240236596A1 (zh) |
EP (1) | EP4142295A4 (zh) |
CN (1) | CN113573120B (zh) |
WO (1) | WO2022262416A1 (zh) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113573120B (zh) * | 2021-06-16 | 2023-10-27 | 北京荣耀终端有限公司 | 音频的处理方法及电子设备、芯片系统及存储介质 |
CN116095254B (zh) * | 2022-05-30 | 2023-10-20 | 荣耀终端有限公司 | 音频处理方法和装置 |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4735991B2 (ja) * | 2008-03-18 | 2011-07-27 | ソニー株式会社 | 画像処理装置および方法、プログラム並びに記録媒体 |
KR20140114501A (ko) * | 2013-03-14 | 2014-09-29 | 삼성전자주식회사 | 영상 데이터 처리 방법 및 이를 지원하는 전자 장치 |
EP3172653A4 (en) * | 2014-07-25 | 2018-01-17 | Samsung Electronics Co., Ltd. | Displaying method, animation image generating method, and electronic device configured to execute the same |
CN108337465B (zh) * | 2017-02-09 | 2021-05-14 | 腾讯科技(深圳)有限公司 | 视频处理方法和装置 |
CN108683874B (zh) * | 2018-05-16 | 2020-09-11 | 瑞芯微电子股份有限公司 | 一种视频会议注意力聚焦的方法及一种存储设备 |
CN108668215B (zh) * | 2018-07-18 | 2024-05-14 | 广州市锐丰音响科技股份有限公司 | 全景音域系统 |
CN112929739A (zh) * | 2021-01-27 | 2021-06-08 | 维沃移动通信有限公司 | 发声控制方法、装置、电子设备和存储介质 |
- 2021-06-16: CN CN202110667735.XA patent/CN113573120B/zh active Active
- 2022-04-22: WO PCT/CN2022/088335 patent/WO2022262416A1/zh active Application Filing
- 2022-04-22: US US17/925,373 patent/US20240236596A1/en active Pending
- 2022-04-22: EP EP22797242.9A patent/EP4142295A4/en active Pending
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008048374A (ja) * | 2006-07-21 | 2008-02-28 | Victor Co Of Japan Ltd | ビデオカメラ装置 |
CN103516895A (zh) * | 2012-06-25 | 2014-01-15 | Lg电子株式会社 | 移动终端及其音频缩放方法 |
CN105165017A (zh) * | 2013-02-25 | 2015-12-16 | 萨万特系统有限责任公司 | 视频平铺 |
CN104699445A (zh) * | 2013-12-06 | 2015-06-10 | 华为技术有限公司 | 一种音频信息处理方法及装置 |
CN110072070A (zh) * | 2019-03-18 | 2019-07-30 | 华为技术有限公司 | 一种多路录像方法及设备 |
CN110740259A (zh) * | 2019-10-21 | 2020-01-31 | 维沃移动通信有限公司 | 视频处理方法及电子设备 |
CN112351248A (zh) * | 2020-10-20 | 2021-02-09 | 杭州海康威视数字技术股份有限公司 | 一种关联图像数据和声音数据的处理方法 |
CN113573120A (zh) * | 2021-06-16 | 2021-10-29 | 荣耀终端有限公司 | 音频的处理方法及电子设备 |
Non-Patent Citations (1)
Title |
---|
See also references of EP4142295A4 |
Also Published As
Publication number | Publication date |
---|---|
US20240236596A1 (en) | 2024-07-11 |
EP4142295A4 (en) | 2023-11-15 |
CN113573120A (zh) | 2021-10-29 |
CN113573120B (zh) | 2023-10-27 |
EP4142295A1 (en) | 2023-03-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20230116044A1 (en) | Audio processing method and device | |
WO2020186969A1 (zh) | 一种多路录像方法及设备 | |
WO2020192458A1 (zh) | 一种图像处理的方法及头戴式显示设备 | |
JP7355941B2 (ja) | 長焦点シナリオにおける撮影方法および端末 | |
US20230328429A1 (en) | Audio processing method and electronic device | |
CN109191549B (zh) | 显示动画的方法及装置 | |
CN111183632A (zh) | 图像捕捉方法及电子设备 | |
EP3996046A1 (en) | Image-text fusion method and apparatus, and electronic device | |
WO2022262416A1 (zh) | 音频的处理方法及电子设备 | |
WO2022262313A1 (zh) | 基于画中画的图像处理方法、设备、存储介质和程序产品 | |
US11870941B2 (en) | Audio processing method and electronic device | |
EP4138381A1 (en) | Method and device for video playback | |
EP4195707A1 (en) | Function switching entry determining method and electronic device | |
CN115514883B (zh) | 跨设备的协同拍摄方法、相关装置及系统 | |
EP4325877A1 (en) | Photographing method and related device | |
WO2024037352A1 (zh) | 一种分屏显示方法及相关装置 | |
WO2023016032A1 (zh) | 一种视频处理方法及电子设备 | |
WO2023143171A1 (zh) | 一种采集音频的方法及电子设备 | |
WO2022160795A1 (zh) | 基于光场显示的显示模式的转换方法及转换装置 | |
WO2024027374A1 (zh) | 隐藏信息显示方法、设备、芯片系统、介质及程序产品 | |
CN118012319A (zh) | 一种图像处理方法、电子设备及计算机可读存储介质 | |
CN114816051A (zh) | 虚拟空间互动方法、装置、终端以及计算机可读存储介质 | |
CN115691555A (zh) | 一种录制处理方法及相关装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 17925373 Country of ref document: US |
|
ENP | Entry into the national phase |
Ref document number: 2022797242 Country of ref document: EP Effective date: 20221108 |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22797242 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |