WO2023231585A1 - Video shooting method, device, equipment and storage medium - Google Patents

Video shooting method, device, equipment and storage medium

Info

Publication number
WO2023231585A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
video stream
multimedia file
image
image fusion
Prior art date
Application number
PCT/CN2023/087623
Other languages
English (en)
French (fr)
Inventor
崔瀚涛
苗锋
Original Assignee
荣耀终端有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 荣耀终端有限公司
Publication of WO2023231585A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/62 Control of parameters via user interfaces
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/95 Computational photography systems, e.g. light-field imaging systems
    • H04N23/951 Computational photography systems, e.g. light-field imaging systems by using two or more images to influence resolution, frame rate or aspect ratio
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265 Mixing

Definitions

  • the present application relates to the field of video processing technology, and in particular to a video shooting method, device, equipment and storage medium.
  • terminals gradually integrate communication, photography, audio and video functions, and become an indispensable part of people's daily lives. Users can use the terminal to shoot videos and record every moment of life.
  • the terminal supports using multiple cameras at the same time to shoot videos. Specifically, the terminal can simultaneously collect multiple video streams through multiple cameras, and then perform image fusion processing on the multiple video streams to obtain a fused video stream, so as to display the video image of the fused video stream on the recording interface. Moreover, after the video shooting is completed, the terminal can also save the fused video stream for subsequent viewing by the user.
  • This application provides a video shooting method, device, equipment and storage medium, which can generate a video stream with good image fusion effect after the video shooting is completed.
  • the technical solutions are as follows:
  • In a first aspect, a video shooting method is provided.
  • multiple video streams are obtained, and then image fusion processing is performed on the multiple video streams to obtain the first video stream, and the image fusion parameters corresponding to the first video stream are obtained.
  • a first multimedia file containing the first video stream is generated, and a second multimedia file containing the multi-channel video stream and the image fusion parameter is generated.
  • the first multimedia file and the second multimedia file are stored in association.
  • the image fusion parameter corresponding to the first video stream is used to indicate the image fusion method of the multiple video streams when the first video stream is obtained.
  • the image fusion parameters may include an image splicing mode, and may further include an image splicing position of each video stream in the multiple video streams.
  • the image splicing mode may include one or more of up-down splicing mode, left-right splicing mode, picture-in-picture nesting mode, etc.
  • the image splicing position of any one of the multiple video streams is used to indicate the position of the video image of this video stream when splicing is performed according to the corresponding image splicing mode.
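  • As a rough illustration of what such per-frame image fusion parameters could look like, the sketch below models them as a small Python structure; the names (FusionParams, SpliceMode, positions) and the field layout are illustrative assumptions rather than the encoding defined by this application.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class SpliceMode(Enum):
    TOP_BOTTOM = "top_bottom"        # up-down splicing mode
    LEFT_RIGHT = "left_right"        # left-right splicing mode
    PICTURE_IN_PICTURE = "pip"       # picture-in-picture nesting mode


@dataclass
class FusionParams:
    """Illustrative per-frame image fusion parameters (metadata)."""
    timestamp_us: int                          # aligned with the i-th frame of every stream
    splice_mode: SpliceMode                    # how the video images are spliced
    positions: Dict[str, str] = field(default_factory=dict)
    # e.g. {"front_cam": "top", "rear_cam": "bottom"} for top-bottom splicing,
    # or   {"rear_cam": "main", "front_cam": "sub"}  for picture-in-picture


# Example: "front camera on top, rear camera below" at the first frame
params = FusionParams(
    timestamp_us=0,
    splice_mode=SpliceMode.TOP_BOTTOM,
    positions={"front_cam": "top", "rear_cam": "bottom"},
)
```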
  • the first video stream in the first multimedia file has an image fusion effect.
  • In this way, the user can watch the first video stream with the image fusion effect in the stored first multimedia file, and can instantly share the first multimedia file with other people for viewing.
  • the multi-channel video stream in the second multimedia file refers to the original video streams without image fusion processing, that is, video streams without an image fusion effect.
  • the image fusion parameter in the second multimedia file is used to indicate the image fusion method that needs to be used in subsequent fusion of the multiple video streams in the second multimedia file.
  • the terminal can not only play each video stream in the multiple video streams according to the stored second multimedia file, but also generate a fused video stream with an image fusion effect based on the multiple video streams and the image fusion parameter in the stored second multimedia file. Since the terminal does not need to perform real-time video recording after the video shooting is completed, it can provide higher video processing capabilities.
  • Therefore, the image fusion effect of the fused video stream generated by the terminal based on the second multimedia file is better than the image fusion effect of the first video stream in the first multimedia file generated during the video shooting process, so that the user can finally obtain a video stream with a better image fusion effect for playback.
  • the operation of obtaining multiple video streams may be: obtaining one video stream collected by each camera in multiple cameras to obtain the multiple video streams.
  • This method is a multi-camera simultaneous recording scene, that is, multiple cameras record at the same time to obtain a video stream collected by each of the multiple cameras.
  • the multiple cameras may all be disposed on the terminal.
  • the terminal records video simultaneously through its multiple cameras, so that the terminal can obtain one video stream collected by each of the multiple cameras to obtain multiple video streams.
  • some of the multiple cameras may be disposed on the terminal, and another part of the cameras may be disposed on a collaboration device that is in a multi-screen collaboration state with the terminal.
  • the terminal records simultaneously through its own camera and the camera of the collaborative device.
  • During recording, the collaborative device can send the video stream collected by its own camera to the terminal, so that the terminal can obtain the video stream collected by its own camera and the video stream collected by the camera of the collaborative device, thereby obtaining multiple video streams.
  • the operation of obtaining multiple video streams may be: obtaining one video stream collected by the camera, performing image processing on this video stream, and obtaining another video stream.
  • This method is a single-camera simultaneous recording scene, that is, recording through one camera to obtain one video stream collected by this camera, and performing image processing on this video stream to obtain another video stream, so that two video streams are obtained, including the original video stream and the image-processed video stream.
  • the terminal can also perform different image processing on this video stream to obtain different video streams.
  • In this way, at least three video streams can be obtained, including the original video stream and at least two video streams obtained by different image processing.
  • the video image of the first video stream can also be displayed on the recording interface, so that the captured video can be previewed in real time during the video shooting process, allowing users to know the image fusion effect of the video in real time.
  • the operation of generating the second multimedia file containing the multiple video streams and the image fusion parameters may be: encoding each video stream in the multiple video streams separately to obtain multiple video files; for any video file among the multiple video files, using this video file as a video track and the image fusion parameter as a parameter track, and encapsulating the video track and the parameter track to obtain a corresponding multi-track file; and determining the multiple multi-track files corresponding one-to-one to the multiple video files as the second multimedia file.
  • the video file of each video stream in the multi-channel video stream is individually encapsulated to obtain a corresponding multi-track file. In this way, the multi-track file of each video stream in the multi-channel video stream can be obtained, that is, multiple multi-track files can be obtained.
  • the second multimedia file includes the plurality of multi-track files.
  • the operation of generating the second multimedia file containing the multiple video streams and the image fusion parameters may be: encoding each of the multiple video streams separately to obtain multiple video files; using each video file in the multiple video files as a video track to obtain multiple video tracks; using the image fusion parameter as a parameter track; and encapsulating the multiple video tracks and the parameter track to obtain the second multimedia file.
  • the plurality of video files of the multi-channel video stream are encapsulated as a whole to obtain the second multimedia file.
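  • The two packaging variants described above can be pictured with the following sketch. It is only an illustrative Python model of the resulting file layouts: the Track/MultiTrackFile types and the encode() placeholder are assumptions, not an actual container format or codec API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Track:
    kind: str       # "video" or "fusion_params"
    payload: bytes  # encoded video samples or a serialized parameter stream


@dataclass
class MultiTrackFile:
    tracks: List[Track]


def encode(stream_frames) -> bytes:
    """Placeholder for video encoding (e.g. to an H.264 bitstream)."""
    return bytes()


def package_per_stream(streams, fusion_params_blob: bytes) -> List[MultiTrackFile]:
    """Variant 1: one multi-track file per video stream, each also carrying the parameter track."""
    files = []
    for frames in streams:
        video_track = Track("video", encode(frames))
        param_track = Track("fusion_params", fusion_params_blob)
        files.append(MultiTrackFile([video_track, param_track]))
    return files  # together, these multi-track files form the second multimedia file


def package_as_whole(streams, fusion_params_blob: bytes) -> MultiTrackFile:
    """Variant 2: all video tracks plus the parameter track encapsulated in a single file."""
    tracks = [Track("video", encode(frames)) for frames in streams]
    tracks.append(Track("fusion_params", fusion_params_blob))
    return MultiTrackFile(tracks)  # this single file is the second multimedia file
```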
  • the first video stream in the first multimedia file and an associated button can also be displayed in the video list, and the associated button is used to indicate display of the second multimedia file associated with the first multimedia file. If a selection operation on the associated button is detected, the multiple video streams in the second multimedia file are displayed, so that the user can know from which original video streams the first video stream in the first multimedia file is fused, and so that the user can conveniently select and play any video stream among the multiple video streams in the second multimedia file.
  • the multiple video streams can also be obtained from the second multimedia file, and then at least one video stream among the multiple video streams can be played.
  • the multiple video streams can be displayed in a video list, and then the user can select to play at least one of the multiple video streams.
  • If a fusion adjustment instruction is detected, the image fusion parameters in the second multimedia file are updated according to the fusion adjustment information carried by the fusion adjustment instruction.
  • the fusion adjustment instruction is used to indicate the image fusion method that needs to be used to adjust the multiple video streams.
  • the user can manually trigger the fusion adjustment instruction frame by frame according to his own needs.
  • the fusion adjustment instruction is used to instruct a change of the image fusion method.
  • the user can instruct a change of the image splicing mode, and/or a change of the image splicing position of each video stream. That is, the fusion adjustment information carried in the fusion adjustment instruction may include the image splicing mode that needs to be adjusted to, and/or may include the image splicing positions that each video stream needs to be adjusted to.
  • the terminal can modify the image fusion parameter in the second multimedia file according to the fusion adjustment information, that is, update the image fusion parameter, so that the subsequent image fusion processing performed according to the image fusion parameter in the second multimedia file meets the latest needs of the user.
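  • One possible way to apply such a fusion adjustment instruction to the stored parameters is sketched below, reusing the illustrative FusionParams/SpliceMode structure from the earlier sketch; treating the adjustment as taking effect from a given timestamp onward is an assumption made here for illustration only.

```python
from typing import Dict, List, Optional


def apply_fusion_adjustment(param_stream: List[FusionParams],
                            from_timestamp_us: int,
                            new_mode: Optional[SpliceMode] = None,
                            new_positions: Optional[Dict[str, str]] = None) -> None:
    """Update the image fusion parameters stored in the second multimedia file in place,
    for all frames at or after the adjusted timestamp."""
    for p in param_stream:
        if p.timestamp_us >= from_timestamp_us:
            if new_mode is not None:
                p.splice_mode = new_mode
            if new_positions is not None:
                p.positions = dict(new_positions)


# Example: from 5 s onward, switch to picture-in-picture with the rear camera as the main picture
# apply_fusion_adjustment(param_stream, 5_000_000,
#                         new_mode=SpliceMode.PICTURE_IN_PICTURE,
#                         new_positions={"rear_cam": "main", "front_cam": "sub"})
```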
  • the multiple video streams and the image fusion parameters can also be obtained from the second multimedia file, and then image fusion processing is performed on the multiple video streams according to the image fusion parameters to obtain a second video stream, and a third multimedia file is generated based on the second video stream.
  • the first multimedia file stored in association with the second multimedia file can also be updated to the third multimedia file.
  • the image fusion parameters corresponding to the first video stream are the same as the image fusion parameters corresponding to the second video stream. That is, the same image fusion method is used to perform image fusion processing on the multiple video streams to obtain the first video stream and the second video stream. Since there is no need to perform real-time video recording after the video shooting is completed, higher video processing capabilities can be provided, so that the image fusion effect of the second video stream generated at this time is better than the image fusion effect of the first video stream generated during the video shooting process.
  • the first multimedia file stored in association with the second multimedia file is updated to the third multimedia file, so that the video stream in the multimedia file stored in association with the second multimedia file is a video stream with a better image fusion effect, which enables users to finally obtain a video stream with a better image fusion effect for playback.
  • the third multimedia file can be generated according to the second multimedia file, and then the multimedia file stored in association with the second multimedia file (which may be the first multimedia file or an old third multimedia file) is updated to the newly generated third multimedia file, so that the video stream in the multimedia file stored in association with the second multimedia file is a video stream with a good image fusion effect that meets the user's latest image fusion requirements.
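  • A minimal sketch of this offline regeneration step, assuming the original streams and the parameter stream can be read back from the second multimedia file as plain frame lists; fuse_frame() is a placeholder for the higher-quality offline fusion routine, and the dictionary standing in for the third multimedia file is purely illustrative.

```python
import numpy as np


def fuse_frame(frames, params):
    """Placeholder for the offline fusion routine; a real implementation would follow
    the splice mode and positions recorded in params (here: naive vertical stacking)."""
    return np.vstack(frames)


def generate_third_multimedia_file(streams, param_stream):
    """Re-fuse the original streams of the second multimedia file frame by frame according
    to the stored image fusion parameters, yielding the second video stream."""
    second_video_stream = [
        fuse_frame([s[i] for s in streams], params)   # i-th frame of every stream
        for i, params in enumerate(param_stream)
    ]
    return {"video": second_video_stream}              # stands in for the third multimedia file


def update_associated_file(storage, second_file_id, third_file):
    """Replace the file stored in association with the second multimedia file
    (the first multimedia file or an older third multimedia file) with the new one."""
    storage[second_file_id] = third_file
```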
  • In a second aspect, a video shooting device is provided.
  • the video shooting device has the function of realizing the behavior of the video shooting method in the first aspect.
  • the video shooting device includes at least one module, and the at least one module is used to implement the video shooting method provided in the first aspect.
  • In a third aspect, a video shooting device is provided, which includes a processor and a memory.
  • the memory is used to store a program that supports the video shooting device in executing the video shooting method provided in the first aspect, and to store data involved in implementing the video shooting method described in the first aspect.
  • the processor is configured to execute a program stored in the memory.
  • the video capture device may further include a communication bus for establishing a connection between the processor and the memory.
  • In a fourth aspect, a computer-readable storage medium is provided, which stores instructions that, when run on a computer, cause the computer to execute the video shooting method described in the first aspect.
  • In a fifth aspect, a computer program product containing instructions is provided that, when run on a computer, causes the computer to execute the video shooting method described in the first aspect.
  • Figure 1 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
  • Figure 2 is a block diagram of a terminal software system provided by an embodiment of the present application.
  • Figure 3 is a schematic diagram of a video image provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of the first recording interface provided by the embodiment of the present application.
  • Figure 5 is a schematic diagram of the second recording interface provided by the embodiment of the present application.
  • Figure 6 is a schematic diagram of the third recording interface provided by the embodiment of the present application.
  • Figure 7 is a schematic diagram of the fourth video recording interface provided by the embodiment of the present application.
  • Figure 8 is a flow chart of a video shooting method provided by an embodiment of the present application.
  • Figure 9 is a schematic diagram of a dual video container provided by an embodiment of the present application.
  • Figure 10 is a schematic diagram of another dual video container provided by an embodiment of the present application.
  • Figure 11 is a schematic diagram of a video list provided by an embodiment of the present application.
  • Figure 12 is a schematic diagram of generating a third multimedia file provided by an embodiment of the present application.
  • Figure 13 is a schematic diagram of the first video shooting method provided by the embodiment of the present application.
  • Figure 14 is a schematic diagram of the second video shooting method provided by the embodiment of the present application.
  • Figure 15 is a schematic diagram of the third video shooting method provided by the embodiment of the present application.
  • Figure 16 is a schematic diagram of the fourth video shooting method provided by the embodiment of the present application.
  • Figure 17 is a schematic structural diagram of a video shooting device provided by an embodiment of the present application.
  • phrases such as "one embodiment" or "some embodiments" used in this application mean that a particular feature, structure, or characteristic described in the embodiment is included in one or more embodiments of this application. Therefore, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", etc. appearing in different places in this application do not necessarily refer to the same embodiment, but rather mean "one or more but not all embodiments", unless specifically stated otherwise. In addition, the terms "including", "comprising", "having", and variations thereof all mean "including but not limited to", unless otherwise specifically emphasized.
  • Figure 1 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
  • the terminal 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, and an antenna 1 , Antenna 2, mobile communication module 150, wireless communication module 160, audio module 170, speaker 170A, receiver 170B, microphone 170C, headphone interface 170D, sensor module 180, button 190, motor 191, indicator 192, camera 193, display screen 194, and subscriber identity module (SIM) card interface 195, etc.
  • the sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor, etc.
  • the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the terminal 100.
  • the terminal 100 may include more or fewer components than shown in the figures, or some components may be combined, or some components may be separated, or may be arranged differently.
  • the components illustrated may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units.
  • the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • different processing units can be independent devices or integrated in one or more processors.
  • the controller may be the nerve center and command center of the terminal 100 .
  • the controller can generate operation control signals according to the instruction operation code and timing signals, so as to complete the control of instruction fetching and instruction execution.
  • the processor 110 may also be provided with a memory for storing instructions and data.
  • the memory in the processor 110 is a cache memory. This memory may hold instructions or data that have just been used or are used cyclically by the processor 110. If the processor 110 needs to use the instructions or data again, they can be called directly from this memory. This avoids repeated access, reduces the waiting time of the processor 110, and thus improves the efficiency of the system.
  • the charging management module 140 is used to receive charging input from the charger.
  • the charger can be a wireless charger or a wired charger.
  • the charging management module 140 may receive charging input from the wired charger through the USB interface 130 .
  • the charging management module 140 may receive wireless charging input through the wireless charging coil of the terminal 100 . While charging the battery 142, the charging management module 140 can also provide power to the terminal 100 through the power management module 141.
  • the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, internal memory 121, external memory, display screen 194, camera 193, wireless communication module 160, etc.
  • the power management module 141 can also be used to monitor battery capacity, battery cycle times, battery health status (leakage, impedance) and other parameters.
  • the power management module 141 may also be provided in the processor 110 .
  • the power management module 141 and the charging management module 140 may also be provided in the same device.
  • the wireless communication function of the terminal 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.
  • the mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G applied to the terminal 100.
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, perform filtering, amplification and other processing on the received electromagnetic waves, and transmit them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves through the antenna 1 for radiation.
  • at least part of the functional modules of the mobile communication module 150 may be disposed in the processor 110 .
  • at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be provided in the same device.
  • the wireless communication module 160 can provide wireless communication solutions applied to the terminal 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), etc.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110, frequency modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation
  • the terminal 100 implements the display function through the GPU, the display screen 194, and the application processor.
  • the GPU is an image processing microprocessor and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
  • the terminal 100 can implement the shooting function through the ISP, camera 193, video codec, GPU, display screen 194, application processor, etc.
  • the terminal 100 may include 1 or N cameras 193, where N is an integer greater than 1.
  • the terminal 100 can record video through one or more cameras 193 .
  • the terminal 100 records video simultaneously through multiple cameras 193 .
  • the terminal 100 records the video through one camera 193 .
  • the camera 193 is used to collect video streams. After the video stream is collected by the camera 193, it can be transferred to the ISP for processing.
  • the format of the video image of the video stream collected by the camera 193 is RAW format.
  • the ISP can convert the RAW format video images in the video stream into YUV format video images, and then perform basic processing on the YUV format video images, such as adjusting contrast, removing noise, etc.
  • the ISP can receive the video streams collected by each of the multiple cameras 193 , perform basic processing on the multiple video streams, and then transmit the multiple video streams to the application processor.
  • the ISP can receive a video stream collected by one camera 193, perform basic processing on this video stream, and perform image processing, such as enlarging and cropping, on the video stream after basic processing to obtain another video stream, and then transmit the two video streams to the application processor.
  • the application processor can perform image fusion processing on the multiple video streams to obtain the first video stream, and can also generate a first multimedia file containing the first video stream. Further, the application processor can also display the video image of the first video stream on the recording interface through the video codec, the GPU and the display screen 194, so as to realize video preview.
  • the application processor can also obtain the image fusion parameters corresponding to the first video stream.
  • the image fusion parameters are used to indicate the image fusion method of the multiple video streams when the first video stream is obtained. The application processor then generates a second multimedia file containing the multiple video streams and the image fusion parameters corresponding to the first video stream, and stores the second multimedia file in association with the first multimedia file. The image fusion parameter in the second multimedia file is used to indicate the image fusion method that needs to be used in subsequent fusion of the multiple video streams in the second multimedia file. In this way, after the video shooting is completed, a video stream with a better image fusion effect can be generated according to the stored second multimedia file.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the terminal 100.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function. For example, save music, video and other files on an external memory card.
  • Internal memory 121 may be used to store computer-executable program code, which includes instructions.
  • the processor 110 executes various functional applications and data processing of the terminal 100 by executing instructions stored in the internal memory 121 .
  • the internal memory 121 may include a program storage area and a data storage area. The program storage area can store an operating system and at least one application program required for a function (such as a sound playback function, an image playback function, etc.).
  • the storage data area may store data created during use of the terminal 100 (such as audio data, phone book, etc.).
  • the internal memory 121 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, universal flash storage (UFS), etc.
  • the terminal 100 can implement audio functions, such as music playback, recording, etc., through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor.
  • the audio module 170 is used to convert digital audio information into an analog audio signal output, and is also used to convert an analog audio input into a digital audio signal. The audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be disposed in the processor 110, or some functional modules of the audio module 170 may be disposed in the processor 110.
  • the SIM card interface 195 is used to connect a SIM card.
  • the SIM card can be connected to or separated from the terminal 100 by inserting it into the SIM card interface 195 or pulling it out from the SIM card interface 195 .
  • the terminal 100 can support 1 or N SIM card interfaces, where N is an integer greater than 1.
  • SIM card interface 195 can support Nano SIM card, Micro SIM card, SIM card, etc. Multiple cards can be inserted into the same SIM card interface 195 at the same time. Multiple cards can be of the same type or different types.
  • the SIM card interface 195 is also compatible with different types of SIM cards.
  • the SIM card interface 195 is also compatible with external memory cards.
  • the terminal 100 interacts with the network through the SIM card to implement functions such as calls and data communications.
  • the terminal 100 adopts eSIM, that is, an embedded SIM card.
  • the eSIM card can be embedded in the terminal 100 and cannot be separated from the terminal 100.
  • the software system of the terminal 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture.
  • This embodiment of the present application takes the Android system with a layered architecture as an example to illustrate the software system of the terminal 100 .
  • FIG. 2 is a block diagram of a software system of a terminal 100 provided by an embodiment of the present application.
  • the layered architecture divides the software into several layers, and each layer has clear roles and division of labor.
  • the layers communicate through software interfaces.
  • the Android system is divided from top to bottom into the application layer (APP), the application framework layer (FWK), the Android runtime and system layer, and the kernel layer.
  • the application layer can include a series of application packages. As shown in Figure 2, the application package can include camera, gallery, calendar, calling, map, navigation, WLAN, Bluetooth, music, video, short message and other applications.
  • the application framework layer provides an application programming interface (API) and programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer can include a window manager, content provider, view system, phone manager, resource manager, notification manager, etc.
  • a window manager is used to manage window programs. The window manager can obtain the display size, determine whether there is a status bar, lock the screen, capture the screen, etc.
  • Content providers are used to store and retrieve data and make this data accessible to applications. These data can include videos, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
  • the view system includes visual controls, such as controls that display text, controls that display pictures, etc. The view system can be used to build the display interface of the application.
  • the display interface can be composed of one or more views. For example, it includes a view that displays SMS notification icons, a view that displays text, and a view that displays pictures.
  • the phone manager is used to provide communication functions of the terminal 100, such as management of call status (including connecting, hanging up, etc.).
  • the resource manager provides various resources to applications, such as localized strings, icons, pictures, layout files, video files, etc.
  • the notification manager allows the application to display notification information in the status bar. It can be used to convey notification-type messages, which can automatically disappear after a short stay without user interaction. For example, the notification manager is used to notify download completion, message reminders, etc.
  • Notifications can also appear in the status bar at the top of the system in the form of graphics or scroll-bar text, for example, notifications for applications running in the background.
  • Notifications can also appear on the screen in the form of a dialog window. For example, text information is prompted in the status bar, a prompt sound is emitted, the electronic device vibrates, or the indicator light flashes.
  • Android runtime includes core libraries and virtual machines.
  • the Android runtime is responsible for the scheduling and management of the Android system.
  • the core library contains two parts: one is the functional functions that need to be called by the Java language, and the other is the core library of Android.
  • the application layer and application framework layer run in virtual machines.
  • the virtual machine executes the java files of the application layer and application framework layer into binary files.
  • the virtual machine is used to perform object life cycle management, stack management, thread management, security and exception management, and garbage collection and other functions.
  • the system layer can include multiple functional modules, such as surface manager, media libraries, three-dimensional graphics processing library (such as OpenGL ES), two-dimensional graphics engine (such as SGL), etc.
  • the surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as static image files, etc.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, composition, and layer processing.
  • 2D Graphics Engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer includes at least display driver, camera driver, audio driver, sensor driver, etc.
  • terminals such as mobile phones, tablets, laptops, etc. can display video images 31 of each video stream in multiple video streams during the video shooting process.
  • the multiple video streams may be video streams collected by different cameras.
  • This recording scenario may be called a multi-camera simultaneous recording scenario.
  • the multiple video streams may be video streams collected by one camera but processed differently.
  • This recording scenario may be called a single-camera simultaneous recording scenario.
  • The first recording scenario: the multi-camera simultaneous recording scenario
  • In this scenario, multiple cameras are used to record simultaneously, and the video images of the video streams collected by the cameras are displayed in the recording interface (which may also be called the video preview interface or the video shooting interface).
  • the terminal has multiple cameras, and the shooting directions of the multiple cameras are different.
  • the terminal can activate a multi-camera recording function to record simultaneously through multiple cameras of the terminal itself, and then display the video image of the video stream collected by each of the multiple cameras in the recording interface.
  • the terminal may have a front camera and a rear camera. After the terminal starts the multi-camera recording function, it starts its own front camera and rear camera. The front camera collects one video stream and the rear camera collects one video stream. Afterwards, as shown in FIG. 4 , the terminal can display the video image 421 of the video stream collected by the front camera and the video image 422 of the video stream collected by the rear camera in the recording interface 41 .
  • the terminal is in a multi-screen collaboration state with other devices (which may be called collaborative devices). Both the terminal and the collaborative device have cameras, and the terminal can take pictures with the help of the camera of the collaborative device.
  • the terminal can activate the collaborative recording function to record simultaneously through the camera of the terminal and the camera of the collaborative device, and then display the video image of the video stream collected by the camera of the terminal and the video captured by the camera of the collaborative device in the recording interface Streaming video images.
  • the terminal and the collaborative device both have a camera.
  • After the terminal starts the collaborative recording function, it starts its own camera and instructs the collaborative device to start the camera of the collaborative device.
  • the camera of the terminal can capture a video stream
  • the camera of the collaborative device can capture a video stream
  • the collaborative device can send the video stream captured by its own camera to the terminal.
  • the terminal 501 can display the video image 521 of the video stream collected by its own camera and the video image 522 of the video stream collected by the camera of the collaborative device 502 in the recording interface 51 .
  • The second recording scenario: the single-camera simultaneous recording scenario
  • one camera is used to record the video, and the video images collected by this camera and processed through different processes are displayed in the recording interface.
  • the terminal has a camera.
  • the terminal can activate the single-camera recording function to record through the terminal's own camera, and then display the video images collected by the camera after different processing in the recording interface.
  • the terminal may have a rear camera. After the terminal starts the single-camera recording function, it starts its own rear camera, and the rear camera collects one video stream. The terminal enlarges and crops the video images of this video stream to obtain the video images of another video stream. After that, as shown in FIG. 6, the terminal can display the video image 622 of the original video stream captured by the rear camera and the video image 621 of another video stream obtained by enlarging and cropping in the recording interface 61.
  • the video image 622 is an original video image captured by the rear camera
  • the video image 621 is a video image obtained by enlarging and cropping the original video image 622 .
  • the terminal can obtain multiple video streams during the video shooting process, and display the video images of each of the multiple video streams in the recording interface.
  • the video images of each of the multiple video streams can be spliced according to a specific image splicing mode to obtain the video image of the fused video stream, and then the video image of the fused video stream is displayed in the recording interface.
  • the image splicing mode may include a top-down splicing mode, a left-right splicing mode, a picture-in-picture nested mode, etc.
  • the top-down splicing mode refers to splicing the video images of each video stream in the multi-channel video stream in order from top to bottom, so that the video image of the fused video stream obtained according to the top-down splicing mode contains the video images of each video stream in the multi-channel video stream arranged from top to bottom.
  • the video image 32 of the fused video stream displayed in the recording interface is obtained by splicing the video images of each video stream in the multiple video streams according to the top and bottom splicing mode.
  • the left-right splicing mode refers to splicing the video images of each video stream in the multi-channel video stream in order from left to right, so that the video image of the fused video stream obtained according to the left-right splicing mode contains the video images of each video stream in the multi-channel video stream arranged from left to right.
  • the picture-in-picture nested mode refers to the process of displaying the main picture in full screen and simultaneously displaying sub-pictures on a small area of the main picture. That is to say, the picture-in-picture nested mode refers to using the video image of one video stream in the multi-channel video stream as the main picture, and using the video images of other video streams in the multi-channel video stream except this video stream. As a sub-picture, the sub-picture is spliced onto a small area of the main picture. For example, as shown in Figure 7, the terminal can display the video image 32 of the fused video stream in the recording interface 71 during the multi-camera recording process.
  • the video image 32 of the fused video stream includes the video image 721 of the video stream collected by the front camera of the terminal and the video image 722 of the video stream collected by the rear camera of the terminal, where the video image 722 of the video stream collected by the rear camera is the main picture, and the video image 721 of the video stream collected by the front camera is a sub-picture occupying a small area of the main picture.
  • After the terminal obtains multiple video streams during the video shooting process, it needs to perform image fusion processing on the multiple video streams to obtain a fused video stream, so as to display the video image of the fused video stream on the recording interface. Moreover, after the video shooting is completed, the terminal can also save the fused video stream for subsequent viewing by the user.
  • However, due to the limitations of the camera device, processing chip, image algorithm, etc., it is difficult for the terminal to take video processing capabilities into account while ensuring real-time video recording, so the image fusion effect of the fused video stream obtained during the video shooting process is poor.
  • embodiments of the present application provide a video shooting method.
  • image fusion processing is performed on multiple video streams to obtain a first video stream.
  • Not only is a first multimedia file containing the first video stream generated, but also a second multimedia file containing the multi-channel video stream and the image fusion parameters corresponding to the first video stream is generated, and the first multimedia file is stored in association with the second multimedia file. In this way, after the video shooting is completed, the user can watch the first video stream with the image fusion effect in the stored first multimedia file, and can instantly share the first multimedia file with other people for viewing.
  • the terminal can also generate a fused video stream with an image fusion effect based on the multi-channel video streams and the image fusion parameters in the stored second multimedia file. Since the terminal does not need to perform real-time video recording after the video shooting is completed, it can provide higher video processing capabilities, so that the image fusion effect of the fused video stream generated based on the second multimedia file is better than the image fusion effect of the first video stream in the first multimedia file generated during the video shooting process, and the user can finally obtain a video stream with a better image fusion effect for playback.
  • FIG. 8 is a flow chart of a video shooting method provided by an embodiment of the present application. The method is applied to a terminal.
  • the terminal may be the terminal 100 described in the embodiments of FIGS. 1 to 2 above. Referring to Figure 8, the method includes:
  • Step 801 During the video shooting process, the terminal acquires multiple video streams.
  • In the multiple video streams, each timestamp has a corresponding frame of video image in each video stream; that is, the timestamp of the i-th frame video image of each video stream in the multi-channel video stream is the same, where i is a positive integer.
  • the multiple video streams may be video streams collected by different cameras.
  • the multiple video streams may be video streams collected by one camera but processed differently.
  • the terminal can obtain multiple video streams in the following two ways:
  • the first method the terminal obtains one video stream collected by each camera in multiple cameras to obtain multiple video streams.
  • This method is a multi-camera simultaneous recording scene, that is, multiple cameras record at the same time to obtain a video stream collected by each of the multiple cameras.
  • the multiple cameras may all be disposed on the terminal.
  • the terminal records video simultaneously through its multiple cameras, so that the terminal can obtain one video stream collected by each of the multiple cameras to obtain multiple video streams.
  • some of the multiple cameras may be disposed on the terminal, and another part of the cameras may be disposed on a collaboration device that is in a multi-screen collaboration state with the terminal.
  • In this case, the terminal records simultaneously through its own camera and the camera of the collaborative device. During recording, the collaborative device can send the video stream collected by its own camera to the terminal, so that the terminal can obtain the video stream collected by its own camera and the video stream collected by the camera of the collaborative device, thereby obtaining multiple video streams.
  • the second method the terminal obtains a video stream collected by the camera, performs image processing on this video stream, and obtains another video stream.
  • This method is a single-camera simultaneous recording scene, that is, recording through one camera to obtain one video stream collected by this camera, and performing image processing on this video stream to obtain another video stream, so that two video streams are obtained, including the original video stream and the image-processed video stream.
  • the terminal can also perform different image processing on this video stream to obtain different video streams.
  • In this way, at least three video streams can be obtained, including the original video stream and at least two video streams obtained by different image processing.
  • the terminal performs image processing on this video stream, that is, it processes the video images of this video stream. For example, it can enlarge and crop the video images of this video stream to obtain the video images of another video stream.
  • the camera can be set on the terminal or on a collaboration device in a multi-screen collaboration state with the terminal, which is not limited in the embodiments of the present application.
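  • For the single-camera scenario, the image processing that derives the second stream could, for example, be a centre crop followed by upscaling back to the original size (a digital zoom). The NumPy sketch below is an assumed example of such processing, using nearest-neighbour resizing to stay dependency-light.

```python
import numpy as np


def center_crop_zoom(frame: np.ndarray, zoom: float = 2.0) -> np.ndarray:
    """Derive a second video image from the original one by cropping the centre
    region and enlarging it back to the original size (nearest-neighbour)."""
    h, w = frame.shape[:2]
    ch, cw = int(h / zoom), int(w / zoom)
    y0, x0 = (h - ch) // 2, (w - cw) // 2
    crop = frame[y0:y0 + ch, x0:x0 + cw]
    # upscale the crop back to (h, w) with nearest-neighbour indexing
    ys = np.arange(h) * ch // h
    xs = np.arange(w) * cw // w
    return crop[ys][:, xs]


# original = np.zeros((1080, 1920, 3), dtype=np.uint8)   # frame from the camera
# derived  = center_crop_zoom(original, zoom=2.0)        # frame of the derived video stream
```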
  • Step 802 The terminal performs image fusion processing on the multiple video streams to obtain the first video stream.
  • the terminal performs image fusion processing on the multiple video streams, that is, performs fusion processing on the video images of each video stream in the multiple video streams to obtain the video image of the first video stream.
  • the first video stream is a video stream with a specific image fusion effect.
  • Since the timestamps of the video images of each video stream in the multiple video streams are aligned, multiple video images with the same timestamp in the multiple video streams can be fused. Specifically, every time the i-th frame video image of each video stream in the multi-channel video stream is obtained, the i-th frame video images of each video stream are fused to obtain the i-th frame video image of the first video stream. That is, the video images of each video stream in the multi-channel video stream are fused frame by frame to obtain each frame of the video image of the first video stream, so that the timestamps of the video images of the first video stream are also aligned with the timestamps of the video images of each video stream in the multi-channel video stream. In this way, after the image fusion processing is performed on the multiple video streams, the obtained video image of the first video stream contains the fused video images of each video stream in the multiple video streams.
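  • Conceptually, this frame-by-frame fusion can be sketched as the loop below: for each index i, the i-th frames of all streams (which share the same timestamp) are fused into the i-th frame of the first video stream, which inherits that timestamp. The splice callable stands in for the concrete fusion routine and is supplied by the caller; it is an illustrative assumption, not the method defined by this application.

```python
from typing import Callable, List, Sequence, Tuple

import numpy as np

Frame = np.ndarray  # one video image, shape (H, W, 3)


def fuse_streams(streams: Sequence[Sequence[Tuple[int, Frame]]],
                 splice: Callable[[List[Frame]], Frame]) -> List[Tuple[int, Frame]]:
    """Fuse multiple timestamp-aligned video streams frame by frame.

    Each stream is a sequence of (timestamp, frame) pairs; the i-th frames of all
    streams carry the same timestamp, so the fused i-th frame inherits it."""
    fused = []
    for frames_at_i in zip(*streams):
        ts = frames_at_i[0][0]                 # shared timestamp of the i-th frames
        images = [f for _, f in frames_at_i]
        fused.append((ts, splice(images)))     # one fused frame of the first video stream
    return fused


# e.g. top-bottom splicing of two aligned streams (frames of equal width assumed):
# first_stream = fuse_streams([stream_a, stream_b], splice=lambda imgs: np.vstack(imgs))
```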
  • the terminal when the terminal performs fusion processing on the video images of each of the multiple video streams, it can splice the video images of each of the multiple video streams according to specific image fusion parameters.
  • the image fusion parameter is used to indicate the image fusion method of the multiple video streams.
  • the image fusion parameters may include an image splicing mode, and may further include an image splicing position of each video stream in the multiple video streams.
  • the image splicing mode may include one or more of up-down splicing mode, left-right splicing mode, picture-in-picture nesting mode, etc., which are not limited in the embodiments of the present application.
  • the image splicing position of any one of the multiple video streams is used to indicate the position of the video image of this video stream when splicing is performed according to the corresponding image splicing mode.
  • the multiple video streams include video stream A and video stream B.
  • the image splicing mode is an up-down splicing mode
  • the image splicing position of video stream A is up
  • the image splicing position of video stream B is down
  • the terminal can splice the video image of video stream A on the video image of video stream B.
  • the upper half of the video image of the first video stream is the video image of video stream A
  • the lower half is the video image of video stream B.
  • If the image splicing mode is the picture-in-picture nesting mode, the terminal can splice the video image of video stream B onto a small area of the video image of video stream A to obtain the video image of the first video stream.
  • the main picture of the video image of the first video stream is the video image of video stream A
  • the sub-picture is the video image of video stream B.
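  • The worked example above (video stream A on top and video stream B below, or B nested as a sub-picture of A) could be realised with plain NumPy as in the following sketch; the fixed top-left corner position and the 1/4-scale sub-picture are illustrative choices, not values specified by this application.

```python
import numpy as np


def splice_top_bottom(img_a: np.ndarray, img_b: np.ndarray) -> np.ndarray:
    """Video image of stream A forms the upper half, stream B the lower half."""
    return np.vstack([img_a, img_b])             # both images must share the same width


def splice_picture_in_picture(img_main: np.ndarray, img_sub: np.ndarray,
                              scale: int = 4, margin: int = 16) -> np.ndarray:
    """Nest a shrunken copy of img_sub onto a small area of img_main (top-left corner)."""
    small = img_sub[::scale, ::scale]            # crude 1/scale downsampling of the sub-picture
    out = img_main.copy()
    sh, sw = small.shape[:2]
    out[margin:margin + sh, margin:margin + sw] = small
    return out


# a = np.zeros((540, 960, 3), dtype=np.uint8)        # frame of video stream A
# b = np.full((540, 960, 3), 255, dtype=np.uint8)    # frame of video stream B
# fused_tb  = splice_top_bottom(a, b)                # A above, B below
# fused_pip = splice_picture_in_picture(a, b)        # A as main picture, B as sub-picture
```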
  • Since the terminal fuses the video images of each video stream in the multi-channel video streams frame by frame to obtain each frame of the video image of the first video stream, the image fusion parameters also exist frame by frame. That is, the i-th frame video image of each video stream in the multi-channel video stream corresponds to an image fusion parameter, and the i-th frame video image of the first video stream obtained based on the i-th frame video images of each video stream also corresponds to this image fusion parameter. This image fusion parameter can also have a timestamp, and the timestamp of this image fusion parameter is aligned with the timestamp of the i-th frame video image of each video stream in the multi-channel video stream and the timestamp of the i-th frame video image of the first video stream.
  • the image fusion parameters used by the terminal when fusing the video images of each video stream in the multiple video streams can be default, or can be set in advance by the user according to their own needs before shooting the video. , or may be automatically determined by the terminal based on the content of the video image of each video stream in the multiple video streams, which is not limited in the embodiments of the present application.
  • users can also actively adjust image fusion parameters during video shooting.
  • the default image stitching mode is top-bottom stitching mode.
  • the terminal uses the default up-and-down splicing mode to splice the video images of each video stream in the multiple video streams.
  • the user can adjust the image splicing mode to the picture-in-picture nesting mode in the terminal.
  • the terminal will thereafter use the picture-in-picture nesting mode to continue splicing the video images of each video stream in the multiple video streams.
  • the image fusion parameters of each frame of the video image of each video stream in the multi-channel video stream may be the same or different.
  • the image fusion method of the multiple video streams may be constantly changing during the entire video shooting process. This change may come from the user's manual adjustment. For example, the user may manually adjust the image stitching mode during the video shooting process. , or this change can also come from the automatic adjustment of the terminal. For example, the terminal can select different image fusion parameters according to the different contents of the video images of the multi-channel video stream.
  • the terminal has a front camera and a rear camera.
  • the terminal uses the default image fusion parameters to perform image fusion processing on the multi-channel video stream.
  • the image splicing mode in the default image fusion parameters is the up-down splicing mode, the image splicing position of the video stream captured by the front camera is up, and the image splicing position of the video stream collected by the rear camera is down.
  • In this case, as shown in Figure 4, the terminal can splice the video image 421 of the video stream collected by the front camera and the video image 422 of the video stream collected by the rear camera according to the up-down splicing mode to obtain the video image 32 of the first video stream displayed in the recording interface 41. In the video image 32 of the first video stream, the video image 421 and the video image 422 are arranged in order from top to bottom.
Afterwards, if the user adjusts the image fusion parameters during shooting so that the image splicing mode becomes the picture-in-picture nested mode, with the video stream captured by the front camera as the sub-picture and the video stream captured by the rear camera as the main picture, then, as shown in Figure 7, the terminal can splice the video image 721 of the video stream captured by the front camera with the video image 722 of the video stream captured by the rear camera in the picture-in-picture nested mode to obtain the video image 32 of the first video stream displayed in the recording interface 71; in the video image 32 of the first video stream, the video image 722 is the main picture and the video image 721 is a sub-picture occupying a small area of the main picture.
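A minimal sketch of the two splicing modes mentioned here, assuming each frame is a plain RGB array of equal width; the resizing and inset-placement choices are illustrative only:

    import numpy as np

    def splice_top_bottom(front: np.ndarray, rear: np.ndarray) -> np.ndarray:
        # Stack the front-camera frame above the rear-camera frame.
        width = min(front.shape[1], rear.shape[1])
        return np.vstack([front[:, :width], rear[:, :width]])

    def splice_picture_in_picture(sub: np.ndarray, main: np.ndarray,
                                  scale: float = 0.25, margin: int = 16) -> np.ndarray:
        # Overlay a shrunken sub-picture onto a small corner region of the main picture.
        out = main.copy()
        sub_h, sub_w = int(sub.shape[0] * scale), int(sub.shape[1] * scale)
        ys = (np.arange(sub_h) / scale).astype(int)   # nearest-neighbour shrink, dependency-free
        xs = (np.arange(sub_w) / scale).astype(int)
        out[margin:margin + sub_h, margin:margin + sub_w] = sub[ys][:, xs]
        return out

    front = np.zeros((720, 1280, 3), dtype=np.uint8)      # e.g. front-camera frame
    rear = np.full((720, 1280, 3), 255, dtype=np.uint8)   # e.g. rear-camera frame
    fused_tb = splice_top_bottom(front, rear)             # 1440 x 1280 fused frame
    fused_pip = splice_picture_in_picture(front, rear)    # 720 x 1280 frame with an inset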
Step 803: The terminal obtains the image fusion parameters corresponding to the first video stream.

The image fusion parameters (also called metadata) corresponding to the first video stream are used to indicate the image fusion method of the multiple video streams when the first video stream is obtained; that is, they are parameter information describing how the video images of the multiple video streams are spliced. After the terminal fuses the video images of the multiple video streams frame by frame to obtain each frame of the first video stream, it can also obtain the image fusion parameters corresponding to each frame of the first video stream. The image fusion parameters corresponding to the i-th frame video image of the first video stream indicate the image fusion method applied to the i-th frame video images of the multiple video streams when obtaining the i-th frame video image of the first video stream; in other words, they are the image fusion parameters used when fusing the i-th frame video images of the multiple video streams.
Since the image fusion parameters corresponding to each frame of the first video stream are obtained frame by frame, the image fusion parameters corresponding to the first video stream actually form a parameter stream. The image fusion parameters in this parameter stream carry timestamps, and these timestamps are aligned with the timestamps of the video images of the first video stream. The image fusion parameters in the parameter stream indicate how the video images of the first video stream are obtained from the video images of the multiple video streams; that is, the parameter stream is a frame-by-frame description of the image fusion method.
Optionally, after obtaining the first video stream, the terminal can also display the video images of the first video stream on the recording interface; that is, every time a frame of the first video stream is obtained, that frame can be displayed on the recording interface. In this way, a real-time preview of the captured video is achieved during shooting, allowing the user to know the image fusion effect of the video in a timely manner.
For example, the terminal has a front camera and a rear camera. The image splicing mode in the image fusion parameters is the top-bottom splicing mode, the image splicing position of the video stream captured by the front camera is the top, and the image splicing position of the video stream captured by the rear camera is the bottom. As shown in Figure 4, the terminal can splice the video image 421 of the video stream captured by the front camera with the video image 422 of the video stream captured by the rear camera in the top-bottom splicing mode to obtain the video image 32 of the first video stream displayed in the recording interface 41; in the video image 32 of the first video stream, the video image 421 and the video image 422 are arranged from top to bottom.
For another example, the terminal has a front camera and a rear camera. The image splicing mode in the image fusion parameters is the picture-in-picture nested mode, the image splicing position of the video stream captured by the front camera is the sub-picture, and the image splicing position of the video stream captured by the rear camera is the main picture. As shown in Figure 7, the terminal can splice the video image 721 of the video stream captured by the front camera with the video image 722 of the video stream captured by the rear camera in the picture-in-picture nested mode to obtain the video image 32 of the first video stream displayed in the recording interface 71; the video image 722 is the main picture and the video image 721 is a sub-picture occupying a small area of the main picture.
Step 804: The terminal generates a first multimedia file including the first video stream.

The first multimedia file is a file used to play the first video stream, and the first video stream in the first multimedia file has an image fusion effect. The terminal can continuously obtain fused video images of the first video stream during the video shooting process, so it can also continuously generate the first multimedia file from the first video stream. In this way, once shooting is completed, the terminal has a first multimedia file containing the complete first video stream, which is convenient for users to share instantly.
Optionally, when generating the first multimedia file including the first video stream, the terminal can first encode the first video stream to obtain a video file, and then encapsulate the video file together with other related files (including but not limited to an audio file) to obtain the first multimedia file. Of course, the terminal can also generate the first multimedia file in other ways, which is not limited in the embodiments of the present application. The format of the video file can be a preset format, such as the moving picture experts group 4 (MPEG-4, i.e. MP4) format or the flash video (FLV) streaming media format; it can of course also be another format, which is not limited in the embodiments of the present application.
The audio file may be obtained by encoding an audio stream. The audio stream may be continuously collected by the terminal during video shooting, for example by the terminal's microphone, and the timestamps of the audio frames of the audio stream are aligned with the timestamps of the video images of each of the multiple video streams. The format of the audio file may be the same as or different from the format of the video file; for example, it may be the MP4 format, the FLV format, or the advanced audio coding (AAC) format, which is not limited in the embodiments of the present application.
Optionally, when encapsulating the video file with the other related files, the video file can be used as a video track and the other related files can be used as other tracks (for example, the audio file can be used as an audio track); the video track and the other tracks are then packed together to obtain a multi-track file as the first multimedia file. A track is a sequence of timestamped data. For example, the terminal can use a video multiplexer to encapsulate (also called mux) the video track corresponding to the video file and the audio track corresponding to the audio file into an MP4 file; this MP4 file is a multi-track file and is also the first multimedia file.
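As a rough, non-normative model of this packing step (not an actual MP4 muxer), the sketch below represents each track as a timestamp-ordered list of samples and the first multimedia file as a container holding the video track and the audio track; all field names are illustrative:

    def make_track(kind, samples):
        # A track is modelled as a kind plus a timestamp-ordered list of samples.
        return {"kind": kind, "samples": sorted(samples, key=lambda s: s["ts_us"])}

    def mux_first_multimedia_file(encoded_video_samples, encoded_audio_samples):
        # Pack the encoded first video stream and the audio stream into one multi-track container.
        return {
            "container": "mp4-like",  # stands in for the MP4 output of a real video multiplexer
            "tracks": [make_track("video", encoded_video_samples),
                       make_track("audio", encoded_audio_samples)],
        }

    # Toy usage: two video samples and two audio samples with aligned timestamps.
    first_file = mux_first_multimedia_file(
        [{"ts_us": 0, "data": b"<video frame 0>"}, {"ts_us": 33_333, "data": b"<video frame 1>"}],
        [{"ts_us": 0, "data": b"<audio frame 0>"}, {"ts_us": 21_333, "data": b"<audio frame 1>"}],
    )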
Step 805: The terminal generates a second multimedia file including the multiple video streams and the image fusion parameters.

Each of the multiple video streams is stored separately in the second multimedia file, that is, each video stream exists independently, so the second multimedia file can be used to play each of the multiple video streams individually. The multiple video streams in the second multimedia file are original video streams that have not undergone image fusion processing, that is, video streams without an image fusion effect. The image fusion parameters in the second multimedia file indicate the image fusion method to be used when the multiple video streams in the second multimedia file are fused later.

During the video shooting process, the terminal can continuously obtain the video images of each of the multiple video streams, and can also continuously obtain the image fusion parameters while performing image fusion processing on the multiple video streams, so it can continuously generate the second multimedia file from the multiple video streams and the image fusion parameters. In this way, once shooting is completed, the terminal has a second multimedia file containing the complete multiple video streams and the complete image fusion parameters, which facilitates post-processing of the multiple video streams according to the image fusion parameters and thereby enlarges the post-processing space of the multiple video streams.
In a specific implementation, step 805 can be implemented in the following two possible ways.

First way: the terminal encodes each of the multiple video streams separately to obtain multiple video files; for each of the multiple video files, it encapsulates that video file together with the image fusion parameters to obtain a corresponding encapsulated file; the multiple encapsulated files corresponding to the multiple video files are determined as the second multimedia file. The format of each video file may be a preset format, such as the MP4 format or the FLV format, which is not limited in the embodiments of the present application. In this way, the video file of each of the multiple video streams is encapsulated individually to obtain a corresponding encapsulated file, so an encapsulated file is obtained for each of the multiple video streams, that is, multiple encapsulated files are obtained; the second multimedia file then includes these multiple encapsulated files.
Optionally, when encapsulating a video file with the image fusion parameters, the terminal can also encapsulate other related files; for example, the terminal can encapsulate the video file, the image fusion parameters, and an audio file to obtain a corresponding encapsulated file. The audio file may be obtained by encoding an audio stream that is continuously collected by the terminal during video shooting, for example by the terminal's microphone, and the timestamps of the audio frames of the audio stream are aligned with the timestamps of the video images of each of the multiple video streams. The format of the audio file may be the same as or different from the format of the video file, for example the MP4 format, the FLV format, or the AAC format, which is not limited in the embodiments of the present application.
Optionally, when encapsulating a video file, the image fusion parameters, and other related files, the terminal can use the video file as a video track, the image fusion parameters as a parameter track, and the other related files as other tracks, and then encapsulate the video track, the parameter track, and the other tracks to obtain a corresponding multi-track file as the encapsulated file. The multiple multi-track files corresponding one-to-one to the multiple video files are then determined as the second multimedia file. For example, the terminal can use a video multiplexer to encapsulate the video track corresponding to a video file, the parameter track corresponding to the image fusion parameters, and the audio track corresponding to the audio file into an MP4 file, which is a multi-track file; the multiple encapsulated MP4 files corresponding one-to-one to the multiple video files are then determined as the second multimedia file.
Second way: the terminal encodes each of the multiple video streams separately to obtain multiple video files, and then encapsulates the multiple video files together with the image fusion parameters to obtain the second multimedia file. The format of each video file may be a preset format, such as the MP4 format or the FLV format, which is not limited in the embodiments of the present application. In this way, the multiple video files of the multiple video streams are encapsulated as a whole to obtain a single encapsulated file as the second multimedia file. Optionally, when encapsulating the multiple video files with the image fusion parameters, the terminal can also encapsulate other related files; for example, the terminal can encapsulate the multiple video files, the image fusion parameters, and an audio file to obtain the second multimedia file.
The audio file may be obtained by encoding an audio stream that is continuously collected by the terminal during video shooting, for example by the terminal's microphone, and the timestamps of the audio frames of the audio stream are aligned with the timestamps of the video images of each of the multiple video streams. The format of the audio file may be the same as or different from the format of the video file, for example the MP4 format, the FLV format, or the AAC format, which is not limited in the embodiments of the present application.
Optionally, when encapsulating the multiple video files, the image fusion parameters, and other related files, each of the multiple video files can be used as a video track to obtain multiple video tracks, the image fusion parameters can be used as a parameter track, and the other related files can be used as other tracks; the multiple video tracks, the parameter track, and the other tracks are then encapsulated to obtain the second multimedia file. For example, the terminal can use a video multiplexer to encapsulate the multiple video tracks corresponding to the multiple video files, the parameter track corresponding to the image fusion parameters, and the audio track corresponding to the audio file into a single MP4 file; this MP4 file is a multi-track file and is the second multimedia file.
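The two ways of step 805 could be illustrated with the same kind of informal container model used above, one container per raw stream versus a single container holding every raw stream plus the parameter track; the structure and names below are only an assumption-laden sketch of what the text describes, not a real muxer:

    def track(kind, samples):
        # A track: a kind plus a timestamp-ordered list of samples.
        return {"kind": kind, "samples": sorted(samples, key=lambda s: s["ts_us"])}

    def second_file_way_one(raw_streams, fusion_params, audio_samples):
        # Way 1: one encapsulated file per raw video stream, each also carrying the parameter track.
        param_track = track("fusion-params", fusion_params)
        return [
            {"container": "mp4-like", "stream": name,
             "tracks": [track("video", samples), param_track, track("audio", audio_samples)]}
            for name, samples in raw_streams.items()
        ]

    def second_file_way_two(raw_streams, fusion_params, audio_samples):
        # Way 2: a single encapsulated file holding every raw video track plus the parameter track.
        tracks = [track("video", samples) for samples in raw_streams.values()]
        tracks.append(track("fusion-params", fusion_params))
        tracks.append(track("audio", audio_samples))
        return {"container": "mp4-like", "tracks": tracks}

    streams = {"front": [{"ts_us": 0, "data": b"<frame A0>"}],
               "rear":  [{"ts_us": 0, "data": b"<frame B0>"}]}
    params = [{"ts_us": 0, "data": {"mode": "top_bottom", "front": "top", "rear": "bottom"}}]
    audio = [{"ts_us": 0, "data": b"<aac frame>"}]
    way_one = second_file_way_one(streams, params, audio)   # list of per-stream containers
    way_two = second_file_way_two(streams, params, audio)   # one multi-track container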
Step 806: The terminal stores the first multimedia file and the second multimedia file in association.

The first video stream in the first multimedia file has an image fusion effect. In this way, when video shooting ends, the user can watch the first video stream with the image fusion effect in the stored first multimedia file, and can instantly share the first multimedia file with others for viewing.

The multiple video streams in the second multimedia file are original video streams that have not undergone image fusion processing, that is, video streams without an image fusion effect, and the image fusion parameters in the second multimedia file indicate the image fusion method to be used when these video streams are fused later. In this way, after video shooting ends, the terminal can not only play each of the multiple video streams from the stored second multimedia file, but can also generate a fused video stream with an image fusion effect from the multiple video streams and the image fusion parameters in the stored second multimedia file. Since the terminal no longer needs to record video in real time after shooting is completed, it can provide higher video processing capability, so the image fusion effect of the fused video stream generated from the second multimedia file is better than that of the first video stream in the first multimedia file generated during shooting; the user ultimately obtains a video stream with a better image fusion effect for playback.
Optionally, when the terminal stores the first multimedia file and the second multimedia file in association, it may bind the first multimedia file and the second multimedia file together to form a video container, which in the embodiments of the present application may be called a dual video container. That is, the terminal stores the first multimedia file and the second multimedia file in the dual video container to achieve their associated storage.

For example, as shown in Figure 9, the dual video container can store the first multimedia file, which contains the first video stream with an image fusion effect, and can also store the second multimedia file, which includes multiple encapsulated files, such as the encapsulated file A and the encapsulated file B shown in Figure 9. Each of these encapsulated files contains one video stream and the image fusion parameters, and each such video stream is an original video stream without an image fusion effect; in Figure 9, the encapsulated file A contains the video stream A and the encapsulated file B contains the video stream B, and neither video stream A nor video stream B has an image fusion effect.
For another example, as shown in Figure 10, the dual video container can store the first multimedia file, which contains the first video stream with an image fusion effect, and can also store the second multimedia file, which contains multiple video streams (such as the video stream A and the video stream B shown in Figure 10) and the image fusion parameters; the multiple video streams are all original video streams without image fusion effects.
The implementation specifications of the dual video container differ depending on the recording scene. For example, the implementation specifications of the dual video container can be as shown in Table 1 below. It should be noted that the embodiments of the present application only take Table 1 as an example to describe the implementation specifications of the dual video container; Table 1 does not limit the embodiments of the present application.
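A minimal, purely illustrative model of such a dual video container, which simply binds the shareable fused file and the raw-streams-plus-parameters file under one identifier; none of these field names come from the patent:

    import uuid

    def make_dual_video_container(first_multimedia_file, second_multimedia_file):
        # Bind the fused (shareable) file and the raw-streams-plus-parameters file together.
        return {
            "container_id": str(uuid.uuid4()),
            "fused": first_multimedia_file,    # first multimedia file: fused first video stream
            "raw": second_multimedia_file,     # second multimedia file: original streams + fusion parameters
        }

    container = make_dual_video_container(
        first_multimedia_file={"container": "mp4-like",
                               "tracks": ["<fused video track>", "<audio track>"]},
        second_multimedia_file={"container": "mp4-like",
                                "tracks": ["<video track A>", "<video track B>",
                                           "<fusion-parameter track>", "<audio track>"]},
    )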
During the video shooting process, the terminal can continuously generate the first multimedia file and the second multimedia file and store the two in association. Further, after video shooting is completed, the terminal can also display the first video stream of the stored first multimedia file in a video list (also called a gallery), so that the user can choose to play the first video stream in the first multimedia file. An association button can also be displayed in the video list; the association button is used to indicate displaying the second multimedia file associated with the first multimedia file. If the terminal detects a selection operation on the association button, it can display the multiple video streams in the second multimedia file, so that the user can know from which original video streams the first video stream in the first multimedia file was fused, and can conveniently choose to play any of the multiple video streams in the second multimedia file.
For example, as shown in Figure 11, after video shooting is completed, the terminal can display the first video stream 1102 of the first multimedia file in the video list 1101 and display the association button 1103. The user may choose to play the first video stream 1102. In response to a click operation (i.e., a selection operation) on the association button 1103, the terminal displays the multiple video streams 1104 in the second multimedia file, and the user can choose to play any of the multiple video streams 1104.
Optionally, the terminal may display, in the video list, a video thumbnail corresponding to each of the multiple video streams in the second multimedia file. In this way, if the terminal detects a selection operation on any displayed video thumbnail, it can display the video stream corresponding to that video thumbnail in the second multimedia file, so that the user can choose to play this video stream. Of course, the terminal can also display the multiple video streams in the second multimedia file in other ways, which is not limited in the embodiments of the present application. Further, after video shooting is completed, the terminal can also obtain the multiple video streams from the second multimedia file and then play at least one of them; for example, the terminal can display the multiple video streams in the video list, and the user can then select at least one of them to play.
Afterwards, if the terminal receives a fusion adjustment instruction for the video images of the at least one video stream during its playback, it updates the image fusion parameters in the second multimedia file according to the fusion adjustment information carried by the fusion adjustment instruction. The fusion adjustment instruction indicates the image fusion method to which the multiple video streams should be adjusted. During playback of the at least one video stream, the user can manually trigger a fusion adjustment instruction according to their own needs; the instruction indicates a change in the image fusion method, for example a change in the image splicing mode and/or a change in the image splicing position of each video stream. That is, the fusion adjustment information carried by the fusion adjustment instruction may include the image splicing mode to be adjusted to and/or the image splicing positions to which each video stream should be adjusted. The terminal can then modify the image fusion parameters in the second multimedia file according to the fusion adjustment information, so that subsequent image fusion processing performed according to the image fusion parameters in the second multimedia file meets the user's latest needs.
For example, the multiple video streams include a video stream A and a video stream B. During video shooting, the image splicing mode of video stream A and video stream B is the top-bottom splicing mode in the first 10 seconds and the left-right splicing mode after 10 seconds. Accordingly, the image splicing mode in the image fusion parameters whose timestamps fall within the first 10 seconds is the top-bottom splicing mode, and the image splicing mode in the image fusion parameters whose timestamps fall after 10 seconds is the left-right splicing mode. After video shooting is completed, the terminal plays video stream A or video stream B, or plays video stream A and video stream B simultaneously, according to the second multimedia file. If the user wants to adjust the image splicing mode of the first 3 seconds to the left-right splicing mode, then during playback of video stream A and/or video stream B the user can trigger a fusion adjustment instruction for the video images of the first 3 seconds, which indicates that the image splicing mode of the first 3 seconds should be adjusted to the left-right splicing mode. The terminal updates the image fusion parameters in the second multimedia file according to this fusion adjustment instruction, so that the image splicing mode in the image fusion parameters whose timestamps fall within the first 3 seconds becomes the left-right splicing mode, the image splicing mode in the image fusion parameters whose timestamps fall between 3 seconds and 10 seconds remains the top-bottom splicing mode, and the image splicing mode in the image fusion parameters whose timestamps fall after 10 seconds remains the left-right splicing mode.
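A small sketch of this kind of timestamp-ranged update, assuming the image fusion parameters are kept as a simple list of timestamped records with an illustrative "mode" field:

    def apply_fusion_adjustment(param_track, start_us, end_us, new_mode):
        # Overwrite the splicing mode of every fusion parameter whose timestamp falls in [start_us, end_us).
        for param in param_track:
            if start_us <= param["ts_us"] < end_us:
                param["mode"] = new_mode
        return param_track

    # Parameters at 1-second granularity for simplicity: top-bottom for 0-10 s, left-right afterwards.
    params = [{"ts_us": s * 1_000_000, "mode": "top_bottom" if s < 10 else "left_right"}
              for s in range(20)]
    # The user adjusts the first 3 seconds to the left-right splicing mode.
    apply_fusion_adjustment(params, start_us=0, end_us=3_000_000, new_mode="left_right")
    assert [p["mode"] for p in params[:4]] == ["left_right", "left_right", "left_right", "top_bottom"]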
Further, after video shooting is completed, the terminal can generate a third multimedia file based on the second multimedia file. Specifically, the terminal can obtain the multiple video streams and the image fusion parameters from the second multimedia file, perform image fusion processing on the multiple video streams according to the image fusion parameters to obtain a second video stream, and generate the third multimedia file from the second video stream. Furthermore, the terminal can update the first multimedia file stored in association with the second multimedia file to the third multimedia file.

In this case, the image fusion parameters corresponding to the first video stream are the same as those corresponding to the second video stream; that is, the terminal uses the same image fusion method on the multiple video streams to obtain the first video stream and the second video stream. However, during video shooting, limitations of the camera device, processing chip, image algorithms and the like make it difficult for the terminal to maintain high video processing capability while guaranteeing real-time recording, so the image fusion effect of the first video stream generated during shooting is often not ideal. After shooting is completed, the terminal no longer needs to record in real time and can provide higher video processing capability, so the image fusion effect of the second video stream generated at this point is better than that of the first video stream generated during shooting. Updating the first multimedia file stored in association with the second multimedia file to the third multimedia file therefore ensures that the video stream in the multimedia file stored in association with the second multimedia file has the better image fusion effect, so that the user ultimately obtains a video stream with a better image fusion effect for playback.
It should be noted that if the image fusion parameters in the second multimedia file have been updated according to a fusion adjustment instruction triggered by the user, the terminal can again generate a third multimedia file based on the second multimedia file, and then update the multimedia file stored in association with the second multimedia file (which may be the first multimedia file or an older third multimedia file) to the newly generated third multimedia file, so that the video stream in the multimedia file stored in association with the second multimedia file is a video stream with a good image fusion effect that meets the user's latest image fusion requirements.
Optionally, when obtaining the multiple video streams and the image fusion parameters from the second multimedia file, the terminal can first decapsulate (demux) the second multimedia file to obtain the multiple video files and the image fusion parameters, and then decode each of the multiple video files to obtain the multiple video streams. The manner in which the terminal generates the third multimedia file from the second video stream is similar to the manner described above in which the first multimedia file is generated from the first video stream, and is not described again in the embodiments of this application. For example, the terminal decapsulates the second multimedia file to obtain a video file A, a video file B, and the image fusion parameters, decodes the video file A to obtain the video stream A and the video file B to obtain the video stream B, and then performs image fusion processing on video stream A and video stream B according to the image fusion parameters to obtain the second video stream. Afterwards, the terminal encodes the second video stream to obtain a video file C and encapsulates the video file C to obtain the third multimedia file.
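An informal sketch of this post-shoot regeneration step (demux, decode, re-fuse, re-encode, re-mux); the helper callables are hypothetical stand-ins for the platform's demuxer, decoder, fusion algorithm, encoder, and muxer:

    def generate_third_multimedia_file(second_multimedia_file,
                                       demux, decode, fuse_frames, encode, mux):
        # Re-fuse the raw streams offline, where higher-quality (non-real-time) fusion can be used.
        video_files, fusion_params = demux(second_multimedia_file)   # e.g. video file A, video file B, params
        raw_streams = [decode(f) for f in video_files]               # video stream A, video stream B, ...
        second_video_stream = [
            fuse_frames([stream[i] for stream in raw_streams], fusion_params[i])
            for i in range(min(len(s) for s in raw_streams))
        ]
        video_file_c = encode(second_video_stream)
        return mux(video_file_c)                                     # third multimedia file

    # Toy usage with stand-in callables, just to exercise the flow.
    third = generate_third_multimedia_file(
        second_multimedia_file={"files": [[1, 2], [10, 20]], "params": ["top_bottom", "top_bottom"]},
        demux=lambda f: (f["files"], f["params"]),
        decode=lambda f: f,
        fuse_frames=lambda frames, p: (p, frames),
        encode=lambda stream: {"codec": "stub", "frames": stream},
        mux=lambda video_file: {"container": "mp4-like", "tracks": [video_file]},
    )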
To sum up, in the embodiments of the present application, multiple video streams are acquired during the video shooting process, image fusion processing is performed on the multiple video streams to obtain the first video stream, and the image fusion parameters corresponding to the first video stream are obtained; these image fusion parameters indicate the image fusion method of the multiple video streams when the first video stream is obtained. Afterwards, a first multimedia file containing the first video stream is generated, as well as a second multimedia file containing the multiple video streams and the image fusion parameters, where the image fusion parameters in the second multimedia file indicate the image fusion method to be used when the multiple video streams in the second multimedia file are fused later. The first multimedia file and the second multimedia file are stored in association. In this way, after video shooting is completed, the terminal can also generate a fused video stream with an image fusion effect from the multiple video streams and the image fusion parameters in the stored second multimedia file. Since the terminal no longer needs to record video in real time after shooting is completed, it can provide higher video processing capability, so the image fusion effect of the fused video stream generated from the second multimedia file is better than that of the first video stream in the first multimedia file generated during shooting, and the user ultimately obtains a video stream with a better image fusion effect for playback.
Figure 13 is a schematic diagram of a video shooting method provided by an embodiment of the present application. This method applies to the multi-camera simultaneous recording scenario, in which the terminal records through a camera A and a camera B at the same time. In this scenario, the method may include the following steps (1) to (4):

(1) Camera A collects the video stream A. After being processed by the ISP front-end module 0 and the ISP back-end module 0, the video stream A is transmitted to the image fusion module and the associated storage module. The video images of the video stream A collected by camera A can be in RAW format; the ISP front-end module 0 can convert the RAW-format video images of the video stream A into YUV-format video images, and the ISP back-end module 0 can perform basic processing on the YUV-format video images of the video stream A, such as adjusting contrast and removing noise.
(2) Camera B collects the video stream B. After being processed by the ISP front-end module 1 and the ISP back-end module 1, the video stream B is transmitted to the image fusion module and the associated storage module. The video images of the video stream B collected by camera B can be in RAW format; the ISP front-end module 1 can convert the RAW-format video images of the video stream B into YUV-format video images, and the ISP back-end module 1 can perform basic processing on the YUV-format video images of the video stream B, such as adjusting contrast and removing noise.
(3) The image fusion module performs image fusion processing on the video stream A and the video stream B to obtain the first video stream, and sends the image fusion parameters corresponding to the first video stream to the associated storage module. The first video stream has an image fusion effect. Optionally, the video images of the first video stream can be displayed on the recording interface as preview video images to implement video preview, and a first multimedia file including the first video stream can also be generated and stored.
(4) The associated storage module generates a second multimedia file including the video stream A, the video stream B, and the image fusion parameters, and stores the second multimedia file in association with the first multimedia file.
In the above steps (1) to (4), the preview video image is a video image of the first video stream; that is, the preview video image and the video images of the first video stream in the stored first multimedia file correspond to the same image fusion method. However, the embodiments of the present application only describe this as an example; in actual use, the image fusion methods corresponding to the preview video image and to the video images of the first video stream in the stored first multimedia file may also be different. In that case, as shown in Figure 14, the method may include the following steps a to e:
Step a: Camera A collects the video stream A. After being processed by the ISP front-end module 0 and the ISP back-end module 0, the video stream A is transmitted to the preview module, the film production module, and the associated storage module.

Step b: Camera B collects the video stream B. After being processed by the ISP front-end module 1 and the ISP back-end module 1, the video stream B is transmitted to the preview module, the film production module, and the associated storage module.

Step c: The preview module performs image fusion processing on the video stream A and the video stream B to obtain a preview video stream, and displays the video images of the preview video stream on the recording interface as preview video images. The preview video stream has an image fusion effect.
Step d: The film production module performs image fusion processing on the video stream A and the video stream B to obtain the first video stream, sends the image fusion parameters corresponding to the first video stream to the associated storage module, and generates and stores a first multimedia file containing the first video stream. The image fusion methods used by the preview module and the film production module can be different; for example, the image fusion processing performed by the preview module on the video stream A and the video stream B is simpler, such as when the film production module applies image anti-shake processing during fusion while the preview module does not.
Step e: The associated storage module generates a second multimedia file including the video stream A, the video stream B, and the image fusion parameters, and stores the second multimedia file in association with the first multimedia file.
Figure 15 is a schematic diagram of a video shooting method provided by an embodiment of the present application. This method applies to the single-camera simultaneous recording scenario, in which the terminal records through a camera A. In this scenario, the method may include the following steps (1) to (5):
(1) Camera A collects the video stream A. After being processed by the ISP front-end module 0, the video stream A is transmitted to the ISP back-end module 0 and the ISP back-end module 1. The video images of the video stream A collected by camera A may be in RAW format, and the ISP front-end module 0 may convert the RAW-format video images of the video stream A into YUV-format video images.
(2) After the ISP back-end module 0 performs basic processing on the video stream A, it transmits the video stream A to the image fusion module and the associated storage module. The ISP back-end module 0 can perform basic processing on the YUV-format video images of the video stream A, such as adjusting contrast and removing noise.
(3) After the ISP back-end module 1 performs image processing on the video stream A, it obtains a video stream A' and transmits the video stream A' to the image fusion module and the associated storage module. The ISP back-end module 1 can perform image processing on the YUV-format video images of the video stream A; for example, it can enlarge and crop the YUV-format video images of the video stream A based on specific logic, where this specific logic can be logic such as human body tracking or other salient subject tracking.
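A toy illustration of this enlarge-and-crop step, which derives a second stream from the same camera by re-framing each frame around a tracked subject; the per-frame bounding boxes are assumed to come from some tracker, and all helper names are hypothetical:

    import numpy as np

    def crop_and_zoom(frame: np.ndarray, box, out_h: int, out_w: int) -> np.ndarray:
        # Crop the frame to the tracked subject's box and rescale to the output size (nearest neighbour).
        top, left, height, width = box
        crop = frame[top:top + height, left:left + width]
        ys = np.arange(out_h) * crop.shape[0] // out_h
        xs = np.arange(out_w) * crop.shape[1] // out_w
        return crop[ys][:, xs]

    def derive_stream_a_prime(stream_a, subject_boxes, out_h=720, out_w=1280):
        # Video stream A' is stream A re-framed around the tracked subject, frame by frame.
        return [crop_and_zoom(frame, box, out_h, out_w) for frame, box in zip(stream_a, subject_boxes)]

    stream_a = [np.zeros((1080, 1920, 3), dtype=np.uint8) for _ in range(3)]
    boxes = [(200, 600, 540, 960)] * 3   # (top, left, height, width) from a hypothetical tracker
    stream_a_prime = derive_stream_a_prime(stream_a, boxes)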
(4) The image fusion module performs image fusion processing on the video stream A and the video stream A' to obtain the first video stream, and sends the image fusion parameters corresponding to the first video stream to the associated storage module. The first video stream has an image fusion effect. Optionally, the video images of the first video stream can be displayed on the recording interface as preview video images to implement video preview, and a first multimedia file including the first video stream can also be generated and stored.
(5) The associated storage module generates a second multimedia file including the video stream A, the video stream A', and the image fusion parameters, and stores the second multimedia file in association with the first multimedia file.
In the above steps (1) to (5), the preview video image is a video image of the first video stream; that is, the preview video image and the video images of the first video stream in the stored first multimedia file correspond to the same image fusion method. However, the embodiments of the present application only describe this as an example; in actual use, the image fusion methods corresponding to the preview video image and to the video images of the first video stream in the stored first multimedia file may also be different. In that case, as shown in Figure 16, the method may include the following steps a to f:
Step a: Camera A collects the video stream A. After being processed by the ISP front-end module 0, the video stream A is transmitted to the ISP back-end module 0 and the ISP back-end module 1.

Step b: After the ISP back-end module 0 performs basic processing on the video stream A, it transmits the video stream A to the preview module, the film production module, and the associated storage module.

Step c: After the ISP back-end module 1 performs image processing on the video stream A, the video stream A' is obtained, and the video stream A' is transmitted to the preview module, the film production module, and the associated storage module.

Step d: The preview module performs image fusion processing on the video stream A and the video stream A' to obtain a preview video stream, and displays the video images of the preview video stream on the recording interface as preview video images. The preview video stream has an image fusion effect.
Step e: The film production module performs image fusion processing on the video stream A and the video stream A' to obtain the first video stream, sends the image fusion parameters corresponding to the first video stream to the associated storage module, and generates and stores a first multimedia file containing the first video stream. The image fusion methods used by the preview module and the film production module can be different; for example, the preview module performs simpler image fusion processing on the video stream A and the video stream A', such as when the film production module applies image anti-shake processing during fusion while the preview module does not.
Step f: The associated storage module generates a second multimedia file including the video stream A, the video stream A', and the image fusion parameters, and stores the second multimedia file in association with the first multimedia file.
Figure 17 is a schematic structural diagram of a video shooting device provided by an embodiment of the present application. The device can be implemented as part or all of a computer device by software, hardware, or a combination of the two, and the computer device can be the terminal 100 described in the embodiments of Figures 1 to 2 above. Referring to Figure 17, the device includes: a first acquisition module 1701, a processing module 1702, a second acquisition module 1703, a first generation module 1704, a second generation module 1705, and a storage module 1706.
The first acquisition module 1701 is used to acquire multiple video streams during the video shooting process. The processing module 1702 is used to perform image fusion processing on the multiple video streams to obtain the first video stream. The second acquisition module 1703 is used to acquire the image fusion parameters corresponding to the first video stream, where the image fusion parameters indicate the image fusion method of the multiple video streams when the first video stream is obtained. The first generation module 1704 is used to generate a first multimedia file containing the first video stream. The second generation module 1705 is used to generate a second multimedia file containing the multiple video streams and the image fusion parameters. The storage module 1706 is used to store the first multimedia file and the second multimedia file in association.
Optionally, the first acquisition module 1701 is used to acquire one video stream collected by each of multiple cameras to obtain the multiple video streams, where the multiple cameras are all set on the terminal, or some of the multiple cameras are set on the terminal and the other cameras are set on a collaboration device that is in a multi-screen collaboration state with the terminal.
Optionally, the first acquisition module 1701 is used to acquire one video stream collected by a camera and perform image processing on this video stream to obtain another video stream. Optionally, the image fusion parameters include an image splicing mode, and the image splicing mode includes one or more of a top-bottom splicing mode, a left-right splicing mode, and a picture-in-picture nested mode.
Optionally, the device also includes a display module, which is used to display the video images of the first video stream on the recording interface.
Optionally, the second generation module 1705 is used to: encode each of the multiple video streams separately to obtain multiple video files; for any video file among the multiple video files, use this video file as a video track and the image fusion parameters as a parameter track, and encapsulate the video track and the parameter track to obtain a corresponding multi-track file; and determine the multiple multi-track files corresponding one-to-one to the multiple video files as the second multimedia file.
Optionally, the second generation module 1705 is used to: encode each of the multiple video streams separately to obtain multiple video files; use each of the multiple video files as a video track to obtain multiple video tracks; use the image fusion parameters as a parameter track; and encapsulate the multiple video tracks and the parameter track to obtain the second multimedia file.
Optionally, the device also includes a third acquisition module, which is used to acquire the multiple video streams and the image fusion parameters from the second multimedia file after the video shooting is completed. The processing module 1702 is also used to perform image fusion processing on the multiple video streams according to the image fusion parameters to obtain a second video stream, and the first generation module 1704 is also used to generate a third multimedia file according to the second video stream.
Optionally, the device also includes a first update module, which is used to update the first multimedia file stored in association with the second multimedia file to the third multimedia file.
Optionally, the device also includes: a fourth acquisition module, which is used to acquire the multiple video streams from the second multimedia file after the video shooting is completed; a playback module, which is used to play at least one of the multiple video streams; and a second update module, which is used to update the image fusion parameters in the second multimedia file according to the fusion adjustment information carried by a fusion adjustment instruction if such an instruction for the video images of the at least one video stream is received during its playback.
Optionally, the device also includes: a first display module, which is used to display the first video stream of the first multimedia file and an association button in the video list after the video shooting is completed; and a second display module, which is used to display the multiple video streams in the second multimedia file if a selection operation on the association button is detected.
To sum up, in the embodiments of the present application, multiple video streams are acquired during the video shooting process, image fusion processing is performed on the multiple video streams to obtain the first video stream, and the image fusion parameters corresponding to the first video stream are obtained; these image fusion parameters indicate the image fusion method of the multiple video streams when the first video stream is obtained. Afterwards, a first multimedia file containing the first video stream is generated, as well as a second multimedia file containing the multiple video streams and the image fusion parameters, where the image fusion parameters in the second multimedia file indicate the image fusion method to be used when the multiple video streams in the second multimedia file are fused later. The first multimedia file and the second multimedia file are stored in association. In this way, after video shooting is completed, the device can also generate a fused video stream with an image fusion effect from the multiple video streams and the image fusion parameters in the stored second multimedia file. Since the device no longer needs to record video in real time after shooting is completed, it can provide higher video processing capability, so the image fusion effect of the fused video stream generated from the second multimedia file is better than that of the first video stream in the first multimedia file generated during shooting, and the user ultimately obtains a video stream with a better image fusion effect for playback.
It should be noted that each functional unit and module in the above embodiments can be integrated into one processing unit, or each unit can exist physically alone, or two or more units can be integrated into one unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from each other and are not used to limit the scope of protection of the embodiments of the present application. The video shooting device provided by the above embodiments and the video shooting method embodiments belong to the same concept; for the specific working processes and technical effects of the units and modules in the above embodiments, reference can be made to the method embodiments, which are not described again here.
The above embodiments may be implemented in whole or in part by a computer program product. The computer program product includes one or more computer instructions. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center integrated with one or more available media. The available media may be magnetic media (such as floppy disks, hard disks, or tapes), optical media (such as digital versatile discs (DVD)), or semiconductor media (such as solid state disks (SSD)).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Human Computer Interaction (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The present application discloses a video shooting method, apparatus, device, and storage medium, belonging to the field of video processing technology. The method includes: acquiring multiple video streams during the video shooting process; performing image fusion processing on the multiple video streams to obtain a first video stream, and acquiring image fusion parameters corresponding to the first video stream; generating a first multimedia file containing the first video stream, and generating a second multimedia file containing the multiple video streams and the image fusion parameters; and storing the first multimedia file and the second multimedia file in association. In this way, after video shooting is completed, a fused video stream with an image fusion effect can be generated from the multiple video streams and the image fusion parameters in the second multimedia file; the image fusion effect of this fused video stream is better than that of the first video stream in the first multimedia file generated during shooting, so the user ultimately obtains a video stream with a better image fusion effect for playback.

Description

视频拍摄方法、装置、设备和存储介质
本申请要求于2022年05月30日提交到国家知识产权局、申请号为202210601210.0、申请名称为“视频拍摄方法、装置、设备和存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及视频处理技术领域,特别涉及一种视频拍摄方法、装置、设备和存储介质。
背景技术
随着终端技术的发展,终端逐渐集通讯、拍摄和影音等功能于一体,成为人们日常生活中不可缺少的部分。用户可以使用终端来拍摄视频,记录生活的点点滴滴。
目前,终端支持同时使用多个摄像头来拍摄视频。具体地,终端可以通过多个摄像头同时采集多路视频流,然后对该多路视频流进行图像融合处理来得到融合视频流,以在录像界面显示该融合视频流的视频图像。并且,在视频拍摄结束后,该终端还可以保存该融合视频流,以供用户后续观看。
然而,在视频拍摄过程中,受摄像器件、处理芯片、图像算法等限制,终端在保证视频实时录制的同时,很难兼顾视频处理能力,从而导致视频拍摄过程中获得的融合视频流的视频效果不佳。
发明内容
本申请提供了一种视频拍摄方法、装置、设备和存储介质,在视频拍摄结束后能够生成图像融合效果较好的视频流。所述技术方案如下:
第一方面,提供了一种视频拍摄方法。在该方法中,在视频拍摄过程中,获取多路视频流,然后对该多路视频流进行图像融合处理,得到第一视频流,获取第一视频流对应的图像融合参数。之后,生成包含有第一视频流的第一多媒体文件,以及生成包含有该多路视频流和该图像融合参数的第二多媒体文件。将第一多媒体文件与第二多媒体文件关联存储。
第一视频流对应的图像融合参数用于指示在得到第一视频流时该多路视频流的图像融合方式。示例地,该图像融合参数可以包括图像拼接模式,进一步还可以包括该多路视频流中每路视频流的图像拼接位置。该图像拼接模式可以包括上下拼接模式、左右拼接模式、画中画嵌套模式等中的一种或多种。该多路视频流中任意的一路视频流的图像拼接位置用于指示在按照相应的图像拼接模式进行拼接时,这一路视频流的视频图像所处的位置。
第一多媒体文件中的第一视频流具有图像融合效果。如此,在视频拍摄结束时,用户就可以观看所存储的第一多媒体文件中的带有图像融合效果的第一视频流,且可以将第一多媒体文件即时分享给其他人进行观看。
第二多媒体文件中的该多路视频流是未经图像融合处理的原始视频流,即是不带 有图像融合效果的视频流。第二多媒体文件中的该图像融合参数用于指示第二多媒体文件中该多路视频流在后续融合时需要采用的图像融合方式。如此,在视频拍摄结束后,终端不仅可以根据所存储的第二多媒体文件实现对该多路视频流中每路视频流的播放,还可以根据所存储的第二多媒体文件中的该多路视频流和该图像融合参数生成带有图像融合效果的融合视频流。由于在视频拍摄结束后,该终端无需再进行视频实时录制,所以可以提供较高的视频处理能力,从而该终端根据第二多媒体文件生成的融合视频流的图像融合效果优于在视频拍摄过程中生成的第一多媒体文件中的第一视频流的图像融合效果,如此可以使得用户最终得到图像融合效果更好的视频流来进行播放。
在一种可能的方式中,获取多路视频流的操作可以为:获取多个摄像头中的每个摄像头采集的一路视频流,以得到该多路视频流。
这种方式是多摄同录场景,即通过多个摄像头同时进行录像,以得到该多个摄像头中每个摄像头采集的一路视频流。
作为一种示例,该多个摄像头可以均设置于该终端。此时该终端是通过自身的多个摄像头同时进行录像,从而该终端可以获取到该多个摄像头中每个摄像头采集的一路视频流,以得到多路视频流。
作为另一种示例,该多个摄像头中的一部分摄像头可以设置于该终端,另一部分摄像头可以设置于与该终端处于多屏协同状态的协同设备。此时该终端是通过自身的摄像头和该协同设备的摄像头同时进行录像,该协同设备可以将自身的摄像头采集的视频流发送给该终端,从而该终端可以获取到自身的摄像头采集的视频流和该协同设备的摄像头采集的视频流,以得到多路视频流。
在另一种可能的方式中,获取多路视频流的操作可以为:获取摄像头采集的一路视频流,对这一路视频流进行图像处理,得到另一路视频流。
这种方式是单摄同录场景,即通过一个摄像头进行录像,以得到这个摄像头采集的一路视频流,并且,对这个摄像头采集的一路视频流进行图像处理,得到另一路视频流,如此就可以获得两路视频流,包括原始的视频流和经图像处理得到的视频流。
需注意的是,终端还可以对这一路视频流进行不同的图像处理,来得到不同的视频流,如此就可以获得至少三路视频流,包括原始的视频流和经不同的图像处理得到的至少两路视频流。
可选地,对该多路视频流进行图像融合处理,得到第一视频流之后,还可以在录像界面显示第一视频流的视频图像,如此可以在视频拍摄过程中实现对所拍摄的视频的实时预览,便于用户及时获知视频的图像融合效果。
在一种可能的方式中,生成包含有该多路视频流和该图像融合参数的第二多媒体文件的操作可以为:分别对该多路视频流中的每路视频流进行编码,得到多个视频文件;对于该多个视频文件中任意的一个视频文件,将这个视频文件作为一个视频轨道,将该图像融合参数作为参数轨道,对这个视频轨道和该参数轨道进行封装,得到对应的一个多轨道文件;将与该多个视频文件一一对应的多个多轨道文件确定为第二多媒体文件。
这种方式中,是对该多路视频流中每路视频流的视频文件单独进行封装来得到对 应的一个多轨道文件,如此,可以得到该多路视频流中每路视频流的多轨道文件,即得到多个多轨道文件。此时第二多媒体文件包括该多个多轨道文件。
在另一种可能的方式中,生成包含有该多路视频流和该图像融合参数的第二多媒体文件的操作可以为:分别对该多路视频流中的每路视频流进行编码,得到多个视频文件;将该多个视频文件中的每个视频文件均作为一个视频轨道,以得到多个视频轨道;将该图像融合参数作为参数轨道;对该多个视频轨道和该参数轨道进行封装,得到第二多媒体文件。
这种方式中,是对该多路视频流的该多个视频文件整体进行封装来得到第二多媒体文件。
进一步地,在视频拍摄结束后,还可以在视频列表中展示第一多媒体文件中的第一视频流和关联按钮,该关联按钮用于指示展示第一多媒体文件关联的第二多媒体文件。若检测到对该关联按钮的选择操作,则展示第二多媒体文件中的该多路视频流,以便用户可以获知第一多媒体文件中的第一视频流是由哪些原始视频流融合得到的,且便于用户选择播放第二多媒体文件中的该多路视频流中的任意一路视频流。
进一步地,在视频拍摄结束后,还可以从第二多媒体文件中获取该多路视频流,然后播放该多路视频流中的至少一路视频流。比如,可以在视频列表中展示该多路视频流,然后用户可以选择播放该多路视频流中的至少一路视频流。之后,若在该至少一路视频流的播放过程中接收到针对该至少一路视频流的视频图像的融合调整指令,则根据该融合调整指令携带的融合调整信息更新第二多媒体文件中的该图像融合参数。
该融合调整指令用于指示调整该多路视频流需要采用的图像融合方式。用户可以在该至少一路视频流的播放过程中,根据自身的需求手动逐帧触发融合调整指令,该融合调整指令用于指示改变图像融合方式,如可以指示改变图像拼接模式,和/或,改变各路视频流的图像拼接位置。也即,该融合调整指令中携带的融合调整信息可以包括需调整至的图像拼接模式,和/或,可以包括各路视频流需调整至的图像拼接位置。如此,该终端就可以根据该融合调整信息修改第二多媒体文件中的该图像融合参数,实现对该图像融合参数的更新,以使得后续根据第二多媒体文件中的该图像融合参数所进行的图像融合处理满足用户的最新需求。
进一步地,在视频拍摄结束后,还可以从第二多媒体文件中获取该多路视频流和该图像融合参数,然后根据该图像融合参数对该多路视频流进行图像融合处理,得到第二视频流,根据第二视频流生成第三多媒体文件。更进一步地,还可以将与第二多媒体文件关联存储的第一多媒体文件更新为第三多媒体文件。
这种情况下,第一视频流对应的图像融合参数与第二视频流对应的图像融合参数相同,也即,是使用相同的图像融合方式对该多路视频流进行图像融合处理,来得到第一视频流和第二视频流的。由于在视频拍摄结束后无需再进行视频实时录制,所以可以提供较高的视频处理能力,从而此时生成的第二视频流的图像融合效果优于在视频拍摄过程中生成的第一视频流的图像融合效果。
这种情况下,将与第二多媒体文件关联存储的第一多媒体文件更新为第三多媒体文件,使得与第二多媒体文件关联存储的多媒体文件中的视频流为图像融合效果更好的视频流,如此可以使得用户最终得到图像融合效果更好的视频流来进行播放。
需注意的是,若根据用户触发的融合调整指令对第二多媒体文件中的该图像融合参数进行了更新,则可以根据第二多媒体文件生成第三多媒体文件。然后将与第二多媒体文件关联存储的多媒体文件(有可能为第一多媒体文件,也有可能为旧的第三多媒体文件)更新为新生成的第三多媒体文件,以使得与第二多媒体文件关联存储的多媒体文件中的视频流为满足用户最新的图像融合需求的图像融合效果佳的视频流。
第二方面,提供了一种视频拍摄装置,所述视频拍摄装置具有实现上述第一方面中视频拍摄方法行为的功能。所述视频拍摄装置包括至少一个模块,所述至少一个模块用于实现上述第一方面所提供的视频拍摄方法。
第三方面,提供了一种视频拍摄装置,所述视频拍摄装置的结构中包括处理器和存储器,所述存储器用于存储支持视频拍摄装置执行上述第一方面所提供的视频拍摄方法的程序,以及存储用于实现上述第一方面所述的视频拍摄方法所涉及的数据。所述处理器被配置为用于执行所述存储器中存储的程序。所述视频拍摄装置还可以包括通信总线,所述通信总线用于在所述处理器与所述存储器之间建立连接。
第四方面,提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有指令,当其在计算机上运行时,使得计算机执行上述第一方面所述的视频拍摄方法。
第五方面,提供了一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行上述第一方面所述的视频拍摄方法。
上述第二方面、第三方面、第四方面和第五方面所获得的技术效果与上述第一方面中对应的技术手段获得的技术效果近似,在这里不再赘述。
附图说明
图1是本申请实施例提供的一种终端的结构示意图;
图2是本申请实施例提供的一种终端的软件系统的框图;
图3是本申请实施例提供的一种视频图像的示意图;
图4是本申请实施例提供的第一种录像界面的示意图;
图5是本申请实施例提供的第二种录像界面的示意图;
图6是本申请实施例提供的第三种录像界面的示意图;
图7是本申请实施例提供的第四种录像界面的示意图;
图8是本申请实施例提供的一种视频拍摄方法的流程图;
图9是本申请实施例提供的一种双视频容器的示意图;
图10是本申请实施例提供的另一种双视频容器的示意图;
图11是本申请实施例提供的一种视频列表的示意图;
图12是本申请实施例提供的一种生成第三多媒体文件的示意图;
图13是本申请实施例提供的第一种视频拍摄方法的示意图;
图14是本申请实施例提供的第二种视频拍摄方法的示意图;
图15是本申请实施例提供的第三种视频拍摄方法的示意图;
图16是本申请实施例提供的第四种视频拍摄方法的示意图;
图17是本申请实施例提供的一种视频拍摄装置的结构示意图。
具体实施方式
为使本申请的目的、技术方案和优点更加清楚,下面将结合附图对本申请的实施方式作进一步地详细描述。
应当理解的是,本申请提及的“多个”是指两个或两个以上。在本申请的描述中,除非另有说明,“/”表示或的意思,比如,A/B可以表示A或B;本文中的“和/或”仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,比如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,为了便于清楚描述本申请的技术方案,采用了“第一”、“第二”等字样对功能和作用基本相同的相同项或相似项进行区分。本领域技术人员可以理解“第一”、“第二”等字样并不对数量和执行次序进行限定,并且“第一”、“第二”等字样也并不限定一定不同。
在本申请中描述的“一个实施例”或“一些实施例”等语句意味着在本申请的一个或多个实施例中包括该实施例描述的特定特征、结构或特点。由此,在本申请中的不同之处出现的“在一个实施例中”、“在一些实施例中”、“在其他一些实施例中”、“在另外一些实施例中”等语句不是必然都参考相同的实施例,而是意味着“一个或多个但不是所有的实施例”,除非是以其他方式另外特别强调。此外,术语“包括”、“包含”、“具有”及它们的变形都意味着“包括但不限于”,除非是以其他方式另外特别强调。
下面先对本申请实施例涉及的终端予以说明。
图1是本申请实施例提供的一种终端的结构示意图。参见图1,终端100可以包括处理器110,外部存储器接口120,内部存储器121,通用串行总线(universal serial bus,USB)接口130,充电管理模块140,电源管理模块141,电池142,天线1,天线2,移动通信模块150,无线通信模块160,音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,传感器模块180,按键190,马达191,指示器192,摄像头193,显示屏194,以及用户识别模块(subscriber identity module,SIM)卡接口195等。其中,传感器模块180可以包括压力传感器180A,陀螺仪传感器180B,气压传感器180C,磁传感器180D,加速度传感器180E,距离传感器180F,接近光传感器180G,指纹传感器180H,温度传感器180J,触摸传感器180K,环境光传感器180L,骨传导传感器180M等。
可以理解的是,本申请实施例示意的结构并不构成对终端100的具体限定。在本申请另一些实施例中,终端100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
处理器110可以包括一个或多个处理单元,比如:处理器110可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processing unit,GPU),图像信号处理器(image signal processor,ISP),控制器,存储器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。
其中,控制器可以是终端100的神经中枢和指挥中心。控制器可以根据指令操作 码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。
处理器110中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器110中的存储器为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从该存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了系统的效率。
充电管理模块140用于从充电器接收充电输入。其中,充电器可以是无线充电器,也可以是有线充电器。在一些有线充电的实施例中,充电管理模块140可以通过USB接口130接收有线充电器的充电输入。在一些无线充电的实施例中,充电管理模块140可以通过终端100的无线充电线圈接收无线充电输入。充电管理模块140为电池142充电的同时,还可以通过电源管理模块141为终端100供电。
电源管理模块141用于连接电池142,充电管理模块140与处理器110。电源管理模块141接收电池142和/或充电管理模块140的输入,为处理器110,内部存储器121,外部存储器,显示屏194,摄像头193和无线通信模块160等供电。电源管理模块141还可以用于监测电池容量,电池循环次数,电池健康状态(漏电,阻抗)等参数。在其他一些实施例中,电源管理模块141也可以设置于处理器110中。在另一些实施例中,电源管理模块141和充电管理模块140也可以设置于同一个器件中。
终端100的无线通信功能可以通过天线1,天线2,移动通信模块150,无线通信模块160,调制解调处理器以及基带处理器等实现。
移动通信模块150可以提供应用在终端100上的包括2G/3G/4G/5G等无线通信的解决方案。移动通信模块150可以包括至少一个滤波器,开关,功率放大器,低噪声放大器(low noise amplifier,LNA)等。移动通信模块150可以由天线1接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块150还可以对经调制解调处理器调制后的信号放大,经天线1转为电磁波辐射出去。在一些实施例中,移动通信模块150的至少部分功能模块可以被设置于处理器110中。在一些实施例中,移动通信模块150的至少部分功能模块可以与处理器110的至少部分模块被设置在同一个器件中。
无线通信模块160可以提供应用在终端100上的包括无线局域网(wireless local area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络),蓝牙(bluetooth,BT),全球导航卫星系统(global navigation satellite system,GNSS),调频(frequency modulation,FM),近距离无线通信技术(near field communication,NFC),红外技术(infrared,IR)等无线通信的解决方案。无线通信模块160可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块160经由天线2接收电磁波,将电磁波信号调频以及滤波处理,将处理后的信号发送到处理器110。无线通信模块160还可以从处理器110接收待发送的信号,对其进行调频,放大,经天线2转为电磁波辐射出去。
终端100通过GPU,显示屏194,以及应用处理器等实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。
终端100可以通过ISP,摄像头193,视频编解码器,GPU,显示屏194以及应用处理器等实现拍摄功能。终端100可以包括1个或N个摄像头193,N为大于1的整数。
在本申请实施例中,终端100可以通过1个或多个摄像头193进行录像。在多摄同录场景中,终端100通过多个摄像头193同时进行录像。在单摄同录场景中,终端100通过一个摄像头193进行录像。摄像头193用于采集视频流。摄像头193采集到视频流后可以传递至ISP进行处理。
作为一种示例,摄像头193采集的视频流的视频图像的格式为RAW格式,ISP可以将该视频流中RAW格式的视频图像转换为YUV格式的视频图像,然后再对YUV格式的视频图像进行基础处理,如可调整对比度、去除噪声等。
在多摄同录场景下,ISP可以接收到多个摄像头193中每个摄像头193采集的视频流,并对这多路视频流进行基础处理,之后将该多路视频流传输给应用处理器。在单摄同录场景下,ISP可以接收到一个摄像头193采集的视频流,对这一路视频流进行基础处理,并对进行基础处理后的这一路视频流进行图像处理,如可进行放大处理和裁切处理等,得到另一路视频流,然后将这两路视频流传输至应用处理器。
应用处理器可以对该多路视频流进行图像融合处理,得到第一视频流,并且,还可以生成包含有第一视频流的第一多媒体文件,进一步地,还可以通过视频编解码器、GPU和显示屏194将第一视频流的视频图像显示在录像界面,实现视频预览。
同时,应用处理器还可以获取第一视频流对应的图像融合参数,该图像融合参数用于指示在得到第一视频流时该多路视频流的图像融合方式,然后生成包含有该多路视频流和第一视频流对应的图像融合参数的第二多媒体文件,并将第二多媒体文件和第一多媒体文件进行关联存储,第二多媒体文件中的该图像融合参数用于指示第二多媒体文件中该多路视频流在后续融合时需要采用的图像融合方式。如此,在视频拍摄结束后,可以根据所存储的第二多媒体文件生成图像融合效果更好的视频流。
外部存储器接口120可以用于连接外部存储卡,比如Micro SD卡,实现扩展终端100的存储能力。外部存储卡通过外部存储器接口120与处理器110通信,实现数据存储功能。比如将音乐,视频等文件保存在外部存储卡中。
内部存储器121可以用于存储计算机可执行程序代码,计算机可执行程序代码包括指令。处理器110通过运行存储在内部存储器121的指令,来执行终端100的各种功能应用以及数据处理。内部存储器121可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统,至少一个功能所需的应用程序(比如声音播放功能,图像播放功能等)等。存储数据区可存储终端100在使用过程中所创建的数据(比如音频数据,电话本等)等。此外,内部存储器121可以包括高速随机存取存储器,还可以包括非易失性存储器,比如至少一个磁盘存储器件,闪存器件,通用闪存存储器(universal flash storage,UFS)等。
终端100可以通过音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D以及应用处理器等实现音频功能,比如音乐播放,录音等。
音频模块170用于将数字音频信息转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信号。音频模块170还可以用于对音频信号编码和解码。在一些 实施例中,音频模块170可以设置于处理器110中,或将音频模块170的部分功能模块设置于处理器110中。
SIM卡接口195用于连接SIM卡。SIM卡可以通过插入SIM卡接口195,或从SIM卡接口195拔出,实现和终端100的接触和分离。终端100可以支持1个或N个SIM卡接口,N为大于1的整数。SIM卡接口195可以支持Nano SIM卡,Micro SIM卡,SIM卡等。同一个SIM卡接口195可以同时插入多张卡。多张卡的类型可以相同,也可以不同。SIM卡接口195也可以兼容不同类型的SIM卡。SIM卡接口195也可以兼容外部存储卡。终端100通过SIM卡和网络交互,实现通话以及数据通信等功能。在一些实施例中,终端100采用eSIM,即:嵌入式SIM卡。eSIM卡可以嵌在终端100中,不能和终端100分离。
接下来对终端100的软件系统予以说明。
终端100的软件系统可以采用分层架构,事件驱动架构,微核架构,微服务架构,或云架构。本申请实施例以分层架构的安卓(Android)系统为例,对终端100的软件系统进行示例性说明。
图2是本申请实施例提供的一种终端100的软件系统的框图。参见图2,分层架构将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。在一些实施例中,将Android系统从上至下分为应用程序层(application,APP),应用程序框架层(framework,FWK),安卓运行时(Android runtime)和系统层,以及内核层(kernel)。
应用程序层可以包括一系列应用程序包。如图2所示,应用程序包可以包括相机,图库,日历,通话,地图,导航,WLAN,蓝牙,音乐,视频,短信息等应用程序。
应用程序框架层为应用程序层的应用程序提供应用编程接口(application programming interface,API)和编程框架。应用程序框架层包括一些预先定义的函数。如图2所示,应用程序框架层可以包括窗口管理器,内容提供器,视图系统,电话管理器,资源管理器,通知管理器等。窗口管理器用于管理窗口程序。窗口管理器可以获取显示屏大小,判断是否有状态栏,锁定屏幕,截取屏幕等。内容提供器用来存放和获取数据,并使这些数据可以被应用程序访问,这些数据可以包括视频,图像,音频,拨打和接听的电话,浏览历史和书签,电话簿等。视图系统包括可视控件,比如,显示文字的控件,显示图片的控件等。视图系统可用于构建应用程序的显示界面,显示界面可以由一个或多个视图组成,比如,包括显示短信通知图标的视图,包括显示文字的视图,以及包括显示图片的视图。电话管理器用于提供终端100的通信功能,比如,通话状态的管理(包括接通,挂断等)。资源管理器为应用程序提供各种资源,比如,本地化字符串,图标,图片,布局文件,视频文件等。通知管理器使应用程序可以在状态栏中显示通知信息,可以用于传达告知类型的消息,可以短暂停留后自动消失,无需用户交互,比如,通知管理器被用于告知下载完成,消息提醒等。通知管理器还可以是以图表或滚动条文本形式出现在系统顶部状态栏的通知,比如,后台运行的应用程序的通知。通知管理器还可以是以对话窗口形式出现在屏幕上的通知,比如,在状态栏提示文本信息,发出提示音,电子设备振动,指示灯闪烁等。
Android runtime包括核心库和虚拟机。Android runtime负责安卓系统的调度和管理。核心库包含两部分：一部分是java语言需要调用的功能函数，另一部分是安卓的核心库。应用程序层和应用程序框架层运行在虚拟机中。虚拟机将应用程序层和应用程序框架层的java文件执行为二进制文件。虚拟机用于执行对象生命周期的管理，堆栈管理，线程管理，安全和异常的管理，以及垃圾回收等功能。
系统层可以包括多个功能模块,比如:表面管理器(surface manager),媒体库(Media Libraries),三维图形处理库(比如:OpenGL ES),二维图形引擎(比如:SGL)等。表面管理器用于对显示子系统进行管理,并且为多个应用程序提供了2D和3D图层的融合。媒体库支持多种常用的音频,视频格式回放和录制,以及静态图像文件等。媒体库可以支持多种音视频编码格式,比如:MPEG4,H.264,MP3,AAC,AMR,JPG,PNG等。三维图形处理库用于实现三维图形绘图,图像渲染,合成,和图层处理等。二维图形引擎是二维绘图的绘图引擎。
内核层是硬件和软件之间的层。内核层至少包含显示驱动,摄像头驱动,音频驱动,传感器驱动等。
在对本申请实施例进行详细地解释说明之前,对本申请实施例涉及的应用场景予以说明。
目前,如图3所示,在很多录像场景中,诸如手机、平板电脑、笔记本电脑等终端在视频拍摄过程中都可以显示多路视频流中每路视频流的视频图像31。其中,该多路视频流可以是由不同摄像头采集的视频流,这种录像场景可以称为多摄同录场景。或者,该多路视频流可以是由一个摄像头采集的但经过不同处理的视频流,这种录像场景可以称为单摄同录场景。
下面对这两种录像场景进行示例性说明。
第一种录像场景:多摄同录场景
多摄同录场景中,通过多个摄像头同时进行录像,并且在录像界面(也可称为录像预览界面或视频拍摄界面)中显示该多个摄像头中每个摄像头采集的视频流的视频图像。
在一种可能的情况中,终端具有多个摄像头,该多个摄像头的拍摄方向不同。该终端可以启动多摄录像功能,以通过该终端自身的多个摄像头同时进行录像,然后在录像界面中显示该多个摄像头中每个摄像头采集的视频流的视频图像。
示例地,该终端可以具有前置摄像头和后置摄像头。该终端启动多摄录像功能后,就启动了自身的前置摄像头和后置摄像头,前置摄像头采集一路视频流,后置摄像头采集一路视频流。之后,如图4所示,该终端可以在录像界面41中显示前置摄像头采集的视频流的视频图像421以及显示后置摄像头采集的视频流的视频图像422。
在另一种可能的情况中,终端与其他设备(可称为协同设备)处于多屏协同状态,该终端和该协同设备都具有摄像头,该终端可以借助该协同设备的摄像头来拍摄。该终端可以启动协同录像功能,以通过该终端的摄像头和该协同设备的摄像头同时进行录像,然后在录像界面中显示该终端的摄像头采集的视频流的视频图像和该协同设备的摄像头采集的视频流的视频图像。
示例地,该终端和该协同设备均具有一个摄像头,该终端启动协同录像功能后, 就启动了自身的摄像头,且指示该协同设备启动了该协同设备的摄像头。该终端的摄像头可以采集一路视频流,该协同设备的摄像头可以采集一路视频流,且该协同设备可以将自身的摄像头采集的视频流发送至该终端。之后,如图5所示,该终端501可以在录像界面51中显示自身的摄像头采集的视频流的视频图像521以及显示该协同设备502的摄像头采集的视频流的视频图像522。
第二种录像场景:单摄同录场景
单摄同录场景中,通过一个摄像头进行录像,并且在录像界面中显示这个摄像头采集的视频流经不同处理后的视频图像。
在一种可能的情况中,终端具有一个摄像头。该终端可以启动单摄录像功能,以通过该终端自身的这个摄像头进行录像,然后在录像界面中显示这个摄像头采集的视频流经不同处理后的视频图像。
示例地,该终端可以具有后置摄像头。该终端启动单摄录像功能后,就启动了自身的后置摄像头,后置摄像头采集一路视频流,该终端对这路视频流的视频图像进行放大处理和裁切处理,得到另一路视频流的视频图像。之后,如图6所示,该终端可以在录像界面61中显示后置摄像头采集的原始的视频流的视频图像622以及显示经放大处理和裁切处理得到的另一路视频流的视频图像621。其中,视频图像622是后置摄像头拍摄的原始的视频图像,视频图像621是对原始的视频图像622进行放大处理和裁切处理后的视频图像。
上述多种录像场景中,终端可以在视频拍摄过程中获取多路视频流,并在录像界面中显示该多路视频流中每路视频流的视频图像。可选地,在录像界面中显示该多路视频流中每路视频流的视频图像时,可以先按照特定的图像拼接模式将该多路视频流中每路视频流的视频图像进行拼接,得到融合视频流的视频图像,再将融合视频流的视频图像显示在录像界面中。
示例地,该图像拼接模式可以包括上下拼接模式、左右拼接模式、画中画嵌套模式等。其中,上下拼接模式是指将多路视频流中每路视频流的视频图像按照从上到下的顺序依次进行拼接,如此按照上下拼接模式得到的融合视频流的视频图像包含的该多路视频流中每路视频流的视频图像是从上到下依次排列的。比如,如图4、图5或图6所示,录像界面中显示的融合视频流的视频图像32即是按照上下拼接模式将多路视频流中每路视频流的视频图像拼接得到的。其中,左右拼接模式是指将多路视频流中每路视频流的视频图像按照从左到右的顺序依次进行拼接,如此按照左右拼接模式得到的融合视频流的视频图像包含的该多路视频流中每路视频流的视频图像是从左到右依次排列的。其中,画中画嵌套模式是指在全屏显示主画面的过程中,于主画面的小面积区域上同时显示子画面。也即,画中画嵌套模式是指将多路视频流中的一路视频流的视频图像作为主画面,将该多路视频流中除这一路视频流之外的其他路视频流的视频图像作为子画面,且将子画面拼接在主画面的小面积区域上。比如,如图7所示,终端可以在多摄录像过程中,在录像界面71中显示融合视频流的视频图像32,该融合视频流的视频图像32包含该终端的前置摄像头采集的视频流的视频图像721和该终端的后置摄像头采集的视频流的视频图像722,且后置摄像头采集的视频流的视频图像722为主画面,前置摄像头采集的视频流的视频图像721为在主画面的小面 积区域上存在的子画面。
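为便于理解上述几种图像拼接模式，下面给出一个示意性的Python/NumPy草图（假设各路视频图像已对齐到合适的分辨率，函数名与分辨率均为举例，并非本申请的限定实现）：

```python
import numpy as np

def stitch_top_bottom(img_a, img_b):
    # 上下拼接模式：两帧宽度需一致，按从上到下的顺序排列
    return np.vstack([img_a, img_b])

def stitch_left_right(img_a, img_b):
    # 左右拼接模式：两帧高度需一致，按从左到右的顺序排列
    return np.hstack([img_a, img_b])

def stitch_pip(main_img, sub_img, top=40, left=40):
    # 画中画嵌套模式：子画面覆盖在主画面的小面积区域上
    fused = main_img.copy()
    h, w = sub_img.shape[:2]
    fused[top:top + h, left:left + w] = sub_img
    return fused

# 示例：两路1080x1920的RGB视频图像（此处用全零数据占位）
frame_a = np.zeros((1080, 1920, 3), dtype=np.uint8)
frame_b = np.zeros((1080, 1920, 3), dtype=np.uint8)
fused_tb = stitch_top_bottom(frame_a, frame_b)      # 得到2160x1920的上下拼接图像
fused_pip = stitch_pip(frame_a, frame_b[::4, ::4])  # 子画面缩小后嵌入主画面
```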
通过上述录像场景可知,终端在视频拍摄过程中获取多路视频流后,需要对该多路视频流进行图像融合处理来得到融合视频流,以在录像界面显示该融合视频流的视频图像。并且,在视频拍摄结束后,该终端还可以保存该融合视频流,以供用户后续观看。然而,在视频拍摄过程中,受摄像(Camera)器件、处理芯片、图像算法等限制,在保证视频实时录制的同时,很难兼顾视频处理能力,从而导致视频拍摄过程中获得的融合视频流的视频效果不佳。
为此,本申请实施例提供了一种视频拍摄方法,在视频拍摄过程中,对多路视频流进行图像融合处理,得到第一视频流。之后,不仅生成包含有第一视频流的第一多媒体文件,还生成包含有该多路视频流和第一视频流对应的图像融合参数的第二多媒体文件,并将第一多媒体文件与第二多媒体文件关联存储。如此,在视频拍摄结束后,用户就可以观看所存储的第一多媒体文件中的带有图像融合效果的第一视频流,且可以将第一多媒体文件即时分享给其他人进行观看。并且,终端还可以根据所存储的第二多媒体文件中的该多路视频流和该图像融合参数生成带有图像融合效果的融合视频流。由于在视频拍摄结束后,终端无需再进行视频实时录制,所以可以提供较高的视频处理能力,从而根据第二多媒体文件生成的融合视频流的图像融合效果优于在视频拍摄过程中生成的第一多媒体文件中的第一视频流的图像融合效果,如此可以使得用户最终得到图像融合效果更好的视频流来进行播放。
下面对本申请实施例提供的视频拍摄方法进行详细地解释说明。
图8是本申请实施例提供的一种视频拍摄方法的流程图,该方法应用于终端,该终端可以是上文图1至图2实施例所述的终端100。参见图8,该方法包括:
步骤801:终端在视频拍摄过程中,获取多路视频流。
该多路视频流中每路视频流的视频图像的时间戳是对齐的。也即,每个时间戳在该多路视频流中均有对应的一帧视频图像,换句话说,该多路视频流中每路视频流的第i帧视频图像的时间戳均相同,i为正整数。
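作为一种示意（并非限定实现），可以用如下Python草图表示"按时间戳把多路视频流的帧对齐分组"的思路，其中Frame等数据结构均为假设：

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Frame:
    timestamp_us: int   # 时间戳（微秒）
    image: bytes        # 视频图像数据（示意）

def group_by_timestamp(streams: List[List[Frame]]) -> Dict[int, List[Frame]]:
    """把多路视频流中时间戳相同的帧归为一组，便于后续逐帧融合。"""
    groups: Dict[int, List[Frame]] = {}
    for stream in streams:
        for frame in stream:
            groups.setdefault(frame.timestamp_us, []).append(frame)
    # 仅保留在每一路视频流中都存在对应帧的时间戳
    return {ts: frames for ts, frames in groups.items() if len(frames) == len(streams)}
```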
可选地,若该视频拍摄过程为多摄同录过程,则该多路视频流可以是由不同摄像头采集的视频流。或者,若该视频拍摄过程为单摄同录过程,则该多路视频流可以是由一个摄像头采集的但经过不同处理的视频流。
这种情况下,该终端获取多路视频流的操作可以通过如下两种方式实现:
第一种方式:该终端获取多个摄像头中的每个摄像头采集的一路视频流,以得到多路视频流。
这种方式是多摄同录场景,即通过多个摄像头同时进行录像,以得到该多个摄像头中每个摄像头采集的一路视频流。
作为一种示例,该多个摄像头可以均设置于该终端。此时该终端是通过自身的多个摄像头同时进行录像,从而该终端可以获取到该多个摄像头中每个摄像头采集的一路视频流,以得到多路视频流。
作为另一种示例,该多个摄像头中的一部分摄像头可以设置于该终端,另一部分摄像头可以设置于与该终端处于多屏协同状态的协同设备。此时该终端是通过自身的 摄像头和该协同设备的摄像头同时进行录像,该协同设备可以将自身的摄像头采集的视频流发送给该终端,从而该终端可以获取到自身的摄像头采集的视频流和该协同设备的摄像头采集的视频流,以得到多路视频流。
第二种方式:终端获取摄像头采集的一路视频流,对这一路视频流进行图像处理,得到另一路视频流。
这种方式是单摄同录场景,即通过一个摄像头进行录像,以得到这个摄像头采集的一路视频流,并且,对这个摄像头采集的一路视频流进行图像处理,得到另一路视频流,如此就可以获得两路视频流,包括原始的视频流和经图像处理得到的视频流。
需注意的是,终端还可以对这一路视频流进行不同的图像处理,来得到不同的视频流,如此就可以获得至少三路视频流,包括原始的视频流和经不同的图像处理得到的至少两路视频流。
其中,该终端对这一路视频流进行图像处理,即是对这一路视频流的视频图像进行处理,如可以对这一路视频流的视频图像进行放大处理和裁切处理,得到另一路视频流的视频图像。
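下面给出放大处理和裁切处理的一个最简Python/NumPy草图（采用中心裁切加最近邻插值放大，仅作示意；实际实现可采用其他插值方式，或结合主体追踪等逻辑确定裁切区域）：

```python
import numpy as np

def crop_and_zoom(frame: np.ndarray, zoom: float = 2.0) -> np.ndarray:
    """对原始视频图像做中心裁切并放大回原分辨率，得到"另一路视频流"的一帧（示意）。"""
    h, w = frame.shape[:2]
    ch, cw = int(h / zoom), int(w / zoom)
    top, left = (h - ch) // 2, (w - cw) // 2
    cropped = frame[top:top + ch, left:left + cw]
    # 最近邻插值放大回原始尺寸
    ys = (np.arange(h) * ch // h).clip(0, ch - 1)
    xs = (np.arange(w) * cw // w).clip(0, cw - 1)
    return cropped[ys][:, xs]
```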
可选地,这个摄像头可以设置于该终端,也可以设置于与该终端处于多屏协同状态的协同设备,本申请实施例对此不作限定。
步骤802:该终端对该多路视频流进行图像融合处理,得到第一视频流。
该终端对该多路视频流进行图像融合处理,即是对该多路视频流中每路视频流的视频图像进行融合处理,来得到第一视频流的视频图像。如此,第一视频流是带有特定的图像融合效果的视频流。
由于该多路视频流中每路视频流的视频图像的时间戳是对齐的,所以可以对该多路视频流中时间戳相同的多张视频图像进行融合处理。具体地,每获取到该多路视频流中每路视频流的第i帧视频图像,就对该多路视频流中每路视频流的第i帧视频图像进行融合处理,得到第一视频流的第i帧视频图像,也即,逐帧对该多路视频流中每路视频流的视频图像进行融合处理,来得到第一视频流的每帧视频图像,从而第一视频流的视频图像的时间戳与该多路视频流中每路视频流的视频图像的时间戳也是对齐的。如此,对该多路视频流进行图像融合处理后,得到的第一视频流的视频图像包含对该多路视频流中每路视频流的视频图像进行融合处理后的图像。
其中,该终端在对该多路视频流中每路视频流的视频图像进行融合处理时,可以根据特定的图像融合参数将该多路视频流中每路视频流的视频图像进行拼接。
该图像融合参数用于指示该多路视频流的图像融合方式。示例地,该图像融合参数可以包括图像拼接模式,进一步还可以包括该多路视频流中每路视频流的图像拼接位置。该图像拼接模式可以包括上下拼接模式、左右拼接模式、画中画嵌套模式等中的一种或多种,本申请实施例对此不作限定。该多路视频流中任意的一路视频流的图像拼接位置用于指示在按照相应的图像拼接模式进行拼接时,这一路视频流的视频图像所处的位置。
比如,该多路视频流包括视频流A和视频流B。假设该图像拼接模式为上下拼接模式,视频流A的图像拼接位置为上,视频流B的图像拼接位置为下,则该终端可以将视频流A的视频图像拼接在视频流B的视频图像的上方,以得到第一视频流的视频 图像,此时第一视频流的视频图像的上半部分为视频流A的视频图像,下半部分为视频流B的视频图像。或者,假设该图像拼接模式为画中画嵌套模式,视频流A的图像拼接位置为主画面,视频流B的图像拼接位置为子画面,则该终端可以将视频流B的视频图像拼接在视频流A的小面积区域上,以得到第一视频流的视频图像,此时第一视频流的视频图像的主画面为视频流A的视频图像,子画面为视频流B的视频图像。
需注意的是,由于该终端是逐帧对该多路视频流中每路视频流的视频图像进行融合处理,来得到第一视频流的每帧视频图像,所以该图像融合参数也是逐帧存在的。也即,该多路视频流中每路视频流的第i帧视频图像均对应一个图像融合参数,且根据该多路视频流中每路视频流的第i帧视频图像得到的第一视频流的第i帧视频图像也对应这个图像融合参数,这个图像融合参数也可以具有时间戳,且这个图像融合参数的时间戳与该多路视频流中每路视频流的第i帧视频图像的时间戳以及第一视频流的第i帧视频图像的时间戳对齐。
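结合上述说明，可以用如下Python草图示意"逐帧融合并同步记录图像融合参数"的过程，其中FusionParam、fuse_frame等名称均为示意性假设，并非本申请的限定实现：

```python
import numpy as np
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class FusionParam:
    timestamp_us: int               # 与对应视频帧对齐的时间戳
    mode: str                       # 图像拼接模式，如"top_bottom"或"pip"
    positions: Dict[str, str] = field(default_factory=dict)  # 各路视频流的图像拼接位置

def fuse_frame(frames: Dict[str, np.ndarray], param: FusionParam) -> np.ndarray:
    """对同一时间戳的多张视频图像按图像融合参数进行拼接，得到第一视频流的一帧。"""
    if param.mode == "top_bottom":
        order = sorted(frames, key=lambda k: 0 if param.positions[k] == "top" else 1)
        return np.vstack([frames[k] for k in order])
    if param.mode == "pip":
        main = next(k for k, v in param.positions.items() if v == "main")
        sub = next(k for k, v in param.positions.items() if v == "sub")
        fused = frames[main].copy()
        small = frames[sub][::4, ::4]                # 子画面缩小（示意）
        fused[:small.shape[0], :small.shape[1]] = small
        return fused
    raise ValueError(f"未知的图像拼接模式: {param.mode}")

param_stream: List[FusionParam] = []                 # 逐帧记录、与视频帧时间戳对齐的参数流
```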
可选地,该终端在对该多路视频流中每路视频流的视频图像进行融合处理时所采用的图像融合参数可以是默认的,或者,可以是用户在拍摄视频前根据自身需求事先设置的,或者,可以是该终端根据该多路视频流中每路视频流的视频图像的内容自动确定的,本申请实施例对此不作限定。
在一些实施例中,用户还可以在视频拍摄过程中主动去调整图像融合参数。比如,假设默认的图像拼接模式为上下拼接模式。在拍摄刚开始时,该终端采用默认的上下拼接模式对该多路视频流中每路视频流的视频图像进行拼接,在拍摄了一段时间后,用户可以在该终端中调整图像拼接模式为画中画嵌套模式,则在此之后该终端采用画中画嵌套模式继续对该多路视频流中每路视频流的视频图像进行拼接。
值得注意的是,该多路视频流中每路视频流的各帧视频图像的图像融合参数可以相同,也可以不同。在一些实施例中,整个视频拍摄过程中该多路视频流的图像融合方式可以是不断变化的,这种变化可以来自于用户的手动调整,如用户可以在视频拍摄过程中手动调整图像拼接模式,或者,这种变化也可以来自于该终端的自动调整,如该终端可以根据该多路视频流的视频图像的内容的不同而选择不同的图像融合参数。
比如,该终端具有前置摄像头和后置摄像头。在开始拍摄视频的前10秒内,该终端采用默认图像融合参数对该多路视频流进行图像融合处理,假设该默认图像融合参数中的图像拼接模式为上下拼接模式,且其中前置摄像头采集的视频流的图像拼接位置为上,后置摄像头采集的视频流的图像拼接位置为下,则如图4所示,在开始拍摄视频的前10秒内,该终端可以按照上下拼接模式对前置摄像头采集的视频流的视频图像421和后置摄像头采集的视频流的视频图像422进行拼接,得到录像界面41中显示的第一视频流的视频图像32,第一视频流的视频图像32中视频图像421和视频图像422按照从上至下的顺序排列。
拍摄视频的10秒以后,用户手动调整图像融合参数,调整后的图像融合参数中的图像拼接模式为画中画嵌套模式,且其中前置摄像头采集的视频流的图像拼接位置为子画面,后置摄像头采集的视频流的图像拼接位置为主画面,则如图7所示,在拍摄视频的10秒以后,该终端可以按照画中画嵌套模式对前置摄像头采集的视频流的视频图像721和后置摄像头采集的视频流的视频图像722进行拼接,得到录像界面71中显 示的第一视频流的视频图像32,第一视频流的视频图像32中视频图像722为主画面,视频图像721为在主画面的小面积区域上存在的子画面。
步骤803:该终端获取第一视频流对应的图像融合参数。
第一视频流对应的图像融合参数(也可称为元数据(Metadata))用于指示在得到第一视频流时该多路视频流的图像融合方式,具体地,该图像融合参数是该多路视频流的视频图像的拼接方式的参数信息。也即,该终端在逐帧对该多路视频流中每路视频流的视频图像进行融合处理,来得到第一视频流的每帧视频图像后,还可以逐帧获取第一视频流的每帧视频图像对应的图像融合参数。这种情况下,第一视频流的第i帧视频图像对应的图像融合参数用于指示在得到第一视频流的第i帧视频图像时该多路视频流中每路视频流的第i帧视频图像的图像融合方式,也即,第一视频流的第i帧视频图像对应的图像融合参数是在对该多路视频流中每路视频流的第i帧视频图像进行融合处理时所采用的图像融合参数。
由于是逐帧获取第一视频流的每帧视频图像对应的图像融合参数,所以第一视频流对应的图像融合参数实际上是一个参数流,该参数流的图像融合参数具有时间戳,且该参数流的图像融合参数的时间戳与第一视频流的视频图像的时间戳对齐,该参数流的图像融合参数用于指示具体是如何根据该多路视频流中每路视频流的视频图像融合得到第一视频流的视频图像的,即该参数流的图像融合参数是逐帧的图像融合方式的描述。
进一步地,该终端在得到第一视频流后,还可以在录像界面显示第一视频流的视频图像,即每得到第一视频流的一帧视频图像,就可以在录像界面显示这一帧视频图像,如此可以在视频拍摄过程中实现对所拍摄的视频的实时预览,便于用户及时获知视频的图像融合效果。
比如,该终端具有前置摄像头和后置摄像头。该图像融合参数中的图像拼接模式为上下拼接模式,且其中前置摄像头采集的视频流的图像拼接位置为上,后置摄像头采集的视频流的图像拼接位置为下,则如图4所示,该终端可以按照上下拼接模式对前置摄像头采集的视频流的视频图像421和后置摄像头采集的视频流的视频图像422进行拼接,得到录像界面41中显示的第一视频流的视频图像32,第一视频流的视频图像32中视频图像421和视频图像422按照从上至下的顺序排列。
又比如,该终端具有前置摄像头和后置摄像头。该图像融合参数中的图像拼接模式为画中画嵌套模式,且其中前置摄像头采集的视频流的图像拼接位置为子画面,后置摄像头采集的视频流的图像拼接位置为主画面,则如图7所示,该终端可以按照画中画嵌套模式对前置摄像头采集的视频流的视频图像721和后置摄像头采集的视频流的视频图像722进行拼接,得到录像界面71中显示的第一视频流的视频图像32,第一视频流的视频图像32中视频图像722为主画面,视频图像721为在主画面的小面积区域上存在的子画面。
步骤804:该终端生成包含有第一视频流的第一多媒体文件。
第一多媒体文件是用于播放第一视频流的文件。第一多媒体文件中的第一视频流具有图像融合效果。
该终端在视频拍摄过程中可以不断融合得到第一视频流的视频图像,从而可以根 据第一视频流不断生成第一多媒体文件。如此,在视频拍摄结束后,该终端就可以得到包含有完整的第一视频流的第一多媒体文件,便于用户即时分享。
可选地,该终端在生成包含有第一视频流的第一多媒体文件时,可以先对第一视频流进行编码,得到视频文件,再对该视频文件和其他相关文件(包含但不限于音频文件)进行封装,得到第一多媒体文件。当然,该终端也可以采用其他方式生成包含有第一视频流的第一多媒体文件,本申请实施例对此不作限定。
该视频文件的格式可以是预设的格式，如可以是动态图像专家组（moving picture experts group 4，MPEG-4）格式，即MP4格式，或者可以是流媒体视频（flash video，FLV）格式等，当然，也可以是其他格式，本申请实施例对此不作限定。
该音频文件可以是对音频流进行编码得到的。该音频流可以是该终端在视频拍摄过程中不断采集得到,如可以是由该终端的麦克风不断采集得到的。该音频流的音频帧的时间戳与该多路视频流中每路视频流的视频图像的时间戳对齐。该音频文件的格式可以与该视频文件的格式相同或不同,如该音频文件的格式可以是MP4格式、FLV格式、高级音频编码(advanced audio coding,AAC)格式等,本申请实施例对此不作限定。
该终端对该视频文件和其他相关文件进行封装时,可以将该视频文件作为一个视频轨道(track),将其他相关文件作为其他轨道(如可将音频文件作为音频轨道),然后对这个视频轨道和其他轨道进行封装,得到一个多轨道文件作为第一多媒体文件。其中,轨道是时间戳序列。
比如,该终端可以使用视频复用器将该视频文件对应的视频轨道和该音频文件对应的音频轨道封装(也可称为合成(mux))成一个MP4文件,该MP4文件为多轨道文件,也即为第一多媒体文件。
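作为封装环节的一种可能的实现思路（仅作示意，文件名均为假设），可以借助现成的封装工具完成，例如通过ffmpeg命令行把已编码的视频轨道与音频轨道直接封装为MP4：

```python
import subprocess

def mux_first_media_file(video_path: str, audio_path: str, out_path: str) -> None:
    """把已编码的第一视频流与音频流封装成一个MP4文件，作为第一多媒体文件（示意）。"""
    subprocess.run(
        ["ffmpeg", "-y",
         "-i", video_path,     # 视频轨道：已编码的第一视频流
         "-i", audio_path,     # 音频轨道：已编码的音频流
         "-c", "copy",         # 不重新编码，仅做封装
         out_path],
        check=True,
    )

# 示例（路径均为假设）：
# mux_first_media_file("first_stream.h264", "audio.aac", "first_media.mp4")
```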
步骤805:该终端生成包含有该多路视频流和该图像融合参数的第二多媒体文件。
该多路视频流中的各路视频流是单独保存在第二多媒体文件中的,也即,各路视频流是独立存在的。第二多媒体文件可用于分别播放该多路视频流中的每路视频流。第二多媒体文件中的该多路视频流是未经图像融合处理的原始视频流,即是不带有图像融合效果的视频流。第二多媒体文件中的该图像融合参数用于指示第二多媒体文件中该多路视频流在后续融合时需要采用的图像融合方式。
该终端在视频拍摄过程中可以不断获取到该多路视频流中每路视频流的视频图像,且在不断对该多路视频流进行图像融合处理的过程中也可以不断获取到该图像融合参数,从而可以根据该多路视频流和该图像融合参数不断生成第二多媒体文件。如此,在视频拍摄结束后,该终端就可以得到包含有完整的多路视频流和完整的图像融合参数的第二多媒体文件,便于根据该图像融合参数对该多路视频流进行后处理,提升了该多路视频流的后处理空间。
可选地,步骤805的操作可以通过如下两种可能的方式实现。
第一种可能的方式:该终端分别对该多路视频流中的每路视频流进行编码,得到多个视频文件;对于该多个视频文件中任意的一个视频文件,对这个视频文件和该图像融合参数进行封装,得到对应的一个封装文件;将与该多个视频文件一一对应的多个封装文件确定为第二多媒体文件。
该视频文件的格式可以是预设的格式,如可以是MP4格式、FLV格式等,本申请实施例对此不作限定。
这种方式中,是对该多路视频流中每路视频流的视频文件单独进行封装来得到对应的一个封装文件,如此,可以得到该多路视频流中每路视频流的封装文件,即得到多个封装文件。此时第二多媒体文件包括该多个封装文件。
可选地,该终端在对某个视频文件和该图像融合参数进行封装时,还可以将其他相关文件也一同封装,如该终端可以将这个视频文件和该图像融合参数以及音频文件进行封装,得到对应的一个封装文件。
该音频文件可以是对音频流进行编码得到的。该音频流可以是该终端在视频拍摄过程中不断采集得到,如可以是由该终端的麦克风不断采集得到的。该音频流的音频帧的时间戳与该多路视频流中每路视频流的视频图像的时间戳是对齐的。该音频文件的格式可以与该视频文件的格式相同或不同,如该音频文件的格式可以是MP4格式、FLV格式、AAC格式等,本申请实施例对此不作限定。
可选地,该终端在对某个视频文件、该图像融合参数和其他相关文件进行封装时,可以将这个视频文件作为一个视频轨道,将该图像融合参数作为参数轨道,将其他相关文件作为其他轨道,对这个视频轨道、该参数轨道和该其他轨道进行封装,得到对应的一个多轨道文件作为封装文件。这种情况下,是将与该多个视频文件一一对应的多个多轨道文件确定为第二多媒体文件。
比如,对于该多个视频文件中任意的一个视频文件,该终端可以使用视频复用器将这个视频文件对应的视频轨道、该图像融合参数对应的参数轨道、该音频文件对应的音频轨道封装成一个MP4文件,该MP4文件为多轨道文件。之后,将封装得到的与该多个视频文件一一对应的多个MP4文件确定为第二多媒体文件。
第二种可能的方式:该终端分别对该多路视频流中的每路视频流进行编码,得到多个视频文件;对该多个视频文件和该图像融合参数进行封装,得到第二多媒体文件。
该视频文件的格式可以是预设的格式,如可以是MP4格式、FLV格式等,本申请实施例对此不作限定。
这种方式中,是对该多路视频流的该多个视频文件整体进行封装来得到一个封装文件作为第二多媒体文件。
可选地,该终端在对该多个视频文件和该图像融合参数进行封装时,还可以将其他相关文件也一同封装,如该终端可以将该多个视频文件和该图像融合参数以及音频文件进行封装,得到第二多媒体文件。
该音频文件可以是对音频流进行编码得到的。该音频流可以是该终端在视频拍摄过程中不断采集得到,如可以是由该终端的麦克风不断采集得到的。该音频流的音频帧的时间戳与该多路视频流中每路视频流的视频图像的时间戳是对齐的。该音频文件的格式可以与该视频文件的格式相同或不同,如该音频文件的格式可以是MP4格式、FLV格式、AAC格式等,本申请实施例对此不作限定。
可选地,该终端在对该多个视频文件、该图像融合参数和其他相关文件进行封装时,可以将该多个视频文件中的每个视频文件均作为一个视频轨道,以得到多个视频轨道,将该图像融合参数作为参数轨道,将其他相关文件作为其他轨道,然后对该多 个视频轨道、该参数轨道和该其他轨道进行封装,得到第二多媒体文件。
比如,该终端可以使用视频复用器将与该多个视频文件一一对应的多个视频轨道、该图像融合参数对应的参数轨道、该音频文件对应的音频轨道封装成一个MP4文件,该MP4文件为多轨道文件,也即为第二多媒体文件。
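由于不同封装器对"参数轨道"的支持方式可能各不相同，下面仅用一个清单文件的形式示意第二多媒体文件的多轨道组织结构（文件名、字段名均为假设；若采用第一种封装方式，则每路视频流各自对应一个这样的封装文件，且各自携带一份图像融合参数）：

```python
import json
from typing import List

def build_second_media_file(video_files: List[str],
                            fusion_params: List[dict],
                            out_path: str) -> None:
    """以清单文件示意第二多媒体文件：每个视频文件对应一个视频轨道，
    图像融合参数整体作为参数轨道，音频文件作为音频轨道。"""
    container = {
        "video_tracks": video_files,      # 如 ["stream_a.mp4", "stream_b.mp4"]
        "param_track": fusion_params,     # 逐帧、带时间戳的图像融合参数
        "audio_track": "audio.aac",       # 音频轨道（示意）
    }
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(container, f, ensure_ascii=False, indent=2)
```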
步骤806:该终端将第一多媒体文件与第二多媒体文件关联存储。
第一多媒体文件中的第一视频流具有图像融合效果。如此,在视频拍摄结束时,用户就可以观看所存储的第一多媒体文件中的带有图像融合效果的第一视频流,且可以将第一多媒体文件即时分享给其他人进行观看。
第二多媒体文件中的该多路视频流是未经图像融合处理的原始视频流,即是不带有图像融合效果的视频流。第二多媒体文件中的该图像融合参数用于指示第二多媒体文件中该多路视频流在后续融合时需要采用的图像融合方式。如此,在视频拍摄结束后,该终端不仅可以根据所存储的第二多媒体文件实现对该多路视频流中每路视频流的播放,还可以根据所存储的第二多媒体文件中的该多路视频流和该图像融合参数生成带有图像融合效果的融合视频流。由于在视频拍摄结束后,该终端无需再进行视频实时录制,所以可以提供较高的视频处理能力,从而该终端根据第二多媒体文件生成的融合视频流的图像融合效果优于在视频拍摄过程中生成的第一多媒体文件中的第一视频流的图像融合效果,如此可以使得用户最终得到图像融合效果更好的视频流来进行播放。
作为一种示例,该终端将第一多媒体文件与第二多媒体文件关联存储时,可以是将第一多媒体文件和第二多媒体文件绑定关联来形成一个视频容器,这个视频容器在本申请实施例中可称为双视频容器。也即,该终端可以在双视频容器中存放第一多媒体文件和第二多媒体文件,以实现对第一多媒体文件和第二多媒体文件的关联存储。
比如,若第二多媒体文件是通过上述步骤805中的第一种方式得到的,则如图9所示,该双视频容器中可以存放有第一多媒体文件,第一多媒体文件包含有第一视频流,第一视频流带有图像融合效果,并且,该双视频容器中还可以存放有第二多媒体文件,第二多媒体文件包括多个封装文件,如图9所示的封装文件A和封装文件B,该多个封装文件中每个封装文件包含一路视频流和该图像融合参数,这路视频流是不带有图像融合效果的原始视频流,如图9所示的封装文件A中包含有视频流A,封装文件B中包含有视频流B,视频流A和视频流B均不带有图像融合效果。
又比如,若第二多媒体文件是通过上述步骤805中的第二种方式得到的,则如图10所示,该双视频容器中可以存放有第一多媒体文件,第一多媒体文件包含有第一视频流,第一视频流具有图像融合效果,并且,该双视频容器中还可以存放有第二多媒体文件,第二多媒体文件包含有多路视频流(如图10所示的视频流A和视频流B)和该图像融合参数,该多路视频流均是不带有图像融合效果的原始视频流。
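双视频容器的一种最简示意如下（仅表达"第一多媒体文件与第二多媒体文件绑定关联"这一关系，字段名与文件名均为假设）：

```python
import json

def store_in_dual_video_container(first_media: str, second_media: list, out_path: str) -> None:
    """把第一多媒体文件与第二多媒体文件绑定关联，形成双视频容器的一种最简表示（示意）。"""
    container = {
        "display_file": first_media,      # 带图像融合效果，供即时播放和分享
        "source_files": second_media,     # 原始多路视频流 + 图像融合参数，供后处理
    }
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(container, f, ensure_ascii=False, indent=2)

# 示例（文件名均为假设）：
# store_in_dual_video_container("first_media.mp4", ["second_media.mp4"], "dual_container.json")
```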
需注意的是,根据录像场景的不同,双视频容器的实现规格也有所不同。示例地,在多摄同录场景和单摄同录场景中,双视频容器的实现规格可以如下表1所示。
表1
本申请实施例仅以上表1为例来对双视频容器的实现规格进行说明,上表1并不对本申请实施例构成限定。
该终端在视频拍摄过程中,可以不断生成第一多媒体文件和第二多媒体文件,并将两者进行关联存储。进一步地,在视频拍摄结束后,该终端还可以在视频列表(也可称为图库)中展示所存储的第一多媒体文件中的第一视频流,以便用户可以选择播放第一多媒体文件中的第一视频流。
作为一种示例,该终端可以在视频列表中显示关联按钮。该关联按钮用于指示展示第一多媒体文件关联的第二多媒体文件。如此,该终端若检测到对关联按钮的选择操作,则可以展示第二多媒体文件中的多路视频流,以便用户可以获知第一多媒体文件中的第一视频流是由哪些原始视频流融合得到的,且便于用户选择播放第二多媒体文件中的该多路视频流中的任意一路视频流。
比如,如图11所示,该终端可以在视频列表1101中展示第一多媒体文件的第一视频流1102,并显示关联按钮1103。这种情况下,用户可以选择播放第一视频流1102。之后,如图11中的(a)图所示,若用户点击关联按钮1103,则如图11中的(b)图所示,该终端响应于对关联按钮1103的点击操作(即选择操作),展示第二多媒体文件中的该多路视频流1104。这种情况下,用户可以选择播放该多路视频流1104中的任意一路视频流1104。
作为另一种示例,该终端可以在视频列表中显示第二多媒体文件中的多路视频流中每路视频流对应的视频缩略图。如此,该终端若检测到对所显示的任意一个视频缩略图的选择操作,则可以展示第二多媒体文件中与这个视频缩略图对应的一路视频流,以便用户可以选择播放这一路视频流。
当然,除了上述两种示例性的方式之外,该终端也可以通过其他方式展示第二多媒体文件中的多路视频流,本申请实施例对此不作限定。
进一步地,在视频拍摄结束后,该终端还可以从第二多媒体文件中获取该多路视频流,然后播放该多路视频流中的至少一路视频流。比如,该终端可以在视频列表中展示该多路视频流,然后用户可以选择播放该多路视频流中的至少一路视频流。
之后,若终端在该至少一路视频流的播放过程中接收到针对该至少一路视频流的视频图像的融合调整指令,则根据该融合调整指令携带的融合调整信息更新第二多媒体文件中的该图像融合参数。
该融合调整指令用于指示调整该多路视频流所需采用的图像融合方式。用户可以在该至少一路视频流的播放过程中,根据自身的需求手动触发融合调整指令,该融合调整指令用于指示改变图像融合方式,如可以指示改变图像拼接模式,和/或,改变各路视频流的图像拼接位置。也即,该融合调整指令中携带的融合调整信息可以包括需调整至的图像拼接模式,和/或,可以包括各路视频流需调整至的图像拼接位置。如此,该终端就可以根据该融合调整信息修改第二多媒体文件中的该图像融合参数,实现对该图像融合参数的更新,以使得后续根据第二多媒体文件中的该图像融合参数所进行 的图像融合处理满足用户的最新需求。
比如,该多路视频流包括视频流A和视频流B,在视频拍摄过程中视频流A和视频流B在前10秒的图像拼接模式为上下拼接模式,在10秒后的图像拼接模式为左右拼接模式。这种情况下,第二多媒体文件中的图像融合参数中时间戳在前10秒内的图像融合参数中的图像拼接模式均为上下拼接模式,时间戳在10秒后的图像融合参数中的图像拼接模式均为左右拼接模式。
在视频拍摄结束后,该终端根据第二多媒体文件播放视频流A、或播放视频流B,或同时分别播放视频流A和视频流B。此时,用户若想要调整前3秒的图像拼接模式为左右拼接模式,则在视频流A和/或视频流B的播放过程中,可以触发针对视频流A和/或视频流B的前3秒的视频图像的融合调整指令,该融合调整指令用于指示调整前3秒的视频图像的图像拼接模式为左右拼接模式。这种情况下,该终端根据该融合调整指令更新第二多媒体文件中的图像融合参数,更新后的图像融合参数中时间戳在前3秒内的图像融合参数中的图像拼接模式均为左右拼接模式,时间戳在3秒到10秒内的图像融合参数中的图像拼接模式均为上下拼接模式,时间戳在10秒后的图像融合参数中的图像拼接模式均为左右拼接模式。
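该更新过程可以示意为按时间范围修改参数流中的图像拼接模式（Python草图，参数流中每个元素以字典表示，字段名为假设）：

```python
from typing import List

def update_fusion_params(param_stream: List[dict],
                         start_us: int, end_us: int, new_mode: str) -> None:
    """根据融合调整指令，把指定时间段内的图像融合参数改为新的图像拼接模式。"""
    for param in param_stream:
        if start_us <= param["timestamp_us"] < end_us:
            param["mode"] = new_mode

# 示例：把前3秒的图像拼接模式调整为左右拼接
# update_fusion_params(param_stream, 0, 3_000_000, "left_right")
```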
进一步地,在视频拍摄结束后,该终端可以根据第二多媒体文件生成第三多媒体文件。具体地,该终端可以从第二多媒体文件中获取该多路视频流和该图像融合参数,然后根据该图像融合参数对该多路视频流进行图像融合处理,得到第二视频流,根据第二视频流生成第三多媒体文件。更进一步地,该终端还可以将与第二多媒体文件关联存储的第一多媒体文件更新为第三多媒体文件。
这种情况下,第一视频流对应的图像融合参数与第二视频流对应的图像融合参数相同,也即,该终端是使用相同的图像融合方式对该多路视频流进行图像融合处理,来得到第一视频流和第二视频流的。然而,在视频拍摄过程中,受Camera器件、处理芯片、图像算法等限制,该终端在保证视频实时录制的同时,很难兼顾视频处理能力,因而该终端在视频拍摄过程中生成的第一视频流的图像融合效果很有可能不佳。而在视频拍摄结束后,该终端无需再进行视频实时录制,所以可以提供较高的视频处理能力,从而该终端此时生成的第二视频流的图像融合效果优于在视频拍摄过程中生成的第一视频流的图像融合效果。
这种情况下,该终端将与第二多媒体文件关联存储的第一多媒体文件更新为第三多媒体文件,使得与第二多媒体文件关联存储的多媒体文件中的视频流为图像融合效果更好的视频流,如此可以使得用户最终得到图像融合效果更好的视频流来进行播放。
需注意的是,若该终端根据用户触发的融合调整指令对第二多媒体文件中的该图像融合参数进行了更新,则该终端可以根据第二多媒体文件生成第三多媒体文件。然后将与第二多媒体文件关联存储的多媒体文件(有可能为第一多媒体文件,也有可能为旧的第三多媒体文件)更新为新生成的第三多媒体文件,以使得与第二多媒体文件关联存储的多媒体文件中的视频流为满足用户最新的图像融合需求的图像融合效果佳的视频流。
其中，该终端从第二多媒体文件中获取该多路视频流和该图像融合参数时，可以先对第二多媒体文件解封装（demux），得到多个视频文件和该图像融合参数，然后再分别对该多个视频文件中的每个视频文件进行解码，得到该多路视频流。
其中,该终端根据第二视频流生成第三多媒体文件的方式与上述根据第一视频流生成第一多媒体文件的方式类似,本申请实施例对此不再赘述。
比如,如图12所示,该终端对第二多媒体文件解封装,得到视频文件A、视频文件B和该图像融合参数,然后对视频文件A解码得到视频流A,对视频文件B解码得到视频流B,之后,根据该图像融合参数对视频流A和视频流B进行图像融合处理,得到第二视频流。之后,该终端对第二视频流进行编码,得到视频文件C,对视频文件C进行封装,得到第三多媒体文件。
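上述后处理流程可以概括为如下Python草图（沿用前文清单结构表示第二多媒体文件；decode、fuse_frame、encode、mux均为假设的函数句柄，仅表达流程顺序，并非限定实现）：

```python
def regenerate_after_recording(second_media_file, decode, fuse_frame, encode, mux):
    """视频拍摄结束后的后处理流程草图：
    解封装第二多媒体文件 -> 解码各路视频流 -> 按图像融合参数逐帧融合 ->
    编码并封装为第三多媒体文件。"""
    video_files = second_media_file["video_tracks"]
    param_stream = second_media_file["param_track"]
    streams = {name: decode(name) for name in video_files}   # 每路原始视频流的帧序列
    fused_frames = []
    for i, param in enumerate(param_stream):                  # 逐帧融合，时间戳一一对齐
        frames = {name: stream[i] for name, stream in streams.items()}
        fused_frames.append(fuse_frame(frames, param))
    return mux(encode(fused_frames))                          # 得到第三多媒体文件
```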
在本申请实施例中,在视频拍摄过程中,获取多路视频流。之后,对该多路视频流进行图像融合处理,得到第一视频流,并获取第一视频流对应的图像融合参数,该图像融合参数用于指示在得到第一视频流时多路视频流的图像融合方式。之后,生成包含有第一视频流的第一多媒体文件,以及生成包含有该多路视频流和该图像融合参数的第二多媒体文件,第二多媒体文件中的图像融合参数用于指示第二多媒体文件中该多路视频流在后续融合时需要采用的图像融合方式。将第一多媒体文件与第二多媒体文件关联存储。如此,在视频拍摄结束后,用户就可以观看所存储的第一多媒体文件中的带有图像融合效果的第一视频流,且可以将第一多媒体文件即时分享给其他人进行观看。并且,该终端还可以根据所存储的第二多媒体文件中的该多路视频流和该图像融合参数生成带有图像融合效果的融合视频流。由于在视频拍摄结束后,该终端无需再进行视频实时录制,所以可以提供较高的视频处理能力,从而该终端根据第二多媒体文件生成的融合视频流的图像融合效果优于在视频拍摄过程中生成的第一多媒体文件中的第一视频流的图像融合效果,如此可以使得用户最终得到图像融合效果更好的视频流来进行播放。
为了便于理解,下面结合图13至图16来对上述视频拍摄方法进行举例说明。
下面结合图13和图14对多摄同录场景下的视频拍摄方法进行说明。
图13是本申请实施例提供的一种视频拍摄方法的示意图。该方法应用于多摄同录场景,这种情况下,该终端通过摄像头A和摄像头B同时进行录像。该方法可以包括如下步骤(1)至步骤(4):
(1)摄像头A采集视频流A,视频流A经过ISP前端模块0和ISP后端模块0的处理后,传输至图像融合模块和关联存储模块。
示例地,摄像头A采集的视频流A的视频图像可以是RAW格式的,ISP前端模块0可以将视频流A的RAW格式的视频图像转换为YUV格式的视频图像,ISP后端模块0可以对视频流A的YUV格式的视频图像进行基础处理,如调整对比度、去除噪声等。
(2)摄像头B采集视频流B,视频流B经过ISP前端模块1和ISP后端模块1的处理后,传输至图像融合模块和关联存储模块。
示例地,摄像头B采集的视频流B的视频图像可以是RAW格式的,ISP前端模块1可以将视频流B的RAW格式的视频图像转换为YUV格式的视频图像,ISP后端模块1可以对视频流B的YUV格式的视频图像进行基础处理,如调整对比度、去除噪声等。
(3)图像融合模块对视频流A和视频流B进行图像融合处理,得到第一视频流,将第一视频流对应的图像融合参数发送至关联存储模块,第一视频流带有图像融合效果。
可选地,可将第一视频流的视频图像作为预览视频图像显示于录像界面,实现视频预览(preview)。可选地,还可以生成包含有第一视频流的第一多媒体文件并存储。
(4)关联存储模块生成包含有视频流A、视频流B和该图像融合参数的第二多媒体文件,并将第二多媒体文件与第一多媒体文件进行关联存储。
值得注意的是,本申请实施例中,预览视频图像可以是第一视频流的视频图像,也即,预览视频图像和所存储的第一多媒体文件中的第一视频流的视频图像对应的图像融合方式相同。但本申请实施例仅是以此为例进行说明,实际使用时,预览视频图像和所存储的第一多媒体文件中的第一视频流的视频图像对应的图像融合方式也可以不同。这种情况下,如图14所示,该方法可以包括如下步骤a至步骤e:
步骤a:摄像头A采集视频流A,视频流A经过ISP前端模块0和ISP后端模块0的处理后,传输至预览模块、成片模块和关联存储模块。
步骤b:摄像头B采集视频流B,视频流B经过ISP前端模块1和ISP后端模块1的处理后,传输至预览模块、成片模块和关联存储模块。
步骤c:预览模块对视频流A和视频流B进行图像融合处理,得到预览视频流,将预览视频流的视频图像作为预览视频图像显示于录像界面,预览视频流带有图像融合效果。
步骤d:成片模块对视频流A和视频流B进行图像融合处理,得到第一视频流,将第一视频流对应的图像融合参数发送至关联存储模块,生成包含有第一视频流的第一多媒体文件并存储。
这种情况下,预览模块和成片模块所使用的图像融合方式可以不同。并且,相比于成片模块,预览模块对视频流A和视频流B进行图像融合处理时的操作更为简单一些,比如,成片模块在对视频流A和视频流B进行图像融合处理时需要进行图像防抖处理,而预览模块则无需进行图像防抖处理。
步骤e:关联存储模块生成包含有视频流A、视频流B和该图像融合参数的第二多媒体文件,并将第二多媒体文件与第一多媒体文件进行关联存储。
下面结合图15和图16对单摄同录场景下的视频拍摄方法进行说明。
图15是本申请实施例提供的一种视频拍摄方法的示意图。该方法应用于单摄同录场景,这种情况下,该终端通过摄像头A进行录像。该方法可以包括如下步骤(1)至步骤(5):
(1)摄像头A采集视频流A,视频流A经过ISP前端模块0的处理后,传输至ISP后端模块0和ISP后端模块1。
示例地,摄像头A采集的视频流A的视频图像可以是RAW格式的,ISP前端模块0可以将视频流A的RAW格式的视频图像转换为YUV格式的视频图像。
(2)ISP后端模块0对视频流A进行基础处理后,将视频流A传输至图像融合模块和关联存储模块。
示例地,ISP后端模块0可以对视频流A的YUV格式的视频图像进行基础处理, 如调整对比度、去除噪声等。
(3)ISP后端模块1对视频流A进行图像处理后,得到视频流A',将视频流A'传输至图像融合模块和关联存储模块。
示例地,ISP后端模块1可以对视频流A的YUV格式的视频图像进行图像处理,如可以基于特定的逻辑对视频流A的YUV格式的视频图像进行放大处理和裁切处理等。比如,此特定的逻辑可以是人体追踪或其他显著性主体追踪等逻辑。
(4)图像融合模块对视频流A和视频流A'进行图像融合处理,得到第一视频流,将第一视频流对应的图像融合参数发送至关联存储模块,第一视频流带有图像融合效果。
可选地,可将第一视频流的视频图像作为预览视频图像显示于录像界面,实现视频预览。可选地,还可以生成包含有第一视频流的第一多媒体文件并存储。
(5)关联存储模块生成包含有视频流A、视频流A'和该图像融合参数的第二多媒体文件,并将第二多媒体文件与第一多媒体文件进行关联存储。
值得注意的是,本申请实施例中,预览视频图像可以是第一视频流的视频图像,也即,预览视频图像和所存储的第一多媒体文件中的第一视频流的视频图像对应的图像融合方式相同。但本申请实施例仅是以此为例进行说明,实际使用时,预览视频图像和所存储的第一多媒体文件中的第一视频流的视频图像对应的图像融合方式也可以不同。这种情况下,如图16所示,该方法可以包括如下步骤a至步骤f:
步骤a:摄像头A采集视频流A,视频流A经过ISP前端模块0的处理后,传输至ISP后端模块0和ISP后端模块1。
步骤b:ISP后端模块0对视频流A进行基础处理后,将视频流A传输至预览模块、成片模块和关联存储模块。
步骤c:ISP后端模块1对视频流A进行图像处理后,得到视频流A',将视频流A'传输至预览模块、成片模块和关联存储模块。
步骤d:预览模块对视频流A和视频流A'进行图像融合处理,得到预览视频流,将预览视频流的视频图像作为预览视频图像显示于录像界面,预览视频流带有图像融合效果。
步骤e:成片模块对视频流A和视频流A'进行图像融合处理,得到第一视频流,将第一视频流对应的图像融合参数发送至关联存储模块,生成包含有第一视频流的第一多媒体文件并存储。
这种情况下,预览模块和成片模块所使用的图像融合方式可以不同。并且,相比于成片模块,预览模块对视频流A和视频流A'进行图像融合处理时的操作更为简单一些,比如,成片模块在对视频流A和视频流A'进行图像融合处理时需要进行图像防抖处理,而预览模块则无需进行图像防抖处理。
步骤f:关联存储模块生成包含有视频流A、视频流A'和该图像融合参数的第二多媒体文件,并将第二多媒体文件与第一多媒体文件进行关联存储。
图17是本申请实施例提供的一种视频拍摄装置的结构示意图，该装置可以由软件、硬件或者两者的结合实现成为计算机设备的部分或者全部，该计算机设备可以为上文图1至图2实施例所述的终端100。参见图17，该装置包括：第一获取模块1701、处理模块1702、第二获取模块1703、第一生成模块1704、第二生成模块1705和存储模块1706。
第一获取模块1701,用于在视频拍摄过程中,获取多路视频流;
处理模块1702,用于对多路视频流进行图像融合处理,得到第一视频流;
第二获取模块1703,用于获取第一视频流对应的图像融合参数,图像融合参数用于指示在得到第一视频流时多路视频流的图像融合方式;
第一生成模块1704,用于生成包含有第一视频流的第一多媒体文件;
第二生成模块1705,用于生成包含有多路视频流和图像融合参数的第二多媒体文件;
存储模块1706,用于将第一多媒体文件与第二多媒体文件关联存储。
可选地,第一获取模块1701用于:
获取多个摄像头中的每个摄像头采集的一路视频流,以得到多路视频流;
其中,多个摄像头均设置于终端;或者,多个摄像头中的一部分摄像头设置于终端,另一部分摄像头设置于与终端处于多屏协同状态的协同设备。
可选地,第一获取模块1701用于:
获取摄像头采集的一路视频流;
对这一路视频流进行图像处理,得到另一路视频流。
可选地,图像融合参数包括图像拼接模式,图像拼接模式包括上下拼接模式、左右拼接模式、画中画嵌套模式中的一种或多种。
可选地,该装置还包括:
显示模块,用于在录像界面显示第一视频流的视频图像。
可选地,第二生成模块1705用于:
分别对多路视频流中的每路视频流进行编码,得到多个视频文件;
对于多个视频文件中任意的一个视频文件,将这一个视频文件作为一个视频轨道,将图像融合参数作为参数轨道,对这一个视频轨道和参数轨道进行封装,得到对应的一个多轨道文件;
将与多个视频文件一一对应的多个多轨道文件确定为第二多媒体文件。
可选地,第二生成模块1705用于:
分别对多路视频流中的每路视频流进行编码,得到多个视频文件;
将多个视频文件中的每个视频文件均作为一个视频轨道,以得到多个视频轨道;
将图像融合参数作为参数轨道;
对多个视频轨道和参数轨道进行封装,得到第二多媒体文件。
可选地,该装置还包括:
第三获取模块,用于在视频拍摄结束后,从第二多媒体文件中获取多路视频流和图像融合参数;
处理模块1702,还用于根据图像融合参数对多路视频流进行图像融合处理,得到第二视频流;
第一生成模块1704,用于根据第二视频流生成第三多媒体文件。
可选地,该装置还包括:
第一更新模块,用于将与第二多媒体文件关联存储的第一多媒体文件更新为第三多媒体文件。
可选地,该装置还包括:
第四获取模块,用于在视频拍摄结束后,从第二多媒体文件中获取多路视频流;
播放模块,用于播放多路视频流中的至少一路视频流;
第二更新模块,用于若在该至少一路视频流的播放过程中接收到针对该至少一路视频流的视频图像的融合调整指令,则根据融合调整指令携带的融合调整信息更新第二多媒体文件中的图像融合参数。
可选地,该装置还包括:
第一展示模块,用于在视频拍摄结束后,在视频列表中展示第一多媒体文件中的第一视频流和关联按钮;
第二展示模块,用于若检测到对关联按钮的选择操作,则展示第二多媒体文件中的多路视频流。
在本申请实施例中,在视频拍摄过程中,获取多路视频流。之后,对该多路视频流进行图像融合处理,得到第一视频流,并获取第一视频流对应的图像融合参数,该图像融合参数用于指示在得到第一视频流时多路视频流的图像融合方式。之后,生成包含有第一视频流的第一多媒体文件,以及生成包含有该多路视频流和该图像融合参数的第二多媒体文件,第二多媒体文件中的图像融合参数用于指示第二多媒体文件中该多路视频流在后续融合时需要采用的图像融合方式。将第一多媒体文件与第二多媒体文件关联存储。如此,在视频拍摄结束后,用户就可以观看所存储的第一多媒体文件中的带有图像融合效果的第一视频流,且可以将第一多媒体文件即时分享给其他人进行观看。并且,该装置还可以根据所存储的第二多媒体文件中的该多路视频流和该图像融合参数生成带有图像融合效果的融合视频流。由于在视频拍摄结束后,该装置无需再进行视频实时录制,所以可以提供较高的视频处理能力,从而根据第二多媒体文件生成的融合视频流的图像融合效果优于在视频拍摄过程中生成的第一多媒体文件中的第一视频流的图像融合效果,如此可以使得用户最终得到图像融合效果更好的视频流来进行播放。
需要说明的是:上述实施例提供的视频拍摄装置在视频拍摄时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。
上述实施例中的各功能单元、模块可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中,上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。另外,各功能单元、模块的具体名称也只是为了便于相互区分,并不用于限制本申请实施例的保护范围。
上述实施例提供的视频拍摄装置与视频拍摄方法实施例属于同一构思,上述实施例中单元、模块的具体工作过程及带来的技术效果,可参见方法实施例部分,此处不再赘述。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意结合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络或其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,比如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(比如:同轴电缆、光纤、数据用户线(Digital Subscriber Line,DSL))或无线(比如:红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质,或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(比如:软盘、硬盘、磁带)、光介质(比如:数字通用光盘(Digital Versatile Disc,DVD))或半导体介质(比如:固态硬盘(Solid State Disk,SSD))等。
以上所述为本申请提供的可选实施例,并不用以限制本申请,凡在本申请的揭露的技术范围之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (14)

  1. 一种视频拍摄方法,其特征在于,应用于终端,所述方法包括:
    在视频拍摄过程中,获取多路视频流;
    对所述多路视频流进行图像融合处理,得到第一视频流;
    获取所述第一视频流对应的图像融合参数,所述图像融合参数用于指示在得到所述第一视频流时所述多路视频流的图像融合方式;
    生成包含有所述第一视频流的第一多媒体文件;
    生成包含有所述多路视频流和所述图像融合参数的第二多媒体文件;
    将所述第一多媒体文件与所述第二多媒体文件关联存储。
  2. 如权利要求1所述的方法,其特征在于,所述获取多路视频流,包括:
    获取多个摄像头中的每个摄像头采集的一路视频流,以得到所述多路视频流;
    其中,所述多个摄像头均设置于所述终端;或者,所述多个摄像头中的一部分摄像头设置于所述终端,另一部分摄像头设置于与所述终端处于多屏协同状态的协同设备。
  3. 如权利要求1所述的方法,其特征在于,所述获取多路视频流,包括:
    获取摄像头采集的一路视频流;
    对所述一路视频流进行图像处理,得到另一路视频流。
  4. 如权利要求1至3任一所述的方法,其特征在于,所述图像融合参数包括图像拼接模式,所述图像拼接模式包括上下拼接模式、左右拼接模式、画中画嵌套模式中的一种或多种。
  5. 如权利要求1至4任一所述的方法,其特征在于,所述对所述多路视频流进行图像融合处理,得到第一视频流之后,还包括:
    在录像界面显示所述第一视频流的视频图像。
  6. 如权利要求1至5任一所述的方法,其特征在于,所述生成包含有所述多路视频流和所述图像融合参数的第二多媒体文件,包括:
    分别对所述多路视频流中的每路视频流进行编码,得到多个视频文件;
    对于所述多个视频文件中任意的一个视频文件,将所述一个视频文件作为一个视频轨道,将所述图像融合参数作为参数轨道,对所述一个视频轨道和所述参数轨道进行封装,得到对应的一个多轨道文件;
    将与所述多个视频文件一一对应的多个多轨道文件确定为所述第二多媒体文件。
  7. 如权利要求1至5任一所述的方法,其特征在于,所述生成包含有所述多路视频流和所述图像融合参数的第二多媒体文件,包括:
    分别对所述多路视频流中的每路视频流进行编码,得到多个视频文件;
    将所述多个视频文件中的每个视频文件均作为一个视频轨道,以得到多个视频轨道;
    将所述图像融合参数作为参数轨道;
    对所述多个视频轨道和所述参数轨道进行封装,得到所述第二多媒体文件。
  8. 如权利要求1至7任一所述的方法,其特征在于,所述方法还包括:
    在视频拍摄结束后,从所述第二多媒体文件中获取所述多路视频流和所述图像融合参数;
    根据所述图像融合参数对所述多路视频流进行图像融合处理,得到第二视频流;
    根据所述第二视频流生成第三多媒体文件。
  9. 如权利要求8所述的方法,其特征在于,所述根据所述第二视频流生成第三多媒体文件之后,还包括:
    将与所述第二多媒体文件关联存储的所述第一多媒体文件更新为所述第三多媒体文件。
  10. 如权利要求1至9任一所述的方法,其特征在于,所述方法还包括:
    在视频拍摄结束后,从所述第二多媒体文件中获取所述多路视频流;
    播放所述多路视频流中的至少一路视频流;
    若在所述至少一路视频流的播放过程中接收到针对所述至少一路视频流的视频图像的融合调整指令,则根据所述融合调整指令携带的融合调整信息更新所述第二多媒体文件中的所述图像融合参数。
  11. 如权利要求1至10任一所述的方法,其特征在于,所述方法还包括:
    在视频拍摄结束后,在视频列表中展示所述第一多媒体文件中的所述第一视频流和关联按钮;
    若检测到对所述关联按钮的选择操作,则展示所述第二多媒体文件中的所述多路视频流。
  12. 一种视频拍摄装置,其特征在于,所述装置包括:
    第一获取模块,用于在视频拍摄过程中,获取多路视频流;
    处理模块,用于对所述多路视频流进行图像融合处理,得到第一视频流;
    第二获取模块,用于获取所述第一视频流对应的图像融合参数,所述图像融合参数用于指示在得到所述第一视频流时所述多路视频流的图像融合方式;
    第一生成模块,用于生成包含有所述第一视频流的第一多媒体文件;
    第二生成模块,用于生成包含有所述多路视频流和所述图像融合参数的第二多媒体文件;
    存储模块,用于将所述第一多媒体文件与所述第二多媒体文件关联存储。
  13. 一种计算机设备,其特征在于,所述计算机设备包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序,所述计算机程序被所述处理器执行时实现如权利要求1至11任意一项所述的方法。
  14. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有指令,当其在计算机上运行时,使得计算机执行如权利要求1至11任意一项所述的方法。