WO2022062642A1 - Video processing method, display device and storage medium - Google Patents

Video processing method, display device and storage medium

Info

Publication number
WO2022062642A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
pose
displayed
pose data
display device
Prior art date
Application number
PCT/CN2021/109020
Other languages
English (en)
French (fr)
Inventor
杨骁
吕晴阳
李耔余
陈怡
陈浩
王国晖
杨建朝
任龙
刘舒
连晓晨
梅星
Original Assignee
杨骁
字节跳动有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杨骁, 字节跳动有限公司
Publication of WO2022062642A1 publication Critical patent/WO2022062642A1/zh

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/4104 Peripherals receiving signals from specially adapted client devices
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42204 User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor
    • H04N21/42206 User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor characterized by hardware details
    • H04N21/42222 Additional components integrated in the remote control device, e.g. timer, speaker, sensors for detecting position, direction or movement of the remote control, microphone or battery charging device
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/816 Monomedia components thereof involving special video data, e.g. 3D video

Definitions

  • Embodiments of the present disclosure relate to a video processing method, a display device, and a storage medium.
  • Short videos have the characteristics of strong social attributes, easy creation and short duration, which are more in line with the fragmented content consumption habits of users in the mobile Internet era.
  • Augmented Reality (AR) technology ingeniously integrates virtual information with the real world: virtual information such as text, images, 3D models, music, and videos is simulated and applied to the real world, so that real-world information and virtual information complement each other, thereby "enhancing" the real world. AR's unique fusion of the virtual and the real gives it unlimited room for expansion in the short-video field.
  • At least one embodiment of the present disclosure provides a video processing method for a display device, including: acquiring a video to be processed, and adding a plurality of video timestamps to a plurality of video frames of the video to be processed, wherein the plurality of video frames and the plurality of video timestamps are in one-to-one correspondence;
  • acquiring N pose data of the display device, marking N pose timestamps on the N pose data respectively, and caching the N pose data and the N pose timestamps, wherein the N pose data and the N pose timestamps are in one-to-one correspondence, and N is a positive integer;
  • extracting a video frame to be displayed from the plurality of video frames, and determining at least one pose data corresponding to the video frame to be displayed from the N pose data according to the video timestamp corresponding to the video frame to be displayed;
  • adjusting the pose of a virtual model based on the at least one pose data corresponding to the video frame to be displayed, so as to obtain a virtual model to be displayed; and simultaneously displaying, by the display device, the video frame to be displayed and the virtual model to be displayed.
  • At least one embodiment of the present disclosure provides a display device, comprising: a memory for non-transitory storage of computer-readable instructions; and a processor for executing the computer-readable instructions, wherein the computer-readable instructions, when executed by the processor, implement the video processing method according to any embodiment of the present disclosure.
  • At least one embodiment of the present disclosure provides a non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium stores computer-readable instructions that, when executed by a processor, implement the video processing method according to any embodiment of the present disclosure.
  • FIG. 1 is a schematic flowchart of a video processing method provided by at least one embodiment of the present disclosure
  • FIG. 2 is a schematic block diagram of a display device provided by at least one embodiment of the present disclosure;
  • FIG. 3 is a schematic diagram of a non-transitory computer-readable storage medium provided by at least one embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of an electronic device according to at least one embodiment of the present disclosure.
  • the term "including" and variations thereof are open-ended inclusions, i.e., "including but not limited to".
  • the term “based on” is “based at least in part on.”
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • the AR special effect can follow the movement or rotation of the electronic device and accordingly move or rotate on the screen in real time (in practice there may be a small, negligible delay, and the AR special effect may move out of the screen); that is, the movement of the AR special effect is consistent with the movement of the electronic device. Each video frame displayed on the screen, however, also has a delay, and the delay of the video frame is greater than that of the AR special effect.
  • as a result, the currently displayed picture was captured by the camera in the camera pose of a previous moment, while the currently displayed AR special effect is adjusted according to the camera pose of the current moment; consequently, the AR special effect cannot be aligned with the picture displayed on the screen, which degrades the landmark AR special effect.
  • At least one embodiment of the present disclosure provides a video processing method, a display device, and a non-transitory computer-readable storage medium.
  • a video processing method is used for a display device, and the video processing method includes: acquiring a video to be processed, and adding multiple video timestamps to multiple video frames of the video to be processed, wherein the multiple video frames and the multiple video timestamps are in one-to-one correspondence;
  • acquiring N pose data of the display device, marking N pose timestamps on the N pose data respectively, and caching the N pose data and the N pose timestamps, wherein the N pose data and the N pose timestamps are in one-to-one correspondence, and N is a positive integer;
  • extracting the video frame to be displayed from the multiple video frames, and determining, from the N pose data, at least one pose data corresponding to the video frame to be displayed according to the video timestamp corresponding to the video frame to be displayed;
  • adjusting the pose of the virtual model based on the at least one pose data corresponding to the video frame to be displayed, so as to obtain the virtual model to be displayed;
  • and simultaneously displaying the video frame to be displayed and the virtual model to be displayed through the display device.
  • the video processing method synchronizes the video frames with the camera pose data, so that the AR special effects can be aligned with the images displayed on the screen.
  • when the display device realizes landmark AR special effects, the AR special effects are more accurately aligned with the images displayed on the screen (for example, a landmark building), thereby providing users with a better visual experience of the AR special effects.
  • the video processing method provided by the embodiment of the present disclosure may be configured on the display device provided by the embodiment of the present disclosure, for example, the video processing method may be configured in an application program of the display device.
  • the display device may be a personal computer, a mobile terminal, etc.
  • the mobile terminal may be a hardware device with various operating systems, such as a mobile phone and a tablet computer.
  • the application can be Douyin, etc.
  • FIG. 1 is a schematic flowchart of a video processing method provided by at least one embodiment of the present disclosure.
  • the video processing method can be applied to a display device. As shown in FIG. 1 , the video processing method includes steps S10 to S14.
  • Step S10: acquire the video to be processed, and add a plurality of video timestamps to multiple video frames of the video to be processed;
  • Step S11: acquire N pose data of the display device, mark N pose timestamps on the N pose data respectively, and cache the N pose data and N pose timestamps;
  • Step S12: extract a video frame to be displayed from the multiple video frames, and determine at least one pose data corresponding to the video frame to be displayed from the N pose data according to the video timestamp corresponding to the video frame to be displayed;
  • Step S13: adjust the pose of the virtual model based on the at least one pose data corresponding to the video frame to be displayed, so as to obtain the virtual model to be displayed;
  • Step S14: simultaneously display the video frame to be displayed and the virtual model to be displayed through the display device.
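As a non-normative illustration only (the disclosure contains no code), the flow of steps S10 to S14 above might be sketched in Python roughly as follows; all names, the cache size, and the sample timings here are hypothetical and not part of the disclosure:

```python
from collections import deque

# Hypothetical sketch of steps S10-S14; names and values are illustrative only.
N = 5  # pose-cache size (the disclosure suggests 100-200 in practice)
pose_cache = deque(maxlen=N)  # holds (pose_timestamp, pose_data) pairs

def on_pose_sample(pose, clock):
    """Step S11: stamp each pose sample with the system clock and cache it."""
    pose_cache.append((clock, pose))

def nearest_pose(video_ts):
    """Step S12: pick the cached pose whose timestamp is closest to the frame's."""
    return min(pose_cache, key=lambda p: abs(p[0] - video_ts))

# Simulated capture: pose samples arrive more often than video frames.
for t in [0.0, 0.0125, 0.025, 0.0375, 0.05]:
    on_pose_sample({"pos": (t, 0.0, 0.0), "angle": t * 10}, clock=t)

video_ts = 0.03                    # step S10: frame stamped at capture time
ts, pose = nearest_pose(video_ts)  # step S12: closest cached sample
print(ts)  # 0.025
```

Steps S13 and S14 (adjusting the virtual model with `pose` and displaying it together with the frame) would then consume the selected sample.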
  • step S10 multiple video frames are in one-to-one correspondence with multiple video timestamps.
  • the display device may include a video capture device for capturing images and/or videos, and the like.
  • the video capture device may include a camera, a video camera, and the like.
  • the video capture device may be set integrally with the display device, or the video capture device may be set separately from the display device, and communicated with the display device through wireless (eg, Bluetooth, etc.) or wired communication.
  • acquiring the video to be processed includes: using a video acquisition device to acquire the video of the target object to obtain the video to be processed.
  • the rate of acquiring the video to be processed may be 30 frames per second, that is, 30 video frames are acquired per second.
  • the present disclosure is not limited to this; the rate of acquiring the video to be processed may be set according to the actual situation, for example, 60 frames per second.
  • when the display device (for example, a mobile phone) has its anti-shake function enabled, the video frames of the acquired video are smoothed video frames; such frames cannot be aligned on the time axis with the pose data of the video capture device (e.g., camera) obtained by the sensors of the display device or by a SLAM (Simultaneous Localization and Mapping) system. Therefore, the anti-shake function needs to be turned off, so that the captured video frames and the pose data of the video capture device can be aligned on the time axis.
  • the video to be processed may be the video collected in real time by the video collection device, or may be the video collected in advance and stored in the display device.
  • the video to be processed may include target objects. The target objects may be outdoor objects such as landmark buildings (e.g., Yueyang Tower in Yueyang, Tengwang Pavilion in Nanchang, Yellow Crane Tower in Wuhan, Taikoo Li in Sanlitun, Beijing), indoor objects such as tables and cabinets, or natural scenery such as California redwood trees.
  • in step S10, adding a plurality of video timestamps to the video frames of the video to be processed includes: when the video to be processed is collected by the video capture device, acquiring the system clock of the display device in real time, so as to obtain the multiple video timestamps respectively corresponding to the multiple video frames of the video to be processed.
  • the embodiments of the present disclosure do not limit the generation manner and precision of the system clock.
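For illustration, stamping each captured frame with the display device's system clock (step S10) might look like the following sketch; the use of `time.monotonic` and the frame list are assumptions for this example, not part of the disclosure:

```python
import time

def stamp_frames(frames):
    """Step S10 (sketch): attach a system-clock timestamp to each frame.
    Using the same clock source later for pose samples keeps the two
    timestamp streams directly comparable."""
    return [(time.monotonic(), frame) for frame in frames]

stamped = stamp_frames(["frame0", "frame1", "frame2"])
# Timestamps are non-decreasing because they come from a monotonic clock.
assert all(a[0] <= b[0] for a, b in zip(stamped, stamped[1:]))
```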
  • N pieces of pose data are in one-to-one correspondence with N pieces of pose timestamps, and N is a positive integer.
  • the number N of cached pose data is, for example, 100 to 200, that is, N is 100 to 200, e.g., N is 100, 150, 180, or 200; however, the present disclosure is not limited thereto, and the value of N can be set according to the actual situation.
  • the display device may further include a pose collecting device, which is used for collecting pose data of the display device.
  • the pose collecting device may be integrated with the display device, or the pose collecting device may be separate from the display device, and communicate with the display device in a wireless (eg, Bluetooth, etc.) or wired manner.
  • the pose data of the display device may represent the pose data of the video capture device in the display device.
  • acquiring the N pieces of pose data includes: when using the video collection device to collect the video to be processed, using the pose collection device to collect the pose data of the video collection device to obtain N pieces of pose data.
  • the rate of acquiring pose data may be 80 pieces/second, that is, 80 pieces of pose data are acquired per second.
  • the present disclosure is not limited thereto; the rate of acquiring pose data may also be 90 pieces per second, 100 pieces per second, etc., and can be set according to the actual situation.
  • the collection of pose data and the collection of the video to be processed are performed at the same time, so as to ensure that the pose data and video frames collected at the same moment correspond to each other, thereby ensuring that the AR special effects adjusted based on the pose data are aligned with the images displayed on the screen.
  • adding N pose timestamps to the N pose data respectively includes: when the N pose data are collected by the pose collection device, acquiring the system clock of the display device in real time, so as to obtain the N pose timestamps respectively corresponding to the N pose data.
  • the video timestamp is the system clock of the display device, and the pose timestamp is also the system clock of the display device, so that both are obtained from the same clock source; this ensures that video timestamps and pose timestamps can be compared with each other.
  • each of the N pieces of pose data includes information such as the position and angle of the video capture device
  • the position may represent the GPS coordinates (eg, longitude, latitude, altitude) corresponding to the video capture device
  • the angle may represent the relative angular relationship between the video capture device and the target object.
  • the N pose data may be different from each other, or at least part of the pose data may be the same.
  • the pose collection device may continuously acquire pose data of the video capture device together with the corresponding pose timestamps, but the display device buffers only N pose data and N pose timestamps. For example, when the pose collection device has collected the first N pose data and the first N pose timestamps, the display device caches those N pose data and N pose timestamps; when the pose collection device collects the (N+1)-th pose data and the (N+1)-th pose timestamp, the display device caches them and deletes the first pose data and the first pose timestamp; when the (N+2)-th pose data and pose timestamp are collected, the display device caches them and deletes the second pose data and the second pose timestamp, and so on, so that exactly N pose data and N pose timestamps are cached at all times.
  • the display device may also cache all the pose data and the pose timestamps collected by the pose collection device.
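The fixed-size cache described above behaves like a ring buffer: each new entry beyond N evicts the oldest. A minimal sketch (hypothetical names, with N shrunk to 3 for readability):

```python
from collections import deque

N = 3  # keep only the N most recent (pose_timestamp, pose_data) pairs
cache = deque(maxlen=N)  # a deque with maxlen evicts the oldest entry itself

for i in range(1, 6):  # pose samples 1..5 arrive over time
    cache.append((f"ts{i}", f"pose{i}"))

# After sample N+1 arrived, sample 1 was dropped; after N+2, sample 2, etc.
print(list(cache))  # [('ts3', 'pose3'), ('ts4', 'pose4'), ('ts5', 'pose5')]
```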
  • the video frame to be displayed may be any frame of the video to be processed, and the video frame to be displayed may be the video frame corresponding to the current shooting time when the video to be processed is shot.
  • in step S12, determining at least one pose data corresponding to the video frame to be displayed from the N pose data according to the video timestamp corresponding to the video frame to be displayed includes: searching, on the pose time axis composed of the N pose timestamps, for a reference pose timestamp corresponding to the video timestamp of the video frame to be displayed; and using the pose data corresponding to the reference pose timestamp as the at least one pose data.
  • in this case, the at least one pose data is a single piece of pose data.
  • the absolute value of the time difference between the reference pose timestamp and the video timestamp is the smallest; that is, it is smaller than the absolute value of the time difference between the video timestamp and any other of the N pose timestamps.
  • for example, if the video timestamp is 1 minute 5.4 seconds and the N pose timestamps are 1 minute 5.1 seconds, 1 minute 5.3 seconds, 1 minute 5.7 seconds, 1 minute 5.9 seconds, 1 minute 6.1 seconds, etc., then the absolute value of the time difference between the pose timestamp of 1 minute 5.3 seconds and the video timestamp is the smallest; that is, 1 minute 5.3 seconds is the reference pose timestamp.
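The nearest-timestamp lookup in this example can be written directly as a minimum over absolute time differences. A sketch with hypothetical names, with times expressed in seconds (so 1 minute 5.4 seconds is 65.4):

```python
def reference_timestamp(video_ts, pose_timestamps):
    """Return the pose timestamp with the smallest absolute difference
    from the video timestamp (the 'reference pose timestamp')."""
    return min(pose_timestamps, key=lambda t: abs(t - video_ts))

# The example from the text: video timestamp 65.4 s, pose timestamps as listed.
pose_ts = [65.1, 65.3, 65.7, 65.9, 66.1]
print(reference_timestamp(65.4, pose_ts))  # 65.3
```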
  • alternatively, in step S12, determining at least one pose data corresponding to the video frame to be displayed from the N pose data according to the video timestamp corresponding to the video frame to be displayed includes: searching, on the pose time axis composed of the N pose timestamps, for a first reference pose timestamp and a second reference pose timestamp corresponding to the video timestamp of the video frame to be displayed; and using the pose data corresponding to the first reference pose timestamp and the pose data corresponding to the second reference pose timestamp as the at least one pose data.
  • the absolute value of the time difference between the first reference pose timestamp and the video timestamp and the absolute value of the time difference between the second reference pose timestamp and the video timestamp are the two smallest such absolute values; that is, they are smaller than the absolute values of the time differences between the video timestamp and the remaining pose timestamps.
  • the absolute value of the time difference between the first reference pose time stamp and the video time stamp and the absolute value of the time difference between the second reference pose time stamp and the video time stamp may be the same.
  • for example, if the video timestamp is 1 minute 5.4 seconds and the N pose timestamps are 1 minute 5.1 seconds, 1 minute 5.3 seconds, 1 minute 5.5 seconds, 1 minute 5.7 seconds, 1 minute 5.8 seconds, etc., then the pose timestamps of 1 minute 5.3 seconds and 1 minute 5.5 seconds have the two smallest absolute time differences from the video timestamp; that is, they are the first reference pose timestamp and the second reference pose timestamp. In this case, the absolute value of the time difference between the first reference pose timestamp and the video timestamp is the same as that between the second reference pose timestamp and the video timestamp, both being 0.1 seconds.
  • the absolute value of the time difference between the first reference pose time stamp and the video time stamp and the absolute value of the time difference between the second reference pose time stamp and the video time stamp may also be different.
  • as another example, if the video timestamp is 1 minute 5.4 seconds and the N pose timestamps are 1 minute 5.1 seconds, 1 minute 5.3 seconds, 1 minute 5.6 seconds, 1 minute 5.8 seconds, 1 minute 6.1 seconds, etc., then the pose timestamps of 1 minute 5.3 seconds and 1 minute 5.6 seconds have the two smallest absolute time differences from the video timestamp; that is, they are the first reference pose timestamp and the second reference pose timestamp. In this case, the absolute value of the time difference between the first reference pose timestamp and the video timestamp (0.1 seconds) differs from that between the second reference pose timestamp and the video timestamp (0.2 seconds).
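Selecting the two closest pose timestamps (the first and second reference pose timestamps) is the same idea with the two smallest differences kept. A sketch under the same hypothetical naming:

```python
def two_reference_timestamps(video_ts, pose_timestamps):
    """Return, in time order, the two pose timestamps closest to the
    video timestamp (first and second reference pose timestamps)."""
    by_distance = sorted(pose_timestamps, key=lambda t: abs(t - video_ts))
    return sorted(by_distance[:2])

# The asymmetric example from the text: differences of 0.1 s and 0.2 s.
print(two_reference_timestamps(65.4, [65.1, 65.3, 65.6, 65.8, 66.1]))  # [65.3, 65.6]
```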
  • the length of the pose time axis may be 1 second, that is, the length of time corresponding to the N pose data and the N pose timestamps buffered by the display device is 1 second.
  • the buffered N pose data and N pose timestamps may be N pose data and N pose timestamps that are closest to the current moment.
  • for example, if the current time is 10:10:20, the buffered N pose data and N pose timestamps may be the pose data and pose timestamps collected between 10:10:19 and 10:10:20.
  • step S13 may include: in response to the quantity of the at least one pose data corresponding to the video frame to be displayed being 1, using that pose data as the target pose data, and adjusting the pose of the virtual model based on the target pose data to obtain the virtual model to be displayed; or, in response to the quantity of the at least one pose data corresponding to the video frame to be displayed being 2, performing interpolation processing on the two pose data to obtain the target pose data corresponding to the video frame to be displayed, and adjusting the pose of the virtual model based on the target pose data to obtain the virtual model to be displayed.
  • when the at least one pose data is a single piece of pose data, it is directly used as the target pose data, and the pose of the virtual model is then adjusted based on the target pose data to obtain the virtual model to be displayed.
  • performing interpolation processing (e.g., linear interpolation) on the two pose data includes interpolating the positions in the pose data and interpolating the angles in the pose data.
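As a simplification of the interpolation step (the disclosure names linear interpolation but gives no formula), position and angle can each be linearly interpolated toward the frame's timestamp. A real system would likely interpolate rotations with quaternion slerp, so the scalar angle here is purely illustrative, and all names are hypothetical:

```python
def interpolate_pose(pose_a, pose_b, ts_a, ts_b, video_ts):
    """Sketch of step S13's interpolation: estimate the pose at the video
    frame's timestamp from the two bracketing pose samples."""
    w = (video_ts - ts_a) / (ts_b - ts_a)  # interpolation weight in [0, 1]
    pos = tuple(a + w * (b - a) for a, b in zip(pose_a["pos"], pose_b["pos"]))
    ang = pose_a["angle"] + w * (pose_b["angle"] - pose_a["angle"])
    return {"pos": pos, "angle": ang}

a = {"pos": (0.0, 0.0, 0.0), "angle": 10.0}
b = {"pos": (1.0, 2.0, 0.0), "angle": 20.0}
# Frame timestamp exactly midway between the two pose timestamps:
target = interpolate_pose(a, b, ts_a=0.0, ts_b=0.2, video_ts=0.1)
print(target)  # {'pos': (0.5, 1.0, 0.0), 'angle': 15.0}
```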
  • adjusting the pose of the virtual model based on the target pose data includes: calculating, based on the target pose data, the pose of the virtual model on the display screen of the display device, and then projecting the data in the three-dimensional reconstruction data set corresponding to the virtual model onto the display screen according to that pose, so as to obtain the virtual model in that pose.
  • the virtual model is an augmented reality special effect model and the like.
  • the virtual model may include virtual special effects such as text, images, three-dimensional models, music, and videos.
  • the virtual model can be a pre-modeled model.
  • step S14 includes: displaying the video frame to be displayed; and superimposing the virtual model to be displayed on the video frame to be displayed for display.
  • the video frame to be displayed and the virtual model to be displayed are displayed on the display device at the same time, and the moments to which they correspond are the same, so that the AR special effect can be aligned with the image displayed on the screen, which in turn provides users with a better visual experience of the AR special effects.
  • FIG. 2 is a schematic block diagram of a display device provided by at least one embodiment of the present disclosure.
  • the display device 20 includes a processor 200 and a memory 210 . It should be noted that the components of the display device 20 shown in FIG. 2 are only exemplary and not restrictive, and the display device 20 may also have other components according to actual application requirements.
  • processor 200 and the memory 210 may communicate with each other directly or indirectly.
  • the processor 200 and the memory 210 may communicate over a network.
  • the network may include a wireless network, a wired network, and/or any combination of wireless and wired networks.
  • the processor 200 and the memory 210 can also communicate with each other through a system bus, which is not limited in the present disclosure.
  • memory 210 is used for non-transitory storage of computer readable instructions.
  • the processor 200 is configured to execute the computer-readable instructions; when the computer-readable instructions are executed by the processor 200, the video processing method according to any of the foregoing embodiments is implemented.
  • for the specific implementation of each step of the video processing method and the related explanations, reference may be made to the above embodiments of the video processing method; repeated descriptions are omitted here.
  • the processor 200 and the memory 210 may be provided on the server side (or the cloud).
  • the processor 200 may control other components in the display device 20 to perform desired functions.
  • the processor 200 may be a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another programmable logic device, discrete gate or transistor logic device, or discrete hardware component.
  • the central processing unit (CPU) can be an X86 or an ARM architecture or the like.
  • memory 210 may include any combination of one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • Volatile memory may include, for example, random access memory (RAM) and/or cache memory, among others.
  • Non-volatile memory may include, for example, read only memory (ROM), hard disk, erasable programmable read only memory (EPROM), portable compact disk read only memory (CD-ROM), USB memory, flash memory, and the like.
  • the display device 20 further includes a video capture device and a pose capture device.
  • the video capture device is configured to capture the video of the target object to obtain the video to be processed;
  • the pose capture device is configured to capture pose data of the video capture device to obtain N pose data.
  • the video capture device may include a camera, a video camera, or another device capable of capturing video and/or images.
  • the pose acquisition device includes a gyroscope, an acceleration sensor, or a satellite positioning device.
  • the pose collection device can also realize the function of collecting pose data through ARKit software, ARcore software, etc., and the pose collection device can also realize the function of collecting pose data through SLAM technology.
  • the display device 20 may be a mobile terminal, such as a mobile phone, a tablet computer, etc., and both the pose collecting device and the video collecting device are provided on the mobile terminal.
  • the pose collection device may be a device internal to the mobile terminal, such as a built-in gyroscope, and the video capture device may be a camera of the mobile device (for example, an under-screen camera).
  • the present disclosure is not limited thereto, and the video capture device can also be set outside the mobile terminal.
  • the video capture device can remotely capture video and transmit it to the mobile terminal through a network for subsequent processing by the mobile terminal.
  • the video collection device and the pose collection device need to be arranged together, so that the pose collection device can collect the pose data of the video collection device.
  • the display device 20 may further include a display panel for displaying the video frame to be displayed and the virtual model to be displayed.
  • the display panel may be a rectangular panel, a circular panel, an oval panel, a polygonal panel, or the like.
  • the display panel may be not only a flat panel, but also a curved panel, or even a spherical panel.
  • the display device 20 may have a touch function, that is, the display device 20 may be a touch display device.
  • FIG. 3 is a schematic diagram of a non-transitory computer-readable storage medium provided by at least one embodiment of the present disclosure.
  • one or more computer-readable instructions 310 may be non-transitory stored on storage medium 300 .
  • when executed by a processor, the computer-readable instructions 310 may perform one or more steps of the video processing method described above.
  • the storage medium 300 can be applied to the above-mentioned display device 20 .
  • the storage medium 300 may include the memory 210 in the display apparatus 20 .
  • for a description of the storage medium 300, reference may be made to the description of the memory 210 in the embodiments of the display device 20; repeated details are not described again here.
  • FIG. 4 shows a schematic structural diagram of an electronic device (eg, the electronic device may include the display device described in the above embodiments) 600 suitable for implementing an embodiment of the present disclosure.
  • Electronic devices in the embodiments of the present disclosure may include, but are not limited to, such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players), vehicle-mounted terminals (eg, mobile terminals such as in-vehicle navigation terminals), etc., and stationary terminals such as digital TVs, desktop computers, and the like.
  • the electronic device shown in FIG. 4 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • the electronic device 600 may include a processing device (e.g., a central processing unit or graphics processor) 601, which may execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 606 into a random-access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the electronic device 600.
  • the processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input/output (I/O) interface 605 is also connected to bus 604 .
  • the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 607 including, for example, a liquid crystal display (LCD), speakers, and vibrators; storage devices 606 including, for example, magnetic tapes and hard disks; and a communication device 609.
  • Communication means 609 may allow electronic device 600 to communicate wirelessly or by wire with other devices to exchange data. While FIG. 4 shows electronic device 600 having various means, it should be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 609, or from the storage device 606, or from the ROM 602.
  • when the computer program is executed by the processing device 601, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
  • a computer-readable medium may be a tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the computer-readable medium can be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or any combination of the above.
  • Computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random-access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon.
  • Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, electrical wire, optical fiber cable, RF (radio frequency), etc., or any suitable combination of the foregoing.
  • the client and server can communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (e.g., a communication network).
  • Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • each block in the flowchart or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by dedicated hardware-based systems that perform the specified functions or operations, or by combinations of dedicated hardware and computer instructions.
  • the units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
  • exemplary types of hardware logic components include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logical Devices (CPLDs) and more.
  • a video processing method is used for a display device and includes: acquiring a video to be processed, and marking a plurality of video frames of the video to be processed with a plurality of video timestamps, where the plurality of video frames correspond to the plurality of video timestamps one-to-one; acquiring N pose data of the display device, marking the N pose data with N pose timestamps, and caching the N pose data and the N pose timestamps, where the N pose data correspond to the N pose timestamps one-to-one and N is a positive integer; extracting a video frame to be displayed from the plurality of video frames, and determining, from the N pose data and according to the video timestamp corresponding to the video frame to be displayed, at least one pose data corresponding to the video frame to be displayed; adjusting the pose of a virtual model based on the at least one pose data corresponding to the video frame to be displayed, to obtain a virtual model to be displayed; and simultaneously displaying, by the display device, the video frame to be displayed and the virtual model to be displayed.
  • the video to be processed includes a target object, the target object includes a landmark building, and the display device includes a video capture device; acquiring the video to be processed includes: capturing a video of the target object with the video capture device to obtain the video to be processed; marking the multiple video frames of the video to be processed with multiple video timestamps includes: reading the system clock of the display device in real time while the video capture device captures the video to be processed, so as to obtain the multiple video timestamps corresponding respectively to the multiple video frames.
  • the display device includes a pose collection device
  • acquiring N pieces of pose data includes: when using the video capture device to collect the video to be processed, using the pose collection device to collect the pose of the video capture device to obtain N pose data; marking N pose time stamps on the N pose data respectively includes: when using the pose acquisition device to collect the N pose data, acquiring the system clock of the display device in real time to obtain N pose timestamps corresponding to N pose data respectively.
  • each of the N pose data includes a position and an angle of the video capture device.
  • the number of the N pose data is 100 to 200.
  • determining, from the N pose data and according to the video timestamp corresponding to the video frame to be displayed, at least one pose data corresponding to the video frame to be displayed includes: searching, on a pose time axis formed by the N pose timestamps, for a reference pose timestamp corresponding to the video timestamp of the video frame to be displayed, where the absolute value of the time difference between the reference pose timestamp and the video timestamp is the smallest, and taking the pose data corresponding to the reference pose timestamp as the at least one pose data; or searching, on the pose time axis formed by the N pose timestamps, for a first reference pose timestamp and a second reference pose timestamp corresponding to the video timestamp of the video frame to be displayed, where the absolute values of the time differences between the first reference pose timestamp and the video timestamp and between the second reference pose timestamp and the video timestamp are the two smallest such absolute values, and taking the pose data corresponding to the first reference pose timestamp and the pose data corresponding to the second reference pose timestamp as the at least one pose data.
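As an illustrative sketch only (not part of the claimed method; the function and variable names below are hypothetical), the timestamp lookup described above can be implemented as a binary search over the sorted, cached pose time axis:

```python
import bisect

def find_reference_poses(pose_timestamps, video_ts, k=1):
    """Return the indices of the k cached pose timestamps closest to video_ts.

    pose_timestamps must be sorted ascending (the cached pose time axis).
    k=1 yields the single reference pose timestamp; k=2 yields the first and
    second reference pose timestamps used for interpolation.
    """
    i = bisect.bisect_left(pose_timestamps, video_ts)
    # Only the entries around the insertion point can be among the two closest.
    candidates = [j for j in (i - 2, i - 1, i, i + 1)
                  if 0 <= j < len(pose_timestamps)]
    candidates.sort(key=lambda j: abs(pose_timestamps[j] - video_ts))
    return sorted(candidates[:k])

# Pose timestamps at 65.1 s, 65.3 s, 65.7 s, 65.9 s, 66.1 s and a video frame
# at 65.4 s: the nearest pose timestamp is 65.3 s (index 1).
poses = [65.1, 65.3, 65.7, 65.9, 66.1]
nearest = find_reference_poses(poses, 65.4, k=1)   # [1]
pair = find_reference_poses(poses, 65.4, k=2)      # [1, 2]
```

The binary search keeps each lookup at O(log N) over the cached time axis, which matters little for N of 100 to 200 but keeps the per-frame cost negligible.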
  • adjusting the pose of the virtual model based on the at least one pose data corresponding to the video frame to be displayed, to obtain the virtual model to be displayed, includes: in response to the number of the at least one pose data corresponding to the video frame to be displayed being 1, taking the at least one pose data corresponding to the video frame to be displayed as target pose data, and adjusting the pose of the virtual model based on the target pose data to obtain the virtual model to be displayed; or, in response to the number of the at least one pose data corresponding to the video frame to be displayed being 2, interpolating the at least one pose data corresponding to the video frame to be displayed to obtain target pose data corresponding to the video frame to be displayed, and adjusting the pose of the virtual model based on the target pose data to obtain the virtual model to be displayed.
  • simultaneously displaying the video frame to be displayed and the virtual model to be displayed by the display device includes: displaying the video frame to be displayed; and superimposing the virtual model to be displayed on the video frame to be displayed for display.
  • the virtual model is an augmented reality special effect model.
  • a display device includes: a memory for non-transitory storage of computer-readable instructions; and a processor for executing the computer-readable instructions, where the computer-readable instructions, when executed by the processor, implement the video processing method according to any embodiment of the present disclosure.
  • the display device further includes: a video capture device and a pose capture device, the video capture device is configured to capture the video of the target object to obtain the video to be processed; the pose capture device is configured to capture pose data of the video acquisition device to obtain N pose data.
  • the video capture device includes a camera
  • the pose capture device includes a gyroscope, an acceleration sensor, or a satellite positioning device.
  • the display device is a mobile terminal, and both the pose collecting device and the video collecting device are provided on the mobile terminal.
  • a non-transitory computer-readable storage medium stores computer-readable instructions that, when executed by a processor, implement the video processing method described in any of the embodiments of the present disclosure.


Abstract

A video processing method, a display device, and a storage medium are provided. The method includes: acquiring a video to be processed, and marking a plurality of video frames of the video to be processed with a plurality of video timestamps; acquiring N pose data of the display device, marking the N pose data with N pose timestamps, and caching the N pose data and the N pose timestamps; extracting a video frame to be displayed from the plurality of video frames, and determining, from the N pose data and according to the video timestamp corresponding to the video frame to be displayed, at least one pose data corresponding to the video frame to be displayed; adjusting the pose of a virtual model based on the at least one pose data corresponding to the video frame to be displayed, to obtain a virtual model to be displayed; and simultaneously displaying, by the display device, the video frame to be displayed and the virtual model to be displayed.

Description

Video Processing Method, Display Device, and Storage Medium
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 202011011403.8, filed on September 23, 2020 and entitled "Video Processing Method, Display Device and Storage Medium", the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present disclosure relate to a video processing method, a display device, and a storage medium.
Background
Short videos are strongly social, easy to create, and short in duration, which suits the fragmented content-consumption habits of users in the mobile-Internet era. Augmented reality (AR) is a technology that seamlessly fuses virtual information with the real world, drawing broadly on multimedia, three-dimensional modeling, real-time tracking and registration, intelligent interaction, and sensing: computer-generated text, images, three-dimensional models, music, video, and other virtual information is simulated and then applied to the real world, where real-world information and virtual information complement each other, thereby "augmenting" the real world. AR's distinctive blending of the virtual and the real gives it essentially unlimited room for expansion in the short-video field.
Summary
This summary is provided to introduce, in simplified form, concepts that are described in detail in the detailed description that follows. This summary is not intended to identify key or essential features of the claimed technical solutions, nor is it intended to limit the scope of the claimed technical solutions.
At least one embodiment of the present disclosure provides a video processing method for a display device, including: acquiring a video to be processed, and marking a plurality of video frames of the video to be processed with a plurality of video timestamps, where the plurality of video frames correspond to the plurality of video timestamps one-to-one; acquiring N pose data of the display device, marking the N pose data with N pose timestamps, and caching the N pose data and the N pose timestamps, where the N pose data correspond to the N pose timestamps one-to-one and N is a positive integer; extracting a video frame to be displayed from the plurality of video frames, and determining, from the N pose data and according to the video timestamp corresponding to the video frame to be displayed, at least one pose data corresponding to the video frame to be displayed; adjusting the pose of a virtual model based on the at least one pose data corresponding to the video frame to be displayed, to obtain a virtual model to be displayed; and simultaneously displaying, by the display device, the video frame to be displayed and the virtual model to be displayed.
At least one embodiment of the present disclosure provides a display device, including: a memory for non-transitory storage of computer-readable instructions; and a processor for executing the computer-readable instructions, where the computer-readable instructions, when executed by the processor, implement the video processing method according to any embodiment of the present disclosure.
At least one embodiment of the present disclosure provides a non-transitory computer-readable storage medium storing computer-readable instructions that, when executed by a processor, implement the video processing method according to any embodiment of the present disclosure.
Brief Description of the Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale.
FIG. 1 is a schematic flowchart of a video processing method provided by at least one embodiment of the present disclosure;
FIG. 2 is a schematic block diagram of a display device provided by at least one embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a non-transitory computer-readable storage medium provided by at least one embodiment of the present disclosure; and
FIG. 4 is a schematic structural diagram of an electronic device provided by at least one embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of protection of the present disclosure.
It should be understood that the steps recited in the method embodiments of the present disclosure may be performed in different orders and/or in parallel. In addition, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
As used herein, the term "include" and variations thereof are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms will be given in the description below.
It should be noted that concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different devices, modules, or units, and are not used to limit the order of, or interdependence between, the functions performed by these devices, modules, or units.
It should be noted that the modifiers "a/one" and "a plurality of" mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be understood as "one or more".
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of these messages or information.
At present, after a landmark AR effect is triggered in an electronic device (e.g., a mobile phone), the AR effect can follow the movement or rotation of the electronic device and correspondingly move or rotate on the screen in real time (in practice, with a small, negligible delay; the AR effect may even move off-screen); that is, the motion of the AR effect is consistent with the motion of the electronic device. However, every video frame shown on the screen is delayed, and the delay of the video frame is longer than the delay of the AR effect. In other words, the currently displayed picture was captured by the camera at a camera pose from some earlier moment, while the currently displayed AR effect is positioned according to the camera pose at the current moment. As a result, the AR effect cannot be aligned with the picture shown on the screen, which degrades the landmark AR effect.
At least one embodiment of the present disclosure provides a video processing method, a display device, and a non-transitory computer-readable storage medium. The video processing method is used for a display device and includes: acquiring a video to be processed, and marking a plurality of video frames of the video to be processed with a plurality of video timestamps, where the plurality of video frames correspond to the plurality of video timestamps one-to-one; acquiring N pose data of the display device, marking the N pose data with N pose timestamps, and caching the N pose data and the N pose timestamps, where the N pose data correspond to the N pose timestamps one-to-one and N is a positive integer; extracting a video frame to be displayed from the plurality of video frames, and determining, from the N pose data and according to the video timestamp corresponding to the video frame to be displayed, at least one pose data corresponding to the video frame to be displayed; adjusting the pose of a virtual model based on the at least one pose data corresponding to the video frame to be displayed, to obtain a virtual model to be displayed; and simultaneously displaying, by the display device, the video frame to be displayed and the virtual model to be displayed.
This video processing method synchronizes video frames with camera pose data: captured video frames are timestamped, the pose data of the display device are timestamped, and the two streams are matched based on the timestamps. The AR effect can thus be aligned with the image shown on the screen, so that when the display device renders a landmark AR effect, the effect is aligned more accurately with the on-screen image (e.g., a landmark building), providing the user with a better AR visual experience.
It should be noted that the video processing method provided by the embodiments of the present disclosure may be deployed on the display device provided by the embodiments of the present disclosure; for example, the method may be deployed in an application on the display device. The display device may be a personal computer, a mobile terminal, or the like; the mobile terminal may be a hardware device with any of various operating systems, such as a mobile phone or a tablet computer. The application may be, for example, Douyin (TikTok).
Embodiments of the present disclosure are described in detail below with reference to the accompanying drawings, but the present disclosure is not limited to these specific embodiments.
FIG. 1 is a schematic flowchart of a video processing method provided by at least one embodiment of the present disclosure.
For example, the video processing method may be applied to a display device. As shown in FIG. 1, the video processing method includes steps S10 to S14.
Step S10: acquire a video to be processed, and mark a plurality of video frames of the video to be processed with a plurality of video timestamps;
Step S11: acquire N pose data of the display device, mark the N pose data with N pose timestamps, and cache the N pose data and the N pose timestamps;
Step S12: extract a video frame to be displayed from the plurality of video frames, and determine, from the N pose data and according to the video timestamp corresponding to the video frame to be displayed, at least one pose data corresponding to the video frame to be displayed;
Step S13: adjust the pose of a virtual model based on the at least one pose data corresponding to the video frame to be displayed, to obtain a virtual model to be displayed;
Step S14: simultaneously display, by the display device, the video frame to be displayed and the virtual model to be displayed.
For example, in step S10, the plurality of video frames correspond to the plurality of video timestamps one-to-one.
For example, the display device may include a video capture device for shooting images and/or video. The video capture device may include a camera, a video camera, or the like. The video capture device may be integrated with the display device, or it may be separate from the display device and communicatively connected to it wirelessly (e.g., via Bluetooth) or by wire.
For example, in step S10, acquiring the video to be processed includes: capturing a video of a target object with the video capture device to obtain the video to be processed. For example, in some embodiments, the video may be acquired at 30 frames per second, i.e., 30 video frames are obtained every second; the present disclosure is not limited to this, and the acquisition rate may be set according to the actual situation, e.g., 60 frames per second.
It should be noted that if the display device has an anti-shake function, then when the anti-shake function of the display device (e.g., a mobile phone) is enabled, the captured video frames are smoothed; smoothed video frames cannot be aligned on the time axis with the pose data of the video capture device (e.g., the camera) obtained from the display device's sensors or a SLAM (simultaneous localization and mapping) system. Therefore, when the video capture device is used to capture video, the anti-shake function needs to be disabled so that the captured video frames and the pose data of the video capture device can be aligned on the time axis.
For example, the video to be processed may be a video captured in real time by the video capture device, or a video captured in advance and stored in the display device. The video to be processed may include a target object; the target object may be an outdoor object such as a landmark building (e.g., the Yueyang Tower in Yueyang, the Tengwang Pavilion in Nanchang, the Yellow Crane Tower in Wuhan, or Taikoo Li in Sanlitun, Beijing), an indoor object such as a table or a cabinet, or a natural scene such as the California redwoods.
For example, in step S10, marking the plurality of video frames of the video to be processed with the plurality of video timestamps includes: reading the system clock of the display device in real time while the video capture device captures the video to be processed, so as to obtain the plurality of video timestamps corresponding respectively to the plurality of video frames. Embodiments of the present disclosure do not limit how the system clock is generated or its precision.
For example, in step S11, the N pose data correspond to the N pose timestamps one-to-one, and N is a positive integer. In some embodiments, the number of pose data is 100 to 200, i.e., N is 100 to 200; for example, N is 100, 150, 180, or 200. The present disclosure is not limited to this, and the value of N may be set according to the actual situation.
For example, in some embodiments, the display device may further include a pose capture device for collecting the pose data of the display device. The pose capture device may be integrated with the display device, or it may be separate from the display device and communicatively connected to it wirelessly (e.g., via Bluetooth) or by wire. It should be noted that in the embodiments of the present disclosure, "pose data of the display device" may mean the pose data of the video capture device in the display device.
For example, in step S11, acquiring the N pose data includes: while the video capture device captures the video to be processed, collecting the pose data of the video capture device with the pose capture device to obtain the N pose data. For example, in some embodiments, pose data may be acquired at a rate of 80 samples per second, i.e., 80 pose data are obtained every second; the present disclosure is not limited to this, and the rate may also be 90 per second, 100 per second, etc., set according to the actual situation.
For example, pose data collection and video capture are performed simultaneously, which guarantees that pose data and video frames collected at the same moment correspond to each other, and hence that the AR effect adjusted based on the pose data is aligned with the image shown on the screen.
For example, marking the N pose data with N pose timestamps includes: reading the system clock of the display device in real time while the pose capture device collects the N pose data, so as to obtain the N pose timestamps corresponding respectively to the N pose data.
For example, the video timestamps are taken from the system clock of the display device, and the pose timestamps are also taken from the system clock of the display device, so that both sets of timestamps are derived from the same clock; this guarantees that the video timestamps and the pose timestamps can correspond to each other.
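The shared-clock requirement described above can be illustrated with a minimal sketch (illustrative only; the class name and API are assumptions, not from the disclosure) that stamps both frame and pose samples from one monotonic clock:

```python
import time

class SystemClockStamper:
    """Stamp video frames and pose samples from one shared clock so that the
    two timestamp streams are directly comparable on the same time axis."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # a single clock source for both streams

    def stamp(self, item):
        # Pair the sample with the current reading of the shared clock.
        return (self._clock(), item)

stamper = SystemClockStamper()
frame_ts, _ = stamper.stamp("frame-0")  # video timestamp
pose_ts, _ = stamper.stamp("pose-0")    # pose timestamp, same clock
```

Using a monotonic clock rather than wall-clock time avoids jumps from NTP adjustments, which would otherwise break the nearest-timestamp matching between the two streams.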
For example, each of the N pose data includes information such as the position and angle of the video capture device. The position may be the GPS coordinates of the video capture device (e.g., longitude, latitude, and altitude), and the angle may be the relative angular relationship between the video capture device and the target object.
For example, the N pose data may all differ from one another, or at least some of them may be identical.
For example, in some embodiments, in step S11, during video capture the pose capture device may continuously obtain the pose data of the video capture device together with the corresponding pose timestamps, but the display device caches only N pose data and N pose timestamps. For example, when the pose capture device has collected the first N pose data and the first N pose timestamps, the display device caches those N pose data and N pose timestamps; when the pose capture device collects the (N+1)-th pose data and the (N+1)-th pose timestamp, the display device caches them and deletes the first pose data and the first pose timestamp; when the pose capture device collects the (N+2)-th pose data and the (N+2)-th pose timestamp, the display device caches them and deletes the second pose data and the second pose timestamp; and so on. In this way, the display device always caches only N pose data and N pose timestamps, which saves storage space.
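The fixed-size caching behaviour described above (keep the newest N samples, discard the oldest) can be sketched with a bounded double-ended queue. This is an illustrative sketch with hypothetical names; N is shrunk to 5 to keep the demonstration short, whereas the disclosure suggests N on the order of 100 to 200:

```python
from collections import deque

N = 5  # illustrative; the disclosure suggests roughly 100 to 200

# Each entry pairs a pose timestamp with its pose data. maxlen=N makes the
# cache drop the oldest entry automatically once N entries are buffered.
pose_cache = deque(maxlen=N)

for i in range(8):  # simulate 8 pose samples arriving in order
    pose_cache.append((float(i), {"position": (i, 0, 0), "angle": 2.0 * i}))

oldest_ts = pose_cache[0][0]   # 3.0: samples 0, 1, 2 were evicted
newest_ts = pose_cache[-1][0]  # 7.0
```

Because eviction is handled by the container itself, the producer thread only ever appends; no explicit bookkeeping of the oldest entry is needed.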
It should be noted that in other embodiments, in step S11, the display device may also cache all the pose data and pose timestamps collected by the pose capture device.
For example, in step S12, the video frame to be displayed may be any frame of the video to be processed; it may be the video frame corresponding to the current shooting time while the video to be processed is being captured.
For example, in some embodiments, in step S12, determining, from the N pose data and according to the video timestamp corresponding to the video frame to be displayed, the at least one pose data corresponding to the video frame to be displayed includes: searching, according to that video timestamp, on the pose time axis formed by the N pose timestamps for a reference pose timestamp corresponding to the video timestamp of the video frame to be displayed; and taking the pose data corresponding to the reference pose timestamp as the at least one pose data.
For example, in some embodiments, the at least one pose data is a single pose data. The absolute value of the time difference between the reference pose timestamp and the video timestamp is the smallest; that is, it is smaller than the absolute value of the time difference between the video timestamp and any of the N pose timestamps other than the reference pose timestamp. For example, in some embodiments, the video timestamp is 1 min 5.4 s; if the pose timestamps are 1 min 5.1 s, 1 min 5.3 s, 1 min 5.7 s, 1 min 5.9 s, 1 min 6.1 s, and so on, then the pose timestamp 1 min 5.3 s has the smallest absolute time difference from the video timestamp, i.e., 1 min 5.3 s is the reference pose timestamp.
For example, in other embodiments, in step S12, determining the at least one pose data corresponding to the video frame to be displayed includes: searching, according to the video timestamp corresponding to the video frame to be displayed, on the pose time axis formed by the N pose timestamps for a first reference pose timestamp and a second reference pose timestamp corresponding to that video timestamp; and taking the pose data corresponding to the first reference pose timestamp and the pose data corresponding to the second reference pose timestamp as the at least one pose data.
For example, the absolute values of the time differences between the first reference pose timestamp and the video timestamp and between the second reference pose timestamp and the video timestamp are the two smallest such absolute values; that is, they are smaller than the absolute value of the time difference between the video timestamp and any of the N pose timestamps other than the first and second reference pose timestamps.
For example, the two absolute time differences may be equal. For example, in some embodiments, the video timestamp is 1 min 5.4 s; if the pose timestamps are 1 min 5.1 s, 1 min 5.3 s, 1 min 5.5 s, 1 min 5.7 s, 1 min 5.8 s, and so on, then the absolute time differences of the pose timestamps 1 min 5.3 s and 1 min 5.5 s from the video timestamp are the two smallest, i.e., 1 min 5.3 s and 1 min 5.5 s are the first and second reference pose timestamps, respectively; in this case the two absolute time differences are equal, both being 0.1 s.
For example, the two absolute time differences may also differ. For example, in some embodiments, the video timestamp is 1 min 5.4 s; if the pose timestamps are 1 min 5.1 s, 1 min 5.3 s, 1 min 5.6 s, 1 min 5.8 s, 1 min 6.1 s, and so on, then the absolute time differences of the pose timestamps 1 min 5.3 s and 1 min 5.6 s from the video timestamp are the two smallest, i.e., 1 min 5.3 s and 1 min 5.6 s are the first and second reference pose timestamps, respectively; in this case the two absolute time differences (0.1 s and 0.2 s) differ.
For example, in some embodiments, the pose time axis may be 1 second long, i.e., the N cached pose data and N pose timestamps span 1 second. For example, the cached N pose data and N pose timestamps may be those closest to the current moment. For example, in some embodiments, if the current time is 10:10:20, the cached N pose data and N pose timestamps may be those collected between 10:10:19 and 10:10:20.
For example, in some embodiments, step S13 may include: in response to the number of the at least one pose data corresponding to the video frame to be displayed being 1, taking the at least one pose data corresponding to the video frame to be displayed as target pose data, and adjusting the pose of the virtual model based on the target pose data to obtain the virtual model to be displayed; or, in response to the number of the at least one pose data corresponding to the video frame to be displayed being 2, interpolating the at least one pose data corresponding to the video frame to be displayed to obtain target pose data corresponding to the video frame to be displayed, and adjusting the pose of the virtual model based on the target pose data to obtain the virtual model to be displayed.
For example, when the at least one pose data is a single pose data, that pose data is used directly as the target pose data, and then the pose of the virtual model is adjusted based on the target pose data to obtain the virtual model to be displayed. When the at least one pose data includes two pose data, the two pose data are interpolated (e.g., linearly interpolated) to obtain the target pose data corresponding to the video frame to be displayed, and then the pose of the virtual model is adjusted based on the target pose data to obtain the virtual model to be displayed. It should be noted that interpolating the two pose data includes interpolating the positions in the pose data and interpolating the angles in the pose data.
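The two-sample case described above can be sketched as a linear interpolation weighted by where the video timestamp falls between the two pose timestamps. This is an illustrative sketch with hypothetical names; production systems often interpolate rotations with quaternion slerp rather than per-component lerp:

```python
def interpolate_pose(ts_a, pose_a, ts_b, pose_b, video_ts):
    """Linearly interpolate position and angle between two pose samples."""
    if ts_b == ts_a:
        return pose_a  # degenerate case: identical timestamps
    w = (video_ts - ts_a) / (ts_b - ts_a)  # normalized offset, ~[0, 1]
    lerp = lambda a, b: a + (b - a) * w
    return {
        "position": tuple(lerp(a, b)
                          for a, b in zip(pose_a["position"], pose_b["position"])),
        "angle": lerp(pose_a["angle"], pose_b["angle"]),
    }

pose_a = {"position": (0.0, 0.0, 0.0), "angle": 0.0}
pose_b = {"position": (1.0, 2.0, 0.0), "angle": 10.0}
# Video frame at 65.4 s between pose samples at 65.3 s and 65.5 s: w is ~0.5,
# so the target pose lies halfway between the two samples.
target = interpolate_pose(65.3, pose_a, 65.5, pose_b, 65.4)
```

Both the position components and the angle are interpolated with the same weight, mirroring the note above that interpolation covers both parts of the pose data.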
For example, in some embodiments of the present disclosure, adjusting the pose of the virtual model based on the target pose data includes: computing, based on the target pose data, the pose of the virtual model on the display screen of the display device, and then projecting the data in the three-dimensional reconstruction data set corresponding to the virtual model onto the display screen according to the pose corresponding to the target pose data, thereby obtaining the virtual model in that pose.
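The projection step described above can be reduced to a pinhole-camera sketch: transform a model vertex into the camera frame given the target pose, then project it onto the screen. This is a deliberately simplified illustration (single-axis rotation, assumed intrinsics; all names and parameter values are hypothetical), not the disclosure's implementation:

```python
import math

def project_point(point, camera_pos, yaw_deg, focal=500.0, cx=360.0, cy=640.0):
    """Project a world-space model vertex to screen coordinates for a camera
    at camera_pos rotated by yaw_deg about the vertical axis."""
    # World -> camera: translate by the camera position, then rotate.
    x, y, z = (p - c for p, c in zip(point, camera_pos))
    yaw = math.radians(yaw_deg)
    xc = math.cos(yaw) * x - math.sin(yaw) * z
    zc = math.sin(yaw) * x + math.cos(yaw) * z
    if zc <= 0:
        return None  # behind the camera: not drawn
    # Pinhole projection with an assumed focal length and principal point.
    u = cx + focal * xc / zc
    v = cy - focal * y / zc
    return (u, v)

# A vertex 2 m straight ahead of an un-rotated camera at the origin lands
# at the principal point of the image.
screen = project_point((0.0, 0.0, 2.0), (0.0, 0.0, 0.0), 0.0)  # (360.0, 640.0)
```

A full implementation would use the complete position and angle from the target pose data (typically a 4x4 view matrix and calibrated camera intrinsics) rather than a single yaw angle.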
For example, the virtual model is an augmented reality (AR) effect model or the like. The virtual model may include virtual effects such as text, images, three-dimensional models, music, and video. The virtual model may be a pre-built model.
For example, in some embodiments, step S14 includes: displaying the video frame to be displayed; and superimposing the virtual model to be displayed on the video frame to be displayed for display.
For example, the video frame to be displayed and the virtual model to be displayed are displayed on the display device at the same time, and they correspond to the same moment, so that the AR effect can be aligned with the image shown on the screen, providing the user with a better AR visual experience.
At least one embodiment of the present disclosure further provides a display device. FIG. 2 is a schematic block diagram of a display device provided by at least one embodiment of the present disclosure.
For example, as shown in FIG. 2, the display device 20 includes a processor 200 and a memory 210. It should be noted that the components of the display device 20 shown in FIG. 2 are merely exemplary and not limiting; the display device 20 may have other components according to actual application needs.
For example, the processor 200 and the memory 210 may communicate with each other directly or indirectly.
For example, the processor 200 and the memory 210 may communicate over a network. The network may include a wireless network, a wired network, and/or any combination of the two. The processor 200 and the memory 210 may also communicate with each other via a system bus; the present disclosure does not limit this.
For example, the memory 210 is used for non-transitory storage of computer-readable instructions, and the processor 200 is used to execute the computer-readable instructions; when the computer-readable instructions are executed by the processor 200, the video processing method according to any of the above embodiments is implemented. For the specific implementation of each step of the video processing method and related explanations, reference may be made to the above embodiments of the video processing method; repeated details are not described again here.
For example, the processor 200 and the memory 210 may be located on a server side (or in the cloud).
For example, the processor 200 may control other components in the display device 20 to perform desired functions. The processor 200 may be a central processing unit (CPU), a network processor (NP), or the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The central processing unit (CPU) may be of the x86 or ARM architecture, etc.
For example, the memory 210 may include any combination of one or more computer program products, which may include computer-readable storage media in various forms, such as volatile memory and/or non-volatile memory. Volatile memory may include, for example, random-access memory (RAM) and/or caches. Non-volatile memory may include, for example, read-only memory (ROM), hard disks, erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), USB memory, flash memory, and the like. One or more computer-readable instructions may be stored on the computer-readable storage medium, and the processor 200 may execute them to implement various functions of the electronic device. Various applications and various data may also be stored in the storage medium.
For example, in some embodiments, the display device 20 further includes a video capture device and a pose capture device. The video capture device is configured to capture a video of the target object to obtain the video to be processed; the pose capture device is configured to collect the pose data of the video capture device to obtain the N pose data.
For example, the video capture device may include a camera, a video camera, or any other device that can shoot video and/or images.
For example, the pose capture device includes a gyroscope, an acceleration sensor, a satellite positioning device, or the like. As another example, the pose capture device may also collect pose data through software such as ARKit or ARCore, or through SLAM techniques.
For example, in some embodiments, the display device 20 may be a mobile terminal such as a mobile phone or a tablet computer, with both the pose capture device and the video capture device provided on the mobile terminal; for example, the pose capture device may be a gyroscope built into the mobile terminal, and the video capture device may be a camera on the mobile terminal (which may include, for example, an under-display camera). The present disclosure is not limited to this; the video capture device may also be provided outside the mobile terminal, e.g., it may capture video remotely and transmit it to the mobile terminal over a network for subsequent processing by the mobile terminal. It should be noted that the video capture device and the pose capture device need to be integrated with each other, so that the pose capture device can collect the pose data of the video capture device.
For example, the display device 20 may further include a display panel for displaying the video frame to be displayed and the virtual model to be displayed. For example, the display panel may be a rectangular, circular, elliptical, or polygonal panel. In addition, the display panel may be not only a flat panel but also a curved panel, or even a spherical panel.
For example, the display device 20 may have a touch function, i.e., the display device 20 may be a touch display device.
For example, for a detailed description of how the display device 20 performs the video processing method, reference may be made to the relevant descriptions in the embodiments of the video processing method; repeated details are not described again here.
FIG. 3 is a schematic diagram of a non-transitory computer-readable storage medium provided by at least one embodiment of the present disclosure. For example, as shown in FIG. 3, one or more computer-readable instructions 310 may be stored non-transitorily on the storage medium 300. For example, when the computer-readable instructions 310 are executed by a processor, one or more steps of the video processing method described above may be performed.
For example, the storage medium 300 may be applied to the above display device 20. For example, the storage medium 300 may include the memory 210 in the display device 20.
For example, for a description of the storage medium 300, reference may be made to the description of the memory 210 in the embodiments of the display device 20; repeated details are not described again here.
Referring now to FIG. 4, FIG. 4 shows a schematic structural diagram of an electronic device 600 (e.g., the electronic device may include the display device described in the above embodiments) suitable for implementing embodiments of the present disclosure. Electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (e.g., in-vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in FIG. 4 is merely an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 4, the electronic device 600 may include a processing device (e.g., a central processing unit or graphics processor) 601, which may execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 606 into a random-access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the electronic device 600. The processing device 601, the ROM 602, and the RAM 603 are connected to one another via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
In general, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 607 including, for example, a liquid crystal display (LCD), speakers, and vibrators; storage devices 606 including, for example, magnetic tapes and hard disks; and a communication device 609. The communication device 609 may allow the electronic device 600 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 4 shows the electronic device 600 with various devices, it should be understood that it is not required to implement or have all the devices shown; more or fewer devices may alternatively be implemented or provided.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product that includes a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such embodiments, the computer program may be downloaded and installed from a network via the communication device 609, or installed from the storage device 606, or installed from the ROM 602. When the computer program is executed by the processing device 601, the above-described functions defined in the methods of the embodiments of the present disclosure are performed.
It should be noted that in the context of the present disclosure, a computer-readable medium may be a tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random-access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave that carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; it may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the above.
In some embodiments, clients and servers may communicate using any currently known or future-developed network protocol such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
The computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (e.g., through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the drawings; for example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by dedicated hardware-based systems that perform the specified functions or operations, or by combinations of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that can be used include: field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
According to one or more embodiments of the present disclosure, a video processing method is used for a display device and includes: acquiring a video to be processed, and marking a plurality of video frames of the video to be processed with a plurality of video timestamps, where the plurality of video frames correspond to the plurality of video timestamps one-to-one; acquiring N pose data of the display device, marking the N pose data with N pose timestamps, and caching the N pose data and the N pose timestamps, where the N pose data correspond to the N pose timestamps one-to-one and N is a positive integer; extracting a video frame to be displayed from the plurality of video frames, and determining, from the N pose data and according to the video timestamp corresponding to the video frame to be displayed, at least one pose data corresponding to the video frame to be displayed; adjusting the pose of a virtual model based on the at least one pose data corresponding to the video frame to be displayed, to obtain a virtual model to be displayed; and simultaneously displaying, by the display device, the video frame to be displayed and the virtual model to be displayed.
According to one or more embodiments of the present disclosure, the video to be processed includes a target object, the target object includes a landmark building, and the display device includes a video capture device; acquiring the video to be processed includes: capturing a video of the target object with the video capture device to obtain the video to be processed; marking the plurality of video frames of the video to be processed with the plurality of video timestamps includes: reading the system clock of the display device in real time while the video capture device captures the video to be processed, so as to obtain the plurality of video timestamps corresponding respectively to the plurality of video frames.
According to one or more embodiments of the present disclosure, the display device includes a pose capture device; acquiring the N pose data includes: while the video capture device captures the video to be processed, collecting the pose data of the video capture device with the pose capture device to obtain the N pose data; marking the N pose data with N pose timestamps includes: reading the system clock of the display device in real time while the pose capture device collects the N pose data, so as to obtain the N pose timestamps corresponding respectively to the N pose data.
According to one or more embodiments of the present disclosure, each of the N pose data includes a position and an angle of the video capture device.
According to one or more embodiments of the present disclosure, the number of the N pose data is 100 to 200.
According to one or more embodiments of the present disclosure, determining, from the N pose data and according to the video timestamp corresponding to the video frame to be displayed, the at least one pose data corresponding to the video frame to be displayed includes: searching, according to the video timestamp corresponding to the video frame to be displayed, on a pose time axis formed by the N pose timestamps for a reference pose timestamp corresponding to that video timestamp, where the absolute value of the time difference between the reference pose timestamp and the video timestamp is the smallest, and taking the pose data corresponding to the reference pose timestamp as the at least one pose data; or searching, according to the video timestamp corresponding to the video frame to be displayed, on the pose time axis formed by the N pose timestamps for a first reference pose timestamp and a second reference pose timestamp corresponding to that video timestamp, where the absolute values of the time differences between the first reference pose timestamp and the video timestamp and between the second reference pose timestamp and the video timestamp are the two smallest such absolute values, and taking the pose data corresponding to the first reference pose timestamp and the pose data corresponding to the second reference pose timestamp as the at least one pose data.
According to one or more embodiments of the present disclosure, adjusting the pose of the virtual model based on the at least one pose data corresponding to the video frame to be displayed, to obtain the virtual model to be displayed, includes: in response to the number of the at least one pose data corresponding to the video frame to be displayed being 1, taking the at least one pose data corresponding to the video frame to be displayed as target pose data, and adjusting the pose of the virtual model based on the target pose data to obtain the virtual model to be displayed; or, in response to the number of the at least one pose data corresponding to the video frame to be displayed being 2, interpolating the at least one pose data corresponding to the video frame to be displayed to obtain target pose data corresponding to the video frame to be displayed, and adjusting the pose of the virtual model based on the target pose data to obtain the virtual model to be displayed.
According to one or more embodiments of the present disclosure, simultaneously displaying the video frame to be displayed and the virtual model to be displayed by the display device includes: displaying the video frame to be displayed; and superimposing the virtual model to be displayed on the video frame to be displayed for display.
According to one or more embodiments of the present disclosure, the virtual model is an augmented reality effect model.
According to one or more embodiments of the present disclosure, a display device includes: a memory for non-transitory storage of computer-readable instructions; and a processor for executing the computer-readable instructions, where the computer-readable instructions, when executed by the processor, implement the video processing method according to any embodiment of the present disclosure.
According to one or more embodiments of the present disclosure, the display device further includes a video capture device and a pose capture device; the video capture device is configured to capture a video of the target object to obtain the video to be processed, and the pose capture device is configured to collect the pose data of the video capture device to obtain the N pose data.
According to one or more embodiments of the present disclosure, the video capture device includes a camera, and the pose capture device includes a gyroscope, an acceleration sensor, or a satellite positioning device.
According to one or more embodiments of the present disclosure, the display device is a mobile terminal, and the pose capture device and the video capture device are both provided on the mobile terminal.
According to one or more embodiments of the present disclosure, a non-transitory computer-readable storage medium stores computer-readable instructions that, when executed by a processor, implement the video processing method according to any embodiment of the present disclosure.
The above description is only of preferred embodiments of the present disclosure and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of disclosure involved in the present disclosure is not limited to technical solutions formed by the specific combinations of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above disclosed concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.
In addition, although the operations are depicted in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features described in the context of separate embodiments can also be implemented in combination in a single embodiment; conversely, various features described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological logical acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.
For the present disclosure, the following points also need to be noted:
(1) The drawings of the embodiments of the present disclosure relate only to structures involved in the embodiments of the present disclosure; for other structures, reference may be made to conventional designs.
(2) For clarity, in the drawings used to describe the embodiments of the present disclosure, the thicknesses and dimensions of layers or structures are exaggerated. It will be understood that when an element such as a layer, film, region, or substrate is referred to as being "on" or "under" another element, it can be "directly on" or "directly under" the other element, or intervening elements may be present.
(3) Where no conflict arises, the embodiments of the present disclosure and the features in the embodiments may be combined with one another to obtain new embodiments.
The above are only specific implementations of the present disclosure, but the scope of protection of the present disclosure is not limited thereto; the scope of protection of the present disclosure shall be subject to the scope of protection of the claims.

Claims (14)

  1. 一种视频处理方法,用于显示装置,包括:
    获取待处理视频,并对所述待处理视频的多个视频帧分别打上多个视频时间戳,其中,所述多个视频帧与所述多个视频时间戳一一对应;
    获取所述显示装置的N个位姿数据,并对所述N个位姿数据分别打上N个位姿时间戳,缓存所述N个位姿数据和所述N个位姿时间戳,其中,所述N个位姿数据与所述N个位姿时间戳一一对应,N为正整数;
    从所述多个视频帧中提取待显示的视频帧,根据所述待显示的视频帧对应的视频时间戳,从所述N个位姿数据中确定与所述待显示的视频帧对应的至少一个位姿数据;
    基于与所述待显示的视频帧对应的所述至少一个位姿数据,对虚拟模型的姿态进行调整,以得到待显示的虚拟模型;
    通过所述显示装置同时显示所述待显示的视频帧和所述待显示的虚拟模型。
  2. 根据权利要求1所述视频处理方法,其中,所述待处理视频包括目标物体,所述目标物体包括地标建筑,所述显示装置包括视频采集装置,
    获取待处理视频包括:利用所述视频采集装置采集所述目标物体的视频以得到所述待处理视频;
    将所述待处理视频的多个视频帧分别打上多个视频时间戳包括:在利用所述视频采集装置采集所述待处理视频时,实时获取所述显示装置的系统时钟,以得到与所述多个视频帧分别对应的所述多个视频时间戳。
  3. 根据权利要求2所述视频处理方法,其中,所述显示装置包括位姿采集装置,
    获取所述N个位姿数据包括:在利用所述视频采集装置采集所述待处理视频时,利用所述位姿采集装置采集所述视频采集装置的位姿数据,以得到所述N个位姿数据;
    对所述N个位姿数据分别打上N个位姿时间戳包括:在利用所述位姿采集装置采集所述N个位姿数据时,实时获取所述显示装置的系统时钟,以得到与所述N个位姿数据分别对应的所述N个位姿时间戳。
  4. 根据权利要求3所述视频处理方法,其中,所述N个位姿数据中的每个位姿数据包括所述视频采集装置的位置和角度。
  5. 根据权利要求1所述视频处理方法,其中,所述N个位姿数据的数量为100~200。
  6. The video processing method according to any one of claims 1 to 5, wherein determining, from the N pieces of pose data according to the video timestamp corresponding to the video frame to be displayed, the at least one piece of pose data corresponding to the video frame to be displayed comprises:
    searching, according to the video timestamp corresponding to the video frame to be displayed, a pose time axis formed by the N pose timestamps for a reference pose timestamp corresponding to the video timestamp of the video frame to be displayed, wherein the absolute value of the time difference between the reference pose timestamp and the video timestamp is the smallest, and
    taking the pose data corresponding to the reference pose timestamp as the at least one piece of pose data; or
    searching, according to the video timestamp corresponding to the video frame to be displayed, the pose time axis formed by the N pose timestamps for a first reference pose timestamp and a second reference pose timestamp corresponding to the video timestamp of the video frame to be displayed, wherein the absolute value of the time difference between the first reference pose timestamp and the video timestamp and the absolute value of the time difference between the second reference pose timestamp and the video timestamp are the two smallest absolute values, and
    taking the pose data corresponding to the first reference pose timestamp and the pose data corresponding to the second reference pose timestamp as the at least one piece of pose data.
  7. The video processing method according to any one of claims 1 to 5, wherein adjusting the pose of the virtual model based on the at least one piece of pose data corresponding to the video frame to be displayed, to obtain the virtual model to be displayed, comprises:
    in response to the number of the at least one piece of pose data corresponding to the video frame to be displayed being 1, taking the at least one piece of pose data corresponding to the video frame to be displayed as target pose data, and adjusting the pose of the virtual model based on the target pose data to obtain the virtual model to be displayed; or
    in response to the number of the at least one piece of pose data corresponding to the video frame to be displayed being 2, interpolating the at least one piece of pose data corresponding to the video frame to be displayed to obtain target pose data corresponding to the video frame to be displayed, and adjusting the pose of the virtual model based on the target pose data to obtain the virtual model to be displayed.
  8. The video processing method according to any one of claims 1 to 5, wherein displaying, by the display device, the video frame to be displayed and the virtual model to be displayed simultaneously comprises:
    displaying the video frame to be displayed; and
    displaying the virtual model to be displayed superimposed on the video frame to be displayed.
  9. The video processing method according to any one of claims 1 to 5, wherein the virtual model is an augmented reality special-effect model.
  10. A display device, comprising:
    a memory for non-transitorily storing computer-readable instructions; and
    a processor for executing the computer-readable instructions, wherein the computer-readable instructions, when executed by the processor, implement the video processing method according to any one of claims 1 to 9.
  11. The display device according to claim 10, further comprising a video capture device and a pose capture device, wherein the video capture device is configured to capture a video of the target object to obtain the video to be processed; and
    the pose capture device is configured to capture pose data of the video capture device to obtain the N pieces of pose data.
  12. The display device according to claim 11, wherein the video capture device comprises a camera, and the pose capture device comprises a gyroscope, an acceleration sensor, or a satellite positioning device.
  13. The display device according to claim 11 or 12, wherein the display device is a mobile terminal, and both the pose capture device and the video capture device are provided on the mobile terminal.
  14. A non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium stores computer-readable instructions, and the computer-readable instructions, when executed by a processor, implement the video processing method according to any one of claims 1 to 9.
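The timestamp-matching and interpolation steps recited in claims 6 and 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Pose` structure, the function names, and the use of component-wise linear interpolation for the angles are all hypothetical (a production AR pipeline would typically interpolate orientation with quaternion slerp rather than Euler angles).

```python
import bisect
from dataclasses import dataclass


@dataclass
class Pose:
    """Pose of the video capture device: position plus orientation angles (claim 4)."""
    position: tuple  # (x, y, z), units assumed
    angle: tuple     # (yaw, pitch, roll) in degrees, a simplifying assumption


def nearest_pose_indices(pose_timestamps, frame_ts):
    """Claim 6: on the pose time axis, return either the index of the single pose
    timestamp with the smallest absolute time difference to the frame timestamp,
    or the two indices whose timestamps bracket it (the two smallest differences)."""
    i = bisect.bisect_left(pose_timestamps, frame_ts)
    if i == 0:
        return [0]                          # frame precedes all pose samples
    if i == len(pose_timestamps):
        return [len(pose_timestamps) - 1]   # frame follows all pose samples
    if pose_timestamps[i] == frame_ts:
        return [i]                          # exact match: one reference timestamp
    return [i - 1, i]                       # bracketing pair: two reference timestamps


def lerp(a, b, t):
    """Component-wise linear interpolation between two tuples."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def target_pose(pose_timestamps, poses, frame_ts):
    """Claim 7: with one matching pose, use it directly as the target pose data;
    with two, interpolate to the frame timestamp."""
    idx = nearest_pose_indices(pose_timestamps, frame_ts)
    if len(idx) == 1:
        return poses[idx[0]]
    i, j = idx
    t = (frame_ts - pose_timestamps[i]) / (pose_timestamps[j] - pose_timestamps[i])
    return Pose(lerp(poses[i].position, poses[j].position, t),
                lerp(poses[i].angle, poses[j].angle, t))
```

For a frame stamped at 4.0 between pose samples at 0.0 and 10.0, the sketch returns a pose 40% of the way from the first sample to the second; the `bisect` lookup keeps the search logarithmic even for the 100 to 200 cached pose samples mentioned in claim 5.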
PCT/CN2021/109020 2020-09-23 2021-07-28 Video processing method, display device, and storage medium WO2022062642A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011011403.8A CN112333491B (zh) 2020-09-23 2020-09-23 Video processing method, display device, and storage medium
CN202011011403.8 2020-09-23

Publications (1)

Publication Number Publication Date
WO2022062642A1 true WO2022062642A1 (zh) 2022-03-31

Family

ID=74303548

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/109020 WO2022062642A1 (zh) 2020-09-23 2021-07-28 视频处理方法、显示装置和存储介质

Country Status (2)

Country Link
CN (1) CN112333491B (zh)
WO (1) WO2022062642A1 (zh)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112333491B (zh) * 2020-09-23 2022-11-01 字节跳动有限公司 Video processing method, display device, and storage medium
CN113766119B (zh) * 2021-05-11 2023-12-05 腾讯科技(深圳)有限公司 Virtual avatar display method, apparatus, terminal, and storage medium
CN113850746A (zh) * 2021-09-29 2021-12-28 北京字跳网络技术有限公司 Image processing method and apparatus, electronic device, and storage medium
CN115761114B (zh) * 2022-10-28 2024-04-30 如你所视(北京)科技有限公司 Video generation method and apparatus, and computer-readable storage medium
CN117294832B (zh) * 2023-11-22 2024-03-26 湖北星纪魅族集团有限公司 Data processing method and apparatus, electronic device, and computer-readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170193686A1 (en) * 2015-12-30 2017-07-06 Daqri, Llc 3d video reconstruction system
CN109168034A (zh) * 2018-08-28 2019-01-08 百度在线网络技术(北京)有限公司 Commodity information display method and apparatus, electronic device, and readable storage medium
CN110379017A (zh) * 2019-07-12 2019-10-25 北京达佳互联信息技术有限公司 Scene construction method and apparatus, electronic device, and storage medium
CN110555882A (zh) * 2018-04-27 2019-12-10 腾讯科技(深圳)有限公司 Interface display method and apparatus, and storage medium
CN110858414A (zh) * 2018-08-13 2020-03-03 北京嘀嘀无限科技发展有限公司 Image processing method and apparatus, readable storage medium, and augmented reality system
CN112333491A (zh) * 2020-09-23 2021-02-05 字节跳动有限公司 Video processing method, display device, and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018113759A1 (zh) * 2016-12-22 2018-06-28 大辅科技(北京)有限公司 Detection system and detection method based on a positioning system and AR/MR


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115396644A (zh) * 2022-07-21 2022-11-25 贝壳找房(北京)科技有限公司 Video fusion method and apparatus based on multiple segments of extrinsic parameter data
CN115396644B (zh) * 2022-07-21 2023-09-15 贝壳找房(北京)科技有限公司 Video fusion method and apparatus based on multiple segments of extrinsic parameter data

Also Published As

Publication number Publication date
CN112333491B (zh) 2022-11-01
CN112333491A (zh) 2021-02-05


Legal Events

Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21871020; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21871020; Country of ref document: EP; Kind code of ref document: A1)