WO2022020996A1 - Method, device and system for video splicing - Google Patents

Method, device and system for video splicing

Info

Publication number
WO2022020996A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
target vehicle
frame
boundary
clips
Prior art date
Application number
PCT/CN2020/104852
Other languages
English (en)
French (fr)
Inventor
张桂成
赵帅
李大锋
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to CN202080004396.0A (CN112544071B)
Priority to EP20946683.8A (EP4030751A4)
Priority to PCT/CN2020/104852 (WO2022020996A1)
Publication of WO2022020996A1

Classifications

    • H04N 7/181: Closed-circuit television [CCTV] systems for receiving images from a plurality of remote sources
    • H04N 5/265: Studio circuits for mixing
    • G06V 20/58: Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G07C 5/0866: Registering performance data using electronic data carriers, the carrier being a digital video recorder in combination with a video camera
    • G08G 1/0112: Measuring and analyzing of parameters relative to traffic conditions based on data from the vehicle, e.g. floating car data [FCD]
    • G08G 1/0116: Measuring and analyzing of parameters relative to traffic conditions based on data from roadside infrastructure, e.g. beacons
    • G08G 1/0175: Detecting movement of traffic; identifying vehicles by photographing vehicles, e.g. when violating traffic rules
    • G08G 1/056: Detecting movement of traffic with provision for distinguishing direction of travel
    • G11B 27/031: Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • H04N 21/21805: Source of audio or video content enabling multiple viewpoints, e.g. using a plurality of cameras
    • H04N 21/23424: Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • H04N 21/44016: Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H04N 21/8547: Content authoring involving timestamps for synchronizing content
    • H04N 23/90: Arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums

Definitions

  • the present application relates to the technical field of image processing, and in particular, to a method, device and system for video splicing.
  • Video splicing is a commonly used image processing method.
  • Video splicing refers to splicing multiple pieces of video based on time sequence. For example, multiple videos can be spliced into a complete video in time sequence according to the shooting time of each video.
  • However, splicing based on time sequence places high requirements on the accuracy of time, and the time obtained from a video is usually not accurate enough, resulting in a poor video splicing effect.
  • The present application provides a method, device and system for video splicing, so as to improve the accuracy of video splicing and obtain a better video splicing effect.
  • the method for video splicing provided in the embodiments of the present application may be performed by a video splicing system.
  • The system for video splicing includes a video clip acquiring unit and a splicing unit.
  • The video clip acquiring unit is used to acquire multiple video clips related to the target vehicle.
  • The splicing unit is used to splice a first video clip and a second video clip according to a spliced video frame, where the first video clip and the second video clip are two temporally adjacent clips among the multiple video clips, and the spliced video frame is a video frame for video splicing determined in the second boundary video according to the positioning position information of the target vehicle in the first boundary video; the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the start position of the second video clip.
  • The video splicing system in this embodiment of the present application may be a single device with a video splicing function, or a combination of at least two devices that together form a system with a video splicing function. When the video splicing system is a combination of at least two devices, the devices in the system can communicate with each other through Bluetooth, a wired connection, or wireless transmission.
  • The video splicing system in the embodiment of the present application can be installed on a mobile device, such as a vehicle, so that the vehicle splices its own video and obtains a video with a better splicing effect.
  • The video splicing system can also be installed on a fixed device, such as a roadside unit (RSU) or a server, on which the video splicing is performed to obtain a video with a better splicing effect.
  • An embodiment of the present application provides a video splicing method, including: acquiring multiple video clips related to a target vehicle; and splicing a first video clip and a second video clip according to a spliced video frame, where the first video clip and the second video clip are two temporally adjacent clips among the multiple video clips, the spliced video frame is a video frame for video splicing determined in the second boundary video according to the positioning position information of the target vehicle in the first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the start position of the second video clip.
  • An embodiment of the present application provides a video splicing method, including: receiving multiple video clips related to a target vehicle; acquiring a first video clip and a second video clip that are temporally adjacent among the multiple video clips; and splicing the first video clip and the second video clip according to a spliced video frame, where the spliced video frame is a video frame for video splicing determined in the second boundary video according to the positioning position information of the target vehicle in the first boundary video; the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the start position of the second video clip.
  • The video clip includes: the shooting time of the video frames in the video clip, and the positioning position information of the target vehicle in the video frames. The spliced video frame is determined by predicting first position information according to the positioning position information of the target vehicle in the multi-frame video frames of the first boundary video, and selecting from the second boundary video the video frame with the smallest distance to the first position information.
  • In this way, the video frame closest to the predicted position of the target vehicle can be selected as the spliced video frame. Because the position of the target vehicle usually does not change abruptly, using the video frame with the closest predicted position as the spliced video frame gives a more accurate splice point, so that a better splicing effect can be achieved (a sketch of this selection follows below).
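  • For illustration, the following is a minimal Python sketch of this nearest-position selection. The Frame structure and planar (x, y) coordinates are simplifying assumptions; the patent itself works with GPS positioning position information.

```python
import math
from dataclasses import dataclass

@dataclass
class Frame:
    timestamp: float               # shooting time of the video frame (seconds)
    position: tuple[float, float]  # positioning position (x, y) of the target vehicle

def select_splice_frame(second_boundary: list[Frame],
                        first_position: tuple[float, float]) -> Frame:
    """Return the frame of the second boundary video whose recorded target
    vehicle position is closest to the first position information predicted
    from the first boundary video."""
    def dist(frame: Frame) -> float:
        return math.hypot(frame.position[0] - first_position[0],
                          frame.position[1] - first_position[1])
    return min(second_boundary, key=dist)
```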
  • Predicting the first position information according to the positioning position information of the target vehicle in the multi-frame video frames of the first boundary video includes: determining at least three video frames in the first boundary video, where the at least three video frames include the last video frame of the first boundary video; calculating the speed information and direction information of the target vehicle according to the shooting times of the at least three video frames and the positioning position information of the target vehicle in them; and predicting the first position information using the speed information and direction information.
  • In this way, relatively accurate first position information can be predicted based on the speed and direction of the target vehicle, so that an accurate spliced video frame can subsequently be selected from the second boundary video.
  • The first position information is obtained by adding a first value, in the direction indicated by the direction information, to the positioning position information of the target vehicle in the last video frame of the first boundary video; the first value is negatively correlated with the speed information. The higher the speed of the target vehicle, the greater the change of its position within a period of time; when predicting the position of the target vehicle, a large first value easily makes the prediction inaccurate. Setting the first value to be negatively correlated with the speed information therefore yields more accurate first position information, so that an accurate spliced video frame can subsequently be selected from the second boundary video.
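  • A sketch of this prediction step is given below, under the illustrative assumption that the first value is realized as k/(1 + speed); the patent only requires the first value to be negatively correlated with the speed information, so this mapping is one possible choice.

```python
import math

def predict_first_position(samples, k=1.0):
    """samples: at least three (timestamp, (x, y)) pairs of the target
    vehicle in the first boundary video, the last pair being its final
    frame. Estimate speed and heading, then step the last known position
    forward by a first value negatively correlated with speed."""
    t0, (x0, y0) = samples[-3]
    t1, (x1, y1) = samples[-1]
    dt = t1 - t0
    speed = math.hypot(x1 - x0, y1 - y0) / dt   # speed information
    heading = math.atan2(y1 - y0, x1 - x0)      # direction information
    first_value = k / (1.0 + speed)             # negatively correlated with speed
    return (x1 + first_value * math.cos(heading),
            y1 + first_value * math.sin(heading))
```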
  • The positioning position information of the target vehicle includes: position information confirmed with the assistance of the driving speed of the target vehicle and the steering wheel angle of the target vehicle. In this way, relatively accurate position information of the target vehicle can be determined according to the steering wheel angle.
  • The positioning position information of the target vehicle includes: position information obtained by performing deviation compensation based on the shooting angle of view of the video frame. In this way, more accurate position information can be obtained.
  • Acquiring multiple video clips related to the target vehicle includes: receiving a video synthesis request, where the video synthesis request includes the identification, trajectory information and time information of the target vehicle; acquiring a source video that is related to the trajectory information and time information and contains the target vehicle; and determining multiple video clips related to the target vehicle in the source video. In this way, the relevant video of the target vehicle can be synthesized based on the user's request, which better matches the user's needs.
  • Acquiring a source video related to the trajectory information and time information and containing the target vehicle includes: acquiring a source video whose position difference from the trajectory information is within a first threshold range, whose time difference from the time information is within a second threshold range, and which contains the target vehicle. In this way, more effective source video data for the target vehicle can be obtained, avoiding the waste of subsequent computing resources on useless videos.
  • Both the first threshold range and the second threshold range may be negatively correlated with the speed of the target vehicle.
  • the source video includes videos shot by other vehicles and/or videos shot by road equipment. In this way, a multi-angle video of the target vehicle can be obtained, so that a richer splicing material can be obtained.
  • Determining multiple video clips related to the target vehicle in the source video includes: filtering the source video to obtain multiple first video clips containing the target vehicle; when there are video clips among them whose shooting-time overlap ratio is greater than a third threshold, scoring the quality of those video clips; and selecting one video clip among the video clips whose shooting-time overlap ratio is greater than the third threshold according to the quality score. In this way, a video clip with better quality can be used for splicing, thereby obtaining a spliced video with better quality.
  • Scoring the quality of the video clips whose shooting-time overlap is greater than the third threshold includes: scoring according to the proportion and centering degree of the target vehicle in those video clips, where the proportion of the target vehicle is the ratio of the size of the target vehicle to the size of the video frame (one way such scoring could look is sketched below).
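  • The following Python sketch scores a clip from the two factors named above; the equal weighting of the proportion and centering terms is an assumption, since the text does not fix the weights.

```python
import math

def quality_score(bbox, frame_size, w_ratio=0.5, w_center=0.5):
    """Score one frame. bbox = (x, y, w, h) of the target vehicle,
    frame_size = (W, H). The proportion term is the ratio of the target
    vehicle size to the video frame size; the centering term decays as the
    bounding-box centre moves away from the frame centre."""
    x, y, w, h = bbox
    W, H = frame_size
    proportion = (w * h) / (W * H)
    offset = math.hypot(x + w / 2 - W / 2, y + h / 2 - H / 2)
    centering = 1.0 - offset / math.hypot(W / 2, H / 2)
    return w_ratio * proportion + w_center * centering

def clip_quality(bboxes, frame_size):
    """Average the per-frame scores over a clip; among clips whose shooting
    times overlap beyond the third threshold, keep the highest-scoring one."""
    return sum(quality_score(b, frame_size) for b in bboxes) / len(bboxes)
```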
  • The multiple video clips are sorted according to their shooting times; when the shooting-time interval between adjacent video clips is greater than a fourth threshold, a preset video is inserted between the adjacent video clips.
  • In this way, excessive jumping at the scene transition positions after splicing can be avoided, increasing the coherence of the spliced video (see the sketch below).
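  • A sketch of this timeline assembly; the clip representation and the content of the preset transition are assumptions.

```python
def assemble_timeline(clips, preset_clip, fourth_threshold):
    """clips: (start_time, end_time, clip_ref) tuples sorted by shooting
    time. Insert the preset transition video wherever the gap between
    adjacent clips exceeds the fourth threshold (in seconds)."""
    timeline = [clips[0]]
    for prev, cur in zip(clips, clips[1:]):
        if cur[0] - prev[1] > fourth_threshold:
            timeline.append(preset_clip)  # e.g. a map animation or a fade
        timeline.append(cur)
    return timeline
```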
  • the method in the embodiment of the present application may be executed locally or in the cloud, which is not specifically limited in the embodiment of the present application.
  • an embodiment of the present application further provides a video splicing apparatus, which can be used to perform the operations in the first aspect, the second aspect, or any possible implementation manners described above.
  • the apparatus may include modules or units for performing various operations in the first aspect, the second aspect, or any of the possible implementations described above.
  • it includes a transceiver module and a processing module.
  • The transceiver module is used to acquire multiple video clips related to the target vehicle; the processing module is used to splice the first video clip and the second video clip according to the spliced video frame, where the first video clip and the second video clip are two temporally adjacent clips among the multiple video clips, the spliced video frame is a video frame for video splicing determined in the second boundary video according to the positioning position information of the target vehicle in the first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the start position of the second video clip.
  • In this way, the boundary videos of the video clips can be spliced with high precision according to the position information of the target vehicle in the video, which improves the accuracy of video splicing and yields a better video splicing effect.
  • Alternatively, the transceiver module is used to acquire multiple video clips related to the target vehicle; the processing module is used to acquire a first video clip and a second video clip that are temporally adjacent among the multiple video clips, and to splice the first video clip and the second video clip according to the spliced video frame, where the spliced video frame is a video frame for video splicing determined in the second boundary video according to the positioning position information of the target vehicle in the first boundary video; the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the start position of the second video clip.
  • The video clip includes the shooting time of the video frames in the video clip and the positioning position information of the target vehicle in the video frames. The spliced video frame is determined by predicting first position information according to the positioning position information of the target vehicle in the multi-frame video frames of the first boundary video, and selecting from the second boundary video the video frame with the smallest distance to the first position information.
  • In this way, the video frame closest to the predicted position of the target vehicle can be selected as the spliced video frame. Because the position of the target vehicle usually does not change abruptly, using the video frame with the closest predicted position as the spliced video frame gives a more accurate splice point, so that a better splicing effect can be achieved.
  • The processing module is specifically configured to determine at least three video frames in the first boundary video, where the at least three video frames include the last video frame of the first boundary video; calculate the speed information and direction information of the target vehicle according to the shooting times of the at least three video frames and the positioning position information of the target vehicle in them; and predict the first position information using the speed information and direction information. In this way, relatively accurate first position information can be predicted based on the speed and direction of the target vehicle, so that an accurate spliced video frame can subsequently be selected from the second boundary video.
  • The first position information is obtained by adding a first value, in the direction indicated by the direction information, to the positioning position information of the target vehicle in the last video frame of the first boundary video; the first value is negatively correlated with the speed information. The higher the speed of the target vehicle, the greater the change of its position within a period of time; when predicting the position of the target vehicle, a large first value easily makes the prediction inaccurate. Setting the first value to be negatively correlated with the speed information therefore yields more accurate first position information, so that an accurate spliced video frame can subsequently be selected from the second boundary video.
  • The positioning position information of the target vehicle includes: position information confirmed with the assistance of the driving speed of the target vehicle and the steering wheel angle of the target vehicle. In this way, relatively accurate position information of the target vehicle can be determined according to the steering wheel angle.
  • The positioning position information of the target vehicle includes: position information obtained by performing deviation compensation based on the shooting angle of view of the video frame. In this way, more accurate position information can be obtained.
  • The transceiver module is specifically configured to receive a video synthesis request, where the video synthesis request includes the identification, trajectory information and time information of the target vehicle; acquire a source video that is related to the trajectory information and time information and contains the target vehicle; and determine multiple video clips related to the target vehicle in the source video. In this way, the relevant video of the target vehicle can be synthesized based on the user's request, which better matches the user's needs.
  • The transceiver module is specifically configured to acquire a source video whose position difference from the trajectory information is within the first threshold range, whose time difference from the time information is within the second threshold range, and which contains the target vehicle. In this way, more effective source video data for the target vehicle can be obtained, avoiding the waste of subsequent computing resources on useless videos.
  • Both the first threshold range and the second threshold range may be negatively correlated with the speed of the target vehicle.
  • the source video includes videos shot by other vehicles and/or videos shot by road equipment. In this way, a multi-angle video of the target vehicle can be obtained, so that a richer splicing material can be obtained.
  • The processing module is further configured to filter the source video to obtain multiple first video clips containing the target vehicle; when there are video clips among the multiple first video clips whose shooting-time overlap ratio is greater than the third threshold, score the quality of those video clips; and select one video clip among them according to the quality score. In this way, a video clip with better quality can be used for splicing, thereby obtaining a spliced video with better quality.
  • The processing module is specifically configured to score the quality of the video clips whose shooting-time overlap is greater than the third threshold according to the proportion and centering degree of the target vehicle in those video clips, where the proportion of the target vehicle is the ratio of the size of the target vehicle to the size of the video frame.
  • The multiple video clips are sorted according to their shooting times; when the shooting-time interval between adjacent video clips is greater than the fourth threshold, a preset video is inserted between the adjacent video clips.
  • In this way, excessive jumping at the scene transition positions after splicing can be avoided, increasing the coherence of the spliced video.
  • An embodiment of the present application provides a chip system, including a processor and, optionally, a memory, where the memory is used to store a computer program and the processor is used to call and run the computer program from the memory, so that a video splicing apparatus installed with the chip system performs any method in the first aspect, the second aspect, or any possible implementation thereof.
  • Embodiments of the present application provide a vehicle including at least one camera, at least one memory, at least one transceiver, and at least one processor.
  • The camera is used to capture video clips; the memory is used to store one or more programs and data information, where the one or more programs include instructions; the transceiver is used for data transmission with communication equipment in the vehicle and with the cloud, to acquire multiple video clips related to the target vehicle; the processor is used to splice the first video clip and the second video clip according to the spliced video frame, where the first video clip and the second video clip are two temporally adjacent clips among the multiple video clips, the spliced video frame is the video frame for video splicing determined in the second boundary video according to the positioning position information of the target vehicle in the first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the start position of the second video clip.
  • The vehicle further includes a display screen for displaying the spliced video, and a voice broadcasting device for broadcasting the audio of the spliced video.
  • the transceiver and processor of the embodiment of the present application may also perform steps corresponding to the transceiver module and the processing module in any possible implementation manner of the third aspect. For details, refer to the description of the third aspect, which will not be repeated here.
  • The camera described in the embodiment of the present application may be a camera of a driver monitoring system, a cockpit camera, an infrared camera, a driving recorder (i.e., a video recording terminal), a reversing image camera, etc.; this is not limited in the embodiments of the present application.
  • the shooting area of the camera may be the external environment or the internal environment of the vehicle.
  • When the vehicle is moving forward, the photographing area is the area in front of the vehicle; when the vehicle is reversing, the photographing area is the area behind the rear of the vehicle; when the camera is a 360-degree multi-angle camera, the photographing area can be a 360-degree area around the vehicle; when the camera is set inside the car, the shooting area can be an area inside the car, and the like.
  • An embodiment of the present application provides a computer program product, the computer program product including computer program code; when the computer program code is run by a communication module, a processing module, a transceiver, or a processor of a video splicing apparatus, the video splicing apparatus is caused to execute any method in the first aspect, the second aspect, or any possible implementation thereof.
  • An embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a program, and the program enables the video splicing apparatus to perform any method in the first aspect, the second aspect, or any possible implementation thereof.
  • FIG. 1 is a schematic diagram of an application scenario provided by an embodiment of the present application.
  • FIG. 2 is a functional block diagram of a vehicle 100 provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of data preparation provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of video acquisition provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of video clip selection provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of video scoring provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of low-precision splicing provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of determining a spliced video frame provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a vehicle video trajectory provided by an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of a video splicing method provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a video splicing apparatus provided by an embodiment of the application.
  • FIG. 12 is a schematic structural diagram of another video splicing apparatus provided by an embodiment of the application.
  • FIG. 13 is a schematic structural diagram of a vehicle according to an embodiment of the application.
  • words such as “first” and “second” are used to distinguish the same or similar items with basically the same function and effect.
  • the first video clip and the second video clip are only for distinguishing different video clips, and the sequence of the video clips is not limited.
  • the words “first”, “second” and the like do not limit the quantity and execution order, and the words “first”, “second” and the like are not necessarily different.
  • "At least one" means one or more, and "plurality" means two or more.
  • The character "/" generally indicates that the associated objects are in an "or" relationship.
  • "At least one of the following items" or similar expressions refers to any combination of these items, including any combination of single items or plural items.
  • For example, at least one item of a, b, or c may represent: a; b; c; a and b; a and c; b and c; or a, b and c, where each of a, b, and c may be single or multiple.
  • the video splicing method, device, and system of the embodiments of the present application can be applied to scenarios such as multi-source video splicing of automobiles.
  • the video splicing method, device, and system provided by the embodiments of the present application can be applied to the scene shown in FIG. 1 .
  • the video synthesis method of the embodiment of the present application may be applied to a video synthesis device, and the video synthesis device may be a cloud server, a vehicle-mounted device, a vehicle, or the like.
  • When the vehicle is driving on the road, it can collect its own driving video and report the driving video to the data center (for example, a cloud server that communicates with the vehicle, a data processing device inside the vehicle, or a device for storing video data, etc.). The data center can also receive videos from other vehicles around the vehicle (such as the two surrounding vehicles in Figure 1), as well as from video capture devices on the road (such as road video taken by cameras installed along the road).
  • the user can send a video synthesis request about the target vehicle to the video synthesis device through the terminal device, and the video synthesis device can obtain the multi-source video (which can be understood as the video from different shooting devices) related to the target vehicle from the data center.
  • The video synthesis device can splice the multi-source videos according to the video splicing method of the embodiment of the present application, and then obtain the driving video of the target vehicle.
  • Alternatively, the user can send a video synthesis request about the target vehicle to the video synthesis device through the on-board device of the vehicle; the video synthesis device can obtain the multi-source video related to the target vehicle from the data center, splice the multi-source videos according to the video splicing method, and then obtain the driving video of the target vehicle.
  • the spliced driving video of the target vehicle may be stored in a storage device, or sent to a terminal device or in-vehicle device requesting the driving video of the target vehicle, or the driving video of the target vehicle may be displayed on the display interface.
  • the number of data centers may be one or more, or no data center may be set.
  • The video splicing method, device, and system provided by the embodiments of the present application may also be applied to other scenarios, which is not limited in the embodiments of the present application.
  • FIG. 2 is a functional block diagram of the vehicle 100 provided by the embodiment of the present application.
  • the vehicle 100 is configured in a fully or partially autonomous driving mode, or as a vehicle with camera capabilities and communication capabilities.
  • While in the autonomous driving mode, the vehicle 100 can also determine the current state of the vehicle and its surrounding environment through human operation, determine the possible behavior of at least one other vehicle in the surrounding environment, determine a confidence level corresponding to the possibility of the other vehicle performing the possible behavior, and control the vehicle 100 based on the determined information.
  • the vehicle 100 may be placed to operate without human interaction.
  • Vehicle 100 may include various subsystems, such as travel system 102 , sensor system 104 , control system 106 , one or more peripherals 108 and power supply 110 , computer system 112 , and user interface 116 .
  • vehicle 100 may include more or fewer subsystems, and each subsystem may include multiple elements. Additionally, each of the subsystems and elements of the vehicle 100 may be interconnected by wire or wirelessly.
  • the travel system 102 may include components that provide powered motion for the vehicle 100 .
  • travel system 102 may include engine 118 , energy source 119 , transmission 120 , and wheels/tires 121 .
  • The engine 118 may be an internal combustion engine, an electric motor, an air compression engine, or a combination of engine types, such as a hybrid engine composed of a gasoline engine and an electric motor, or a hybrid engine composed of an internal combustion engine and an air compression engine.
  • Engine 118 converts energy source 119 into mechanical energy.
  • Examples of energy sources 119 include gasoline, diesel, other petroleum-based fuels, propane, other compressed gas-based fuels, ethanol, solar panels, batteries, and other sources of electricity.
  • the energy source 119 may also provide energy to other systems of the vehicle 100 .
  • Transmission 120 may transmit mechanical power from engine 118 to wheels 121 .
  • Transmission 120 may include a gearbox, a differential, and a driveshaft.
  • transmission 120 may also include other devices, such as clutches.
  • the drive shaft may include one or more axles that may be coupled to one or more wheels 121 .
  • the sensor system 104 may include several sensors that sense information about the environment surrounding the vehicle 100 .
  • the sensor system 104 may include a positioning system 122 (which may be a GPS system, a Beidou system or other positioning system), an inertial measurement unit (IMU) 124, a radar 126, a laser rangefinder 128, and camera 130.
  • The sensor system 104 may also include sensors for monitoring the internal systems of the vehicle 100 (e.g., an in-vehicle air quality monitor, a fuel gauge, an oil temperature gauge, etc.). Sensor data from one or more of these sensors can be used to detect objects and their corresponding characteristics (position, shape, orientation, velocity, etc.). This detection and identification is a critical function for the safe operation of the autonomous vehicle 100.
  • the positioning system 122 may be used to estimate the geographic location of the vehicle 100 .
  • the IMU 124 is used to sense position and orientation changes of the vehicle 100 based on inertial acceleration.
  • IMU 124 may be a combination of an accelerometer and a gyroscope.
  • Radar 126 may utilize radio signals to sense objects within the surrounding environment of vehicle 100 . In some embodiments, in addition to sensing objects, radar 126 may be used to sense the speed and/or heading of objects.
  • the laser rangefinder 128 may utilize laser light to sense objects in the environment in which the vehicle 100 is located.
  • the laser rangefinder 128 may include one or more laser sources, laser scanners, and one or more detectors, among other system components.
  • Camera 130 may be used to capture multiple images of the surrounding environment of vehicle 100 .
  • Camera 130 may be a still camera or a video camera.
  • Control system 106 controls the operation of the vehicle 100 and its components.
  • Control system 106 may include various elements including steering system 132 , throttle 134 , braking unit 136 , sensor fusion algorithms 138 , computer vision system 140 , route control system 142 , and obstacle avoidance system 144 .
  • the steering system 132 is operable to adjust the heading of the vehicle 100 .
  • it may be a steering wheel system.
  • the throttle 134 is used to control the operating speed of the engine 118 and thus the speed of the vehicle 100 .
  • the braking unit 136 is used to control the deceleration of the vehicle 100 .
  • the braking unit 136 may use friction to slow the wheels 121 .
  • the braking unit 136 may convert the kinetic energy of the wheels 121 into electrical current.
  • the braking unit 136 may also take other forms to slow the wheels 121 to control the speed of the vehicle 100.
  • Computer vision system 140 may be operable to process and analyze images captured by camera 130 in order to identify objects and/or features in the environment surrounding vehicle 100 .
  • the objects and/or features may include traffic signals, road boundaries and obstacles.
  • Computer vision system 140 may use object recognition algorithms, structure from motion (SFM) algorithms, video tracking, and other computer vision techniques.
  • SFM structure from motion
  • the computer vision system 140 may be used to map the environment, track objects, estimate the speed of objects, and the like.
  • the route control system 142 is used to determine the travel route of the vehicle 100 .
  • The route control system 142 may combine data from the sensor fusion algorithm 138, the global positioning system (GPS) 122, and one or more predetermined maps to determine the travel route for the vehicle 100.
  • the obstacle avoidance system 144 is used to identify, evaluate, and avoid or otherwise traverse potential obstacles in the environment of the vehicle 100 .
  • Control system 106 may additionally or alternatively include components other than those shown and described, or some of the components shown above may be omitted.
  • Peripherals 108 may include a wireless communication system 146 , an onboard computer 148 , a microphone 150 and/or a speaker 152 .
  • peripherals 108 provide a means for a user of vehicle 100 to interact with user interface 116 .
  • the onboard computer 148 may provide information to the user of the vehicle 100 .
  • User interface 116 may also operate on-board computer 148 to receive user input.
  • the onboard computer 148 can be operated via a touch screen.
  • peripheral devices 108 may provide a means for vehicle 100 to communicate with other devices located within the vehicle.
  • microphone 150 may receive audio (eg, voice commands or other audio input) from a user of vehicle 100 .
  • speakers 152 may output audio to a user of vehicle 100 .
  • Wireless communication system 146 may wirelessly communicate with one or more devices, either directly or via a communication network.
  • Wireless communication system 146 may use 3G cellular communications, such as code division multiple access (CDMA), EVDO, or global system for mobile communications (GSM)/general packet radio service (GPRS), 4G cellular communications such as LTE, or 5G cellular communications.
  • The wireless communication system 146 may utilize wireless fidelity (WiFi) to communicate with a wireless local area network (WLAN).
  • the wireless communication system 146 may communicate directly with the device using an infrared link, Bluetooth, or ZigBee.
  • Other wireless protocols may also be used, such as various vehicle communication systems; for example, the wireless communication system 146 may include one or more dedicated short range communications (DSRC) devices, which may include public and/or private data communications between vehicles and/or roadside stations.
  • the power supply 110 may provide power to various components of the vehicle 100 .
  • the power source 110 may be a rechargeable lithium-ion or lead-acid battery.
  • One or more battery packs of such a battery may be configured as a power source to provide power to various components of the vehicle 100 .
  • power source 110 and energy source 119 may be implemented together, such as in some all-electric vehicles.
  • Computer system 112 may include at least one processor 113 that executes instructions 115 stored in a non-transitory computer-readable medium such as data storage device 114 .
  • Computer system 112 may also be multiple computing devices that control individual components or subsystems of vehicle 100 in a distributed fashion.
  • the processor 113 may be any conventional processor, such as a commercially available central processing unit (CPU). Alternatively, the processor may be a special-purpose device such as an application specific integrated circuit (ASIC) or other hardware-based processor for use in a specific application.
  • Although FIG. 2 functionally illustrates the processor, memory, and other elements of the computer system 112 in the same block, one of ordinary skill in the art will understand that the processor, computer, or memory may actually include multiple processors, computers, or memories that may or may not be stored within the same physical housing.
  • For example, the memory may be a hard drive or other storage medium located in a housing different from that of the computer.
  • reference to a processor or computer will be understood to include reference to a collection of processors or computers or memories that may or may not operate in parallel.
  • Some components, such as the steering and deceleration components, may each have their own processor that only performs computations related to component-specific functions.
  • a processor may be located remotely from the vehicle and in wireless communication with the vehicle. In other aspects, some of the processes described herein are performed on a processor disposed within the vehicle while others are performed by a remote processor, including taking steps necessary to perform a single maneuver.
  • data storage 114 may include instructions 115 (eg, program logic) executable by processor 113 to perform various functions of vehicle 100 , including those described above.
  • Data storage 114 may also contain additional instructions, including instructions to send data to, receive data from, interact with, and/or send control commands to one or more of the travel system 102, the sensor system 104, the control system 106, and the peripherals 108.
  • the data storage device 114 may store data such as road maps, route information, the vehicle's position, direction, speed, and other such vehicle data, among other information. Such information may be used by the vehicle 100 and the computer system 112 during operation of the vehicle 100 in autonomous, semi-autonomous and/or manual modes.
  • a user interface 116 for providing information to or receiving information from a user of the vehicle 100 .
  • the user interface 116 may include one or more input/output devices within the set of peripheral devices 108 , such as a wireless communication system 146 , an onboard computer 148 , a microphone 150 and a speaker 152 .
  • Computer system 112 may control functions of vehicle 100 based on input received from various subsystems (eg, travel system 102 , sensor system 104 , and control system 106 ) and from user interface 116 .
  • Computer system 112 may utilize input from control system 106 in order to control the steering system 132 to avoid obstacles detected by the sensor system 104 and the obstacle avoidance system 144.
  • computer system 112 is operable to provide control of various aspects of vehicle 100 and its subsystems.
  • one or more of these components described above may be installed or associated with the vehicle 100 separately.
  • data storage device 114 may exist partially or completely separate from vehicle 100 .
  • the above-described components may be communicatively coupled together in a wired and/or wireless manner.
  • The above components are just an example.
  • components in each of the above modules may be added or deleted according to actual needs, and FIG. 2 should not be construed as a limitation on the embodiments of the present application.
  • At present, the driving video of a vehicle is obtained from the driving recorder in the vehicle.
  • the video file captured by the driving recorder can be stored locally in the vehicle or uploaded to the cloud space.
  • The video file can be used to record important driving events, such as incident playback, event viewing, or roadside scenery.
  • However, the driving recorder focuses on recording the driving video of the vehicle itself, and what is obtained is the vehicle's own video for a personal use scenario.
  • At present, road equipment, vehicles, terminals and the like may all capture video containing the target vehicle. For multiple videos from multiple sources, the usual video splicing method is direct splicing based on time sequence, which places high requirements on the accuracy of time; otherwise, after multiple splicings, the time of the video becomes more disordered. For a fast-moving car, the deviation caused by inaccurate time is greatly amplified.
  • In the embodiments of the present application, the boundary videos of the video clips can be spliced with high precision according to the position information of the target vehicle in the video, which can improve the accuracy of video splicing and achieve a better video splicing effect.
  • In the embodiments of the present application, a review video similar to that of a "driving recorder" can be provided for the target vehicle based on the videos shot by the equipment around the target vehicle and by the target vehicle itself. Because the spliced video may come from multiple shooting devices, the spliced video obtained in the embodiments of the present application has more open, multi-source angles, which can provide the user of the target vehicle with a more vivid driving video review, and can also be applied to scenarios such as road conditions, monitoring, tracking and security.
  • The vehicles involved in the embodiments of the present application may be cars, trucks, motorcycles, buses, boats, airplanes, helicopters, lawn mowers, recreational vehicles, playground vehicles, construction equipment, trams, golf carts, trains, carts, etc.; the embodiments of the present application are not particularly limited in this respect.
  • The target vehicle described in the embodiments of the present application may be a specific vehicle, a type of vehicle (such as buses, cars, transport vehicles, etc., or color-related classifications, etc.), or multiple classes of vehicles.
  • For example, when the target vehicle is a bus, a spliced video including all buses can be generated.
  • the target vehicle in the embodiment of the present application is not specifically limited.
  • the source video described in this embodiment of the present application may be a video captured by a shooting device (for example, the vehicle itself, vehicles around the vehicle, road equipment, or other equipment for shooting vehicles, etc.).
  • the source video may include video frames, time information for shooting the video frames, location information of vehicles in the video frames, and vehicle identifications, and the like.
  • The time information for shooting a video frame may be a low-precision time marked by the device that shoots the video frame, or may be a time determined according to the network time protocol (NTP); this is not specifically limited in the embodiments of the present application.
  • the time information may be set in the video frame in the form of timestamps.
  • The positioning position information of the vehicle in the video frame may be global positioning system (GPS) information of the vehicle, etc., which is not specifically limited in this embodiment of the present application.
  • The subsequent embodiments take GPS as the positioning position information as an example for illustration; the example is not intended to limit the embodiments of the present application.
  • The positioning position information of the target vehicle is position information confirmed with the assistance of the driving speed of the target vehicle and the steering wheel angle of the target vehicle.
  • For example, the driving speed of the target vehicle and the steering wheel angle of the target vehicle can be obtained based on equipment such as a gyroscope and the steering wheel, so as to assist in determining more accurate position information of the target vehicle (one common realization is sketched below).
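  • The text does not fix how speed and steering wheel angle assist the positioning. One common realization is dead reckoning with a kinematic bicycle model, sketched below under the assumption that the steering wheel angle has already been converted to a front-wheel angle through the steering ratio.

```python
import math

def bicycle_step(x, y, heading, speed, wheel_angle, wheelbase, dt):
    """One dead-reckoning step of a kinematic bicycle model: advance the
    GPS-anchored pose (x, y, heading) using the driving speed and the
    front-wheel angle derived from the steering wheel, which can be used
    to cross-check and refine a noisy GPS fix. wheelbase is in metres."""
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    heading += (speed / wheelbase) * math.tan(wheel_angle) * dt
    return x, y, heading
```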
  • the positioning position information of the target vehicle is position information obtained by performing deviation compensation based on the shooting angle of view of the video frame.
  • different shooting angles of view of the cameras may cause deviations in the positioning position information, and more accurate position information can be obtained by performing deviation compensation based on the shooting angles of view of the video frames.
  • For a fixed camera, the GPS range corresponding to its viewing angle range can be obtained based on existing video preprocessing methods; matching is performed against this GPS range value and, according to the speed and direction of the vehicle, measurement starts from the moment the vehicle enters the viewing angle, so that more specific coordinates can be calculated. For a mobile camera, the camera itself has GPS information, and the position information can be calculated based on the GPS distance difference between the two vehicles.
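  • For the mobile-camera case, a sketch of such deviation compensation is given below, assuming the bearing and range from the camera to the target vehicle are available from detection, and using a flat-earth approximation that is adequate over tens of metres.

```python
import math

def compensate_position(camera_gps, bearing_deg, range_m):
    """Derive the target vehicle's position as the camera's own GPS fix
    plus the estimated offset toward the vehicle (bearing in degrees from
    north, range in metres)."""
    lat, lon = camera_gps
    north = range_m * math.cos(math.radians(bearing_deg))
    east = range_m * math.sin(math.radians(bearing_deg))
    return (lat + north / 111_320.0,  # roughly metres per degree of latitude
            lon + east / (111_320.0 * math.cos(math.radians(lat))))
```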
  • The identification of the vehicle may include one or more of the following: the license plate of the vehicle, the color of the vehicle, the color bar of the vehicle, the owner information of the vehicle, the shape of the vehicle, or other information that can identify the vehicle; this is not specifically limited in the embodiments of the present application.
  • the stage of obtaining the source video in the embodiment of the present application may be understood as a data preparation stage or a video data upload stage.
  • As shown in FIG. 3, the vehicle A itself, other vehicles around the vehicle, etc. can upload a device identification (ID) and two groups of data, one group including "video and time" and the other including "GPS and time"; the video and GPS can be associated through time.
  • The embodiments of the present application may not require high precision for time; for example, the time measurement deviation between different devices may be less than about 5-10 s. This reduces the performance requirements on the devices themselves and makes it possible to obtain rich source videos.
  • The two groups of data in FIG. 3 can also be uploaded together as one group of data, for example, a group including "device identification, video, GPS, and time".
  • the upload method is not specifically limited.
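  • for illustration only, one way such an upload payload could be bundled is sketched below; the JSON field names are hypothetical and chosen only for this example:

```python
import json
import time

def build_upload_payload(device_id, video_uri, video_t0, gps_trace):
    """Bundle the two data groups described above into one upload message.

    gps_trace: list of (unix_time, lat, lon) samples; video and GPS are
    associated only through these coarse timestamps (a 5-10 s skew between
    devices is tolerated, per the low-precision assumption above)."""
    return json.dumps({
        "device_id": device_id,
        "video": {"uri": video_uri, "start_time": video_t0},
        "gps": [{"time": t, "lat": la, "lon": lo} for t, la, lo in gps_trace],
    })

payload = build_upload_payload(
    "vehicle-A", "file:///dvr/0001.mp4", time.time(),
    [(time.time(), 31.2304, 121.4737)],
)
```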
  • the video synthesis request described in the embodiments of the present application may also be referred to as a user request, a request message, or a request information, etc., and the video synthesis request is used for requesting to synthesize relevant videos of the target vehicle.
  • the video synthesis request includes one or more of the following: the identification of the target vehicle, trajectory information, or time information.
  • the identifier of the target vehicle included in the video synthesis request is used to instruct to synthesize a video including the identifier of the target vehicle.
  • the trajectory information contained in the video synthesis request is used to instruct to synthesize the video of the target vehicle at the trajectory corresponding to the trajectory information.
  • the time information in the video synthesis request is used to instruct synthesis of the video of the target vehicle within the time indicated by the time information.
  • the video synthesis request may include either trajectory information or time information, which is not specifically limited in this embodiment of the present application.
  • when receiving the video synthesis request, the device that performs video synthesis may also perform permission verification on the request, for example to verify whether the request was sent by the owner of the target vehicle or by a legal, compliant user. If the request passes verification, the subsequent video synthesis steps can be performed; if it fails verification, the subsequent video synthesis steps can be refused (a stub of such a check is sketched below).
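  • as an illustration of such a check (the patent does not prescribe a mechanism), a stub might look like this; StaticRegistry stands in for a real identity or ownership service:

```python
class StaticRegistry:
    """Stand-in for a real identity/ownership service (hypothetical)."""
    def __init__(self, owners):            # owners: {vehicle_id: requester_id}
        self.owners = owners

    def role_of(self, requester_id, vehicle_id):
        return "owner" if self.owners.get(vehicle_id) == requester_id else "unknown"

def authorise(requester_id, vehicle_id, registry, allowed=("owner",)):
    """Reject the video synthesis request unless it comes from a permitted party."""
    return registry.role_of(requester_id, vehicle_id) in allowed

reg = StaticRegistry({"SH-A12345": "user-42"})
assert authorise("user-42", "SH-A12345", reg)
```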
  • before video synthesis is performed, the source video may also be processed to narrow down the range of video to be considered.
  • for example, if the video synthesis request requests a synthesized video of vehicle A, the trajectory of vehicle A over a certain period (e.g., time 1, GPS1; time 2, GPS2, and so on, as shown in FIG. 4) can be obtained based on vehicle A's identification. Based on this trajectory, a time offset Δt (time difference) is added to each time t of vehicle A and a position offset Δp (position difference) is added to each GPS position p of vehicle A, and the videos recorded by other vehicles or devices within t ± Δt and within p ± Δp are computed as the videos for subsequent splicing.
  • to improve precision, Δt and Δp can be related to the speed of the vehicle, for example inversely proportional to it: Δt = k1/v and Δp = k2/v, where k1 and k2 can be set according to the actual application scenario (a sketch of this screening step follows below).
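  • a minimal sketch of this screening step, under the stated assumption Δt = k1/v and Δp = k2/v; the constant values, the flat-earth distance approximation, and the data layout are illustrative only:

```python
import math

def offsets(speed_mps, k1=600.0, k2=400.0):
    """Return (delta_t seconds, delta_p metres), both inversely proportional
    to speed as suggested above; k1 and k2 are deployment-tuned constants."""
    v = max(speed_mps, 0.1)               # guard against division by zero
    return k1 / v, k2 / v

def metres_between(p, q):
    """Rough equirectangular distance between two (lat, lon) fixes."""
    lat = math.radians((p[0] + q[0]) / 2.0)
    dy = (q[0] - p[0]) * 111_320.0
    dx = (q[1] - p[1]) * 111_320.0 * math.cos(lat)
    return math.hypot(dx, dy)

def select_candidates(track, recordings):
    """track: [(t, (lat, lon), speed_mps)] points of vehicle A's trajectory.
    recordings: [(t, (lat, lon), video_id)] from other vehicles or devices.
    Keeps every video recorded within t±Δt and p±Δp of some track point."""
    keep = set()
    for t, p, v in track:
        dt, dp = offsets(v)
        keep.update(vid for rt, rp, vid in recordings
                    if abs(rt - t) <= dt and metres_between(rp, p) <= dp)
    return keep
```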
  • the video segments described in the embodiments of the present application may also be referred to as valid segments and the like.
  • the video clips can be obtained by processing the source video (or the screened source video described above); for example, the source video can be divided into slices, the slices can be filtered, and the clips containing the target vehicle or related to the target vehicle can be extracted, yielding multiple video clips.
  • the target vehicle can be identified slice by slice in the source video, and a video covering a total of x seconds before and after the time 51 (a reference numeral in FIG. 5) at which the target vehicle is identified can be extracted as one video clip.
  • x is a positive number, and the value of x may be set according to an actual application scenario, which is not specifically limited in this embodiment of the present application.
  • the respective durations of the multiple video clips may also be different, which is not specifically limited in this embodiment of the present application.
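  • a minimal sketch of this clip-cutting rule follows; the merging of nearby detections into one window is an added assumption for illustration:

```python
def extract_clips(detections, x_seconds=10.0, min_gap=1.0):
    """detections: sorted times (seconds, within one source video) at which
    the target vehicle was recognised. Returns (start, end) windows covering
    x seconds around each detection, merging windows that nearly touch."""
    clips = []
    for t in detections:
        start, end = max(0.0, t - x_seconds / 2), t + x_seconds / 2
        if clips and start <= clips[-1][1] + min_gap:
            clips[-1] = (clips[-1][0], max(clips[-1][1], end))   # merge
        else:
            clips.append((start, end))
    return clips

print(extract_clips([12.0, 15.0, 80.0], x_seconds=10.0))
# [(7.0, 20.0), (75.0, 85.0)]
```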
  • the source video may include videos including the target vehicle shot by different devices in the same time period, and there may be duplications in the extracted video clips.
  • for example, if the shooting-time overlap of multiple video clips is greater than a third threshold (this value can be set according to the actual application scenario, which is not specifically limited in this embodiment of the present application), the clips can be regarded as duplicates, and the clip with better video quality among them can be selected as the clip used for subsequent splicing.
  • exemplarily, the video clips whose shooting-time overlap is greater than the third threshold can be quality-scored, and a clip with a higher or the highest quality score can be selected from them according to the scores.
  • quality-scoring the video clips whose shooting-time overlap is greater than the third threshold includes: scoring those clips according to the proportion and centering degree of the target vehicle in them, where the proportion of the target vehicle is the ratio of the size of the target vehicle to the size of the video frame. For example, the larger the proportion of the target vehicle in a video frame, or the more centered it is, the higher the quality score of that video frame can be.
  • the video clips may also be scored according to video-frame quality (such as sharpness) or the time and angle at which the vehicle appears in the frame, which is not specifically limited in this embodiment of the present application.
  • FIG. 6 shows a schematic diagram of scoring a video clip. After the video frames in a clip are extracted, vehicle-related features (such as license-plate features, vehicle features, or other features) can be extracted from each frame and combined with an evaluation algorithm (such as BLINDS, BIQI, or another algorithm) to obtain a quality score for each single frame; the scores of all frames in the clip are then comprehensively weighted to obtain the quality score of the clip (a sketch follows below).
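  • the scoring rule can be illustrated as below. BLINDS and BIQI are only named by the text as example metrics; here iqa_score simply stands in for whatever no-reference quality score is available, and the weights are illustrative assumptions:

```python
def frame_score(box, frame_w, frame_h, iqa_score,
                w_size=0.4, w_center=0.3, w_iqa=0.3):
    """box: (x, y, w, h) of the target vehicle in the frame.
    iqa_score: 0-1 output of a no-reference metric (BIQI/BLINDS stand-in).
    Combines size proportion, centring, and frame quality."""
    x, y, w, h = box
    proportion = (w * h) / (frame_w * frame_h)
    cx, cy = x + w / 2, y + h / 2
    # 1.0 when the box centre coincides with the frame centre, 0.0 at a corner
    off = ((abs(cx - frame_w / 2) / (frame_w / 2)) ** 2 +
           (abs(cy - frame_h / 2) / (frame_h / 2)) ** 2) ** 0.5
    centring = max(0.0, 1.0 - off / 2 ** 0.5)
    return w_size * proportion + w_center * centring + w_iqa * iqa_score

def clip_score(frame_scores):
    """Weighted aggregate over all frames of the clip (equal weights here)."""
    return sum(frame_scores) / len(frame_scores) if frame_scores else 0.0
```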
  • a combination of low-precision splicing and high-precision boundary splicing may be used to achieve a better splicing effect.
  • one implementation of low-precision splicing is to sort the multiple video clips by shooting time; for example, the clips may be arranged in a queue in order of their time information from earliest to latest.
  • if overlap exists among the clips (as above, when judging whether two clips overlap, what is checked is whether the overlap ratio reaches the third threshold, which can be set according to the actual situation, e.g., to any value between 0.5 and 1; if the ratio does not reach the third threshold the clips are not regarded as overlapping, although in practice they may still partially overlap), the overlapping part can be truncated, and the overlapping boundary video can be reserved during truncation to facilitate the subsequent high-precision boundary splicing.
  • FIG. 7 shows a schematic diagram of a low-precision stitching.
  • video clips can be spliced in time order. For overlapping clips, the clip with the higher video score can be retained, and the clip with the lower score can be truncated at the splicing boundary; the repeated boundary video can be reserved during truncation to facilitate the next step of high-precision boundary splicing (one possible reading is sketched below).
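  • one way to read the low-precision stage is sketched here; the dict-based clip records, the 0.7 default threshold, and the one-second reserved boundary are illustrative assumptions, not values from the text:

```python
def low_precision_splice(clips, overlap_threshold=0.7, boundary_s=1.0):
    """clips: [{'id', 'start', 'end', 'score'}]; times in seconds.
    Sorts by time, drops near-duplicates in favour of the better-scored
    clip, and truncates partial overlaps while reserving `boundary_s`
    seconds of overlapping boundary video for the later high-precision
    boundary splice."""
    ordered = sorted(clips, key=lambda c: c["start"])
    timeline = []
    for clip in ordered:
        if not timeline:
            timeline.append(dict(clip))
            continue
        prev = timeline[-1]
        overlap = max(0.0, min(prev["end"], clip["end"]) - clip["start"])
        shorter = max(min(prev["end"] - prev["start"],
                          clip["end"] - clip["start"]), 1e-9)
        if overlap / shorter > overlap_threshold and clip["score"] <= prev["score"]:
            continue                       # duplicate: keep the better clip
        if overlap > 0.0:                  # partial overlap: truncate loser,
            if clip["score"] >= prev["score"]:        # reserving a margin
                prev["end"] = min(prev["end"], clip["start"] + boundary_s)
            else:
                clip = dict(clip, start=max(clip["start"], prev["end"] - boundary_s))
        timeline.append(dict(clip))
    return timeline
```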
  • one implementation of high-precision boundary splicing is: based on the positioning position information of the target vehicle in the video frames, high-precision splicing is performed at the boundaries of the video clips.
  • for example, taking multiple video clips sorted in time order from earliest to latest: for a first video clip and a second video clip that are adjacent in time, the multi-frame video at the end position of the first video clip can be selected as the first boundary video, and the multi-frame video at the starting position of the second video clip can be selected as the second boundary video; then, according to the position information of the target vehicle in the first boundary video, the position of the target vehicle in the second boundary video is predicted, the video frame corresponding to that position is used as the spliced video frame, and the first video clip and the second video clip are spliced at the spliced video frame.
  • when predicting the position of the target vehicle in the second boundary video, first position information can be predicted from the positioning position information of the target vehicle in multiple video frames of the first boundary video, and the video frame with the smallest distance from the first position information can be selected from the second boundary video.
  • FIG. 8 shows a schematic diagram of predicting the first position information according to an embodiment of the present application.
  • at least three video frames may be determined in the first boundary video (each circle in FIG. 8 may represent a video frame), the at least three video frames including the last video frame of the first boundary video (the black-filled circle in video A in FIG. 8); according to the shooting times of the at least three video frames and the positioning position information of the target vehicle in them, the speed information and direction information of the target vehicle are calculated (for example, the speed can be computed from distance and time, and the direction can be obtained from the change of the positioning position); the speed information and the direction information can then be used to predict the first position information.
  • for example, the first position information is obtained by adding a first value, in the direction indicated by the direction information, to the positioning position information of the target vehicle in the last video frame of the first boundary video; the first value is negatively correlated with the speed information (determination of the first value can also be assisted by a high-precision map, for example according to conditions such as the roads in the map).
  • the video frame with the smallest distance from the first position information (the circle filled with black in the video B in FIG. 8 ) may be selected from the second boundary video as the spliced video frame.
  • after the spliced video frame is obtained, the second boundary video can be joined to the first boundary video at the spliced video frame (or possibly at a few frames adjacent to it). Because the position of a vehicle usually does not change abruptly, predicting the spliced video frame from the positioning position information yields a more accurate splicing position, and a better splicing effect can be obtained on that basis.
  • the at least three video frames may be several consecutive video frames, or may be several discontinuous video frames extracted from the video segment.
  • for example, the frames can be chosen so that the distance between the vehicle positions corresponding to any two of them is greater than 30 cm, e.g., kept at 50 cm to 5 m; this distance can be dynamically adjusted according to the vehicle's speed, so that accurate direction information and position information can be obtained from the three video frames (a sketch of the whole prediction step follows below).
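  • the boundary prediction just described can be summarised in a short sketch. This is a minimal illustration under stated assumptions: fixes are raw (time, lat, lon) tuples, speed is computed in degrees per second rather than metres per second, and the constant k and the k/(1+speed) form are placeholders for the negative correlation between the first value and the speed information; none of these names come from the source:

```python
def estimate_motion(fixes):
    """fixes: at least three (t, lat, lon) samples of the target vehicle from
    the first boundary video, the last one taken from its final frame."""
    (t0, la0, lo0), (t1, la1, lo1) = fixes[0], fixes[-1]
    dt = max(t1 - t0, 1e-6)
    v_lat, v_lon = (la1 - la0) / dt, (lo1 - lo0) / dt    # degrees per second
    speed = (v_lat ** 2 + v_lon ** 2) ** 0.5
    heading = (v_lat / speed, v_lon / speed) if speed > 0 else (0.0, 0.0)
    return speed, heading

def predict_first_position(fixes, k=1e-4):
    """Advance the last fix by a 'first value' that shrinks as speed grows,
    i.e. the negative correlation described above; k is a tuning constant."""
    speed, heading = estimate_motion(fixes)
    first_value = k / (1.0 + speed)
    t, lat, lon = fixes[-1]
    return lat + heading[0] * first_value, lon + heading[1] * first_value

def pick_splice_frame(predicted, second_boundary):
    """second_boundary: (frame_index, lat, lon) per frame of the second
    boundary video; returns the index of the frame whose target-vehicle fix
    is closest to the predicted position."""
    return min(second_boundary,
               key=lambda f: (f[1] - predicted[0]) ** 2 +
                             (f[2] - predicted[1]) ** 2)[0]
```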
  • in the case where the shooting-time interval between adjacent video clips among the multiple video clips is greater than a fourth threshold, a preset video can be inserted between the adjacent clips.
  • the preset video may be, for example, a video recorded by the host vehicle's driving recorder, a landscape video, a map video, a road video, or any other possible video, so that a relatively smooth spliced video can be obtained (a small sketch of this gap-filling step follows).
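  • a small sketch of the gap-filling rule; the three-second threshold and the clip-record layout are illustrative assumptions:

```python
def fill_gaps(timeline, preset_clip_id="preset-scenery", gap_threshold_s=3.0):
    """Insert a preset clip wherever adjacent clips are separated by more
    than the fourth threshold, keeping the spliced video coherent.
    timeline: time-ordered [{'id', 'start', 'end', ...}] records."""
    out = [timeline[0]]
    for clip in timeline[1:]:
        if clip["start"] - out[-1]["end"] > gap_threshold_s:
            out.append({"id": preset_clip_id,
                        "start": out[-1]["end"], "end": clip["start"]})
        out.append(clip)
    return out
```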
  • the trajectory video of the target vehicle can be obtained.
  • based on the GPS information of the video clips, the videos can be laid out on a map to form the video track of the target vehicle.
  • FIG. 10 is a schematic flowchart of a video splicing method provided by an embodiment of the application. As shown in FIG. 10 , the method includes:
  • S101: Acquire multiple video clips related to the target vehicle.
  • S102: Splice the first video clip and the second video clip according to the spliced video frame; where the first video clip and the second video clip are two temporally adjacent video clips among the multiple video clips, the spliced video frame is a video frame for video splicing determined in the second boundary video according to the positioning position information of the target vehicle in the first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the starting position of the second video clip (a sketch tying the two steps together follows).
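  • to show how S101 and S102 connect, the following sketch chains the illustrative helpers defined in the earlier code fragments (low_precision_splice, predict_first_position, pick_splice_frame); the clip record fields head_fixes and tail_fixes are hypothetical names for the boundary-video samples:

```python
def splice_plan(clips):
    """clips: S101 output; each {'id', 'start', 'end', 'score',
    'tail_fixes': [(t, lat, lon)], 'head_fixes': [(frame_idx, lat, lon)]}.
    Returns (clip_id, entry_frame) pairs: the frame at which playback of
    each clip should enter after the high-precision boundary splice."""
    timeline = low_precision_splice(clips)   # coarse, time-ordered pass
    plan, entry = [], 0                      # first clip starts at frame 0
    for left, right in zip(timeline, timeline[1:]):
        predicted = predict_first_position(left["tail_fixes"])
        plan.append((left["id"], entry))
        entry = pick_splice_frame(predicted, right["head_fixes"])
    if timeline:
        plan.append((timeline[-1]["id"], entry))
    return plan
```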
  • to implement the above functions, each of the above devices includes a corresponding hardware structure and/or software unit for executing each function.
  • the present application can be implemented in hardware or a combination of hardware and computer software with the units and algorithm steps of each example described in conjunction with the embodiments disclosed herein. Whether a function is performed by hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of this application.
  • an embodiment of the present application is a device for video splicing, and the device for video splicing includes a processor 1100, a memory 1101, and a transceiver 1102;
  • the processor 1100 is responsible for managing the bus architecture and general processing, and the memory 1101 may store data used by the processor 1100 when performing operations.
  • the transceiver 1102 is configured to receive and send data under the control of the processor 1100 and to perform data communication with the memory 1101.
  • the bus architecture may include any number of interconnected buses and bridges, specifically linking together various circuits of one or more processors represented by the processor 1100 and of the memory represented by the memory 1101.
  • the bus architecture may also link together various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and, therefore, will not be described further herein.
  • the bus interface provides the interface.
  • each step of the video splicing process can be completed by an integrated logic circuit of hardware in the processor 1100 or an instruction in the form of software.
  • the processor 1100 may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application.
  • a general purpose processor may be a microprocessor or any conventional processor or the like.
  • the steps of the methods disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
  • the software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
  • the storage medium is located in the memory 1101, and the processor 1100 reads the information in the memory 1101, and completes the steps of the signal processing flow in combination with its hardware.
  • the processor 1100 is configured to read the program in the memory 1101 and execute the method flow in S101-S102 shown in FIG. 10 .
  • the present application provides an apparatus for video splicing, and the apparatus includes a transceiver module 1200 and a processing module 1201 .
  • the transceiver module 1200 is configured to acquire multiple video clips related to the target vehicle.
  • the processing module 1201 is used for splicing the first video clip and the second video clip according to the spliced video frame; where the first video clip and the second video clip are two temporally adjacent video clips among the multiple video clips, the spliced video frame is a video frame for video splicing determined in the second boundary video according to the positioning position information of the target vehicle in the first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the starting position of the second video clip.
  • when multiple video clips are spliced, the boundary videos of the clips can be spliced with high precision according to the position information of the target vehicle, improving the accuracy of video splicing and yielding a better splicing effect.
  • a video clip includes: the shooting times of the video frames in the clip, and the positioning position information of the target vehicle in the video frames; the spliced video frame is specifically obtained by predicting first position information from the positioning position information of the target vehicle in multiple video frames of the first boundary video, and selecting from the second boundary video the video frame with the smallest distance from the first position information.
  • in this way, the video frame closest to the predicted position of the target vehicle can be selected as the spliced video frame; because the position of the target vehicle usually does not change abruptly, using the frame closest to the predicted position yields a more accurate spliced video frame, so that a better splicing effect can be achieved.
  • the processing module is specifically configured to determine at least three video frames in the first boundary video, the at least three video frames including the last video frame of the first boundary video, and to calculate the speed information and direction information of the target vehicle according to the shooting times of the at least three video frames and the positioning position information of the target vehicle in them; the speed information and the direction information are used to predict the first position information. In this way, relatively accurate first position information can be predicted based on the speed and direction of the target vehicle, so that an accurate spliced video frame can subsequently be selected from the second boundary video.
  • the first position information is obtained by adding a first value, in the direction indicated by the direction information, to the positioning position information of the target vehicle in the last video frame of the first boundary video; the first value is negatively correlated with the speed information. Because the higher the speed of the target vehicle, the more its position changes within a given time, a large first value easily makes the prediction inaccurate; setting the first value to be negatively correlated with the speed information therefore yields more accurate first position information, so that an accurate spliced video frame can subsequently be selected from the second boundary video.
  • the positioning position information of the target vehicle includes: position information whose confirmation is assisted by the driving speed of the target vehicle and the steering-wheel angle of the target vehicle. In this way, the steering-wheel angle helps determine more accurate position information for the target vehicle.
  • the positioning position information of the target vehicle includes: position information obtained by performing deviation compensation based on the shooting angle of view of the video frame. In this way, more accurate position information can be obtained by performing deviation compensation based on the shooting angle of view of the video frame.
  • the transceiver module is specifically configured to receive a video synthesis request, the request including the identification of the target vehicle, trajectory information, and time information; to acquire a source video that is related to the trajectory information and the time information and contains the target vehicle; and to determine, in the source video, multiple video clips related to the target vehicle. In this way, the relevant video of the target vehicle can be synthesized based on the user's request, which better matches the user's needs.
  • the transceiver module is specifically configured to acquire a source video whose position difference from the trajectory information is within a first threshold range, whose time difference from the time information is within a second threshold range, and which contains the target vehicle. In this way, relatively effective source video data for the target vehicle can be obtained, avoiding the waste of subsequent computing resources caused by useless video in the source data.
  • the first threshold range is negatively correlated with the speed of the target vehicle, and/or the second threshold range is negatively correlated with the speed of the target vehicle.
  • the source video is a video shot by other vehicles and/or a video shot by road equipment.
  • a multi-angle video of the target vehicle can be obtained, so that a richer splicing material can be obtained.
  • the processing module is further configured to filter the source video to obtain multiple first video clips containing the target vehicle; when, among the multiple first video clips, there are video clips whose shooting-time overlap ratio is greater than the third threshold, to quality-score those clips; and to select one video clip from them according to the quality scores. In this way, a clip with better quality can be used as the splicing clip, yielding a better-quality spliced video.
  • the processing module is specifically configured to quality-score the video clips whose shooting-time overlap is greater than the third threshold according to the proportion and centering degree of the target vehicle in those clips, where the proportion of the target vehicle is the ratio of the size of the target vehicle to the size of the video frame.
  • the multiple video clips are sorted according to their shooting times; when the shooting-time interval between adjacent clips among the multiple clips is greater than the fourth threshold, a preset video is inserted between the adjacent clips. In this way, when two clips with a large time gap are spliced, a preset video can be inserted to avoid an excessive jump at the scene transition and to increase the coherence of the spliced video.
  • the functions of the transceiver module 1200 and the processing module 1201 shown in FIG. 12 may be executed by the processor 1100 running a program in the memory 1101 , or executed by the processor 1100 alone.
  • the present application provides a vehicle, which includes at least one camera 1301, at least one memory 1302, at least one transceiver 1303, and at least one processor 1304;
  • the camera 1301 is used to acquire video clips.
  • the memory 1302 is used to store one or more programs and data information; wherein the one or more programs include instructions.
  • the transceiver 1303 is used for data transmission with the communication device in the vehicle and data transmission with the cloud, so as to acquire multiple video clips related to the target vehicle.
  • the processor 1304 is configured to splice the first video clip and the second video clip according to the spliced video frame; where the first video clip and the second video clip are two temporally adjacent video clips among the multiple video clips, the spliced video frame is a video frame for video splicing determined in the second boundary video according to the positioning position information of the target vehicle in the first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the starting position of the second video clip.
  • the vehicle further includes a display screen 1305 and a voice broadcast apparatus 1306.
  • the display screen 1305 is used to display the spliced video.
  • the voice broadcast device 1306 is used to broadcast the audio of the spliced video.
  • various aspects of the video splicing method provided by the embodiments of the present application may also be implemented in the form of a program product including program code; when the program code runs on a computer device, the program code causes the computer device to execute the steps of the video splicing methods according to the various exemplary embodiments of the present application described in this specification.
  • the program product may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a program product for video splicing may employ a portable compact disc read-only memory (CD-ROM), include program code, and run on a server device. However, the program product of the present application is not limited thereto; in this document, a readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with a communication transmission, apparatus, or device.
  • a readable signal medium may include a propagated data signal in baseband or as part of a carrier wave, carrying readable program code therein. Such propagated data signals may take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a readable medium may be transmitted using any suitable medium including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • program code for carrying out the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server execute on.
  • the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device.
  • for the video splicing method, the embodiments of the present application further provide a computing-device-readable storage medium, i.e., one whose content is not lost after power-off. The storage medium stores a software program including program code; when the program code runs on a computing device and is read and executed by one or more processors, the software program can implement any of the video splicing schemes of the above embodiments of the present application.
  • the embodiment of the present application also provides an electronic device.
  • the electronic device includes: a processing module configured to support the video splicing apparatus in performing the steps of the above embodiments, for example performing the operations of S101 to S102, or other processes of the techniques described in the embodiments of this application.
  • the video splicing device includes but is not limited to the unit modules listed above.
  • the specific functions that the above functional units can implement include but are not limited to the functions corresponding to the method steps described in the above examples; for detailed descriptions of other units of the electronic device, refer to the detailed descriptions of the corresponding method steps, which are not repeated here in the embodiments of this application.
  • the electronic device involved in the above embodiments may include: a processing module, a storage module and a communication module.
  • the storage module is used to save the program codes and data of the electronic device.
  • the communication module is used to support the communication between the electronic device and other network entities, so as to realize the functions of the electronic device's call, data interaction, Internet access and so on.
  • the processing module is used to control and manage the actions of the electronic device.
  • the processing module may be a processor or a controller.
  • the communication module may be a transceiver, an RF circuit or a communication interface or the like.
  • the storage module may be a memory.
  • the electronic device may further include an input module and a display module.
  • the display module can be a screen or a display.
  • the input module can be a touch screen, a voice input device, or a fingerprint sensor.
  • the present application may also be implemented in hardware and/or software (including firmware, resident software, microcode, etc.). Further, the present application may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium, for use by or in conjunction with an instruction execution system.
  • a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.

Abstract

This application provides a method, apparatus and system for video splicing, relating to the field of data processing technology. The method includes: acquiring multiple video clips related to a target vehicle; and splicing a first video clip and a second video clip according to a spliced video frame, where the first video clip and the second video clip are two temporally adjacent clips among the multiple video clips, the spliced video frame is a video frame for video splicing determined in a second boundary video according to positioning position information of the target vehicle in a first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the starting position of the second video clip. When splicing multiple video clips, the embodiments of this application can perform high-precision splicing of the boundary videos of the clips according to the position information of the target vehicle, improving the accuracy of video splicing and yielding a better splicing effect.

Description

Method, apparatus and system for video splicing
Technical field
This application relates to the field of image processing technology, and in particular to a method, apparatus and system for video splicing.
Background
In image processing, video splicing is a commonly used processing technique.
Typically, video splicing joins multiple video segments in time order. For example, multiple videos can be spliced into a complete chronological video according to their shooting times.
However, time-order-based splicing demands high time accuracy, while the times at which videos are acquired are usually not precise enough, so the resulting splicing effect is poor.
Summary
This application provides a method, apparatus and system for video splicing, to improve the accuracy of video splicing and obtain a better splicing effect.
It should be understood that the video splicing method provided in the embodiments of this application may be executed by a video splicing system.
In a possible implementation, the video splicing system includes a video clip acquisition unit and a splicing unit.
The video clip acquisition unit is configured to acquire multiple video clips related to a target vehicle.
The splicing unit is configured to splice a first video clip and a second video clip according to a spliced video frame, where the first video clip and the second video clip are two temporally adjacent clips among the multiple video clips, the spliced video frame is a video frame for video splicing determined in a second boundary video according to positioning position information of the target vehicle in a first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the starting position of the second video clip.
It should be noted that the video splicing system in the embodiments of this application may be a single apparatus with a video splicing function, or a combination of at least two apparatuses that together form a system with the video splicing function. When the system is a combination of at least two apparatuses, the apparatuses may communicate with each other through Bluetooth, a wired connection, or wireless transmission.
The video splicing system in the embodiments of this application may be installed on a mobile device, for example in a vehicle, for that vehicle to splice videos of itself and obtain a well-spliced video. Besides mobile devices, the system may also be installed on fixed equipment, for example a road side unit (RSU) or a server, to splice videos related to a target vehicle and obtain a well-spliced video.
According to a first aspect, an embodiment of this application provides a video splicing method, including: acquiring multiple video clips related to a target vehicle; and splicing a first video clip and a second video clip according to a spliced video frame, where the first video clip and the second video clip are two temporally adjacent clips among the multiple video clips, the spliced video frame is a video frame for video splicing determined in a second boundary video according to positioning position information of the target vehicle in a first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the starting position of the second video clip.
According to a second aspect, an embodiment of this application provides a video splicing method, including: receiving multiple video clips related to a target vehicle; acquiring a first video clip and a second video clip that are temporally adjacent among the multiple video clips; and splicing the first video clip and the second video clip according to a spliced video frame, where the spliced video frame is a video frame for video splicing determined in a second boundary video according to positioning position information of the target vehicle in a first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the starting position of the second video clip.
On the basis of the video splicing methods of the first and second aspects, the following implementations are possible.
In a possible implementation, a video clip includes the shooting times of the video frames in the clip and the positioning position information of the target vehicle in those frames; the spliced video frame is specifically obtained by predicting first position information from the positioning position information of the target vehicle in multiple video frames of the first boundary video, and selecting from the second boundary video the video frame with the smallest distance from the first position information. In this way, the video frame closest to the predicted position of the target vehicle can be selected as the spliced video frame; because the position of the target vehicle usually does not change abruptly, this yields a fairly accurate spliced video frame and thus a good splicing effect.
In a possible implementation, predicting the first position information from the positioning position information of the target vehicle in multiple video frames of the first boundary video includes: determining at least three video frames in the first boundary video, including the last video frame of the first boundary video; and calculating speed information and direction information of the target vehicle from the shooting times of the at least three frames and the positioning position information of the target vehicle in them, the speed and direction information being used to predict the first position information. In this way, relatively accurate first position information can be predicted based on the speed and direction of the target vehicle, so that an accurate spliced video frame can subsequently be selected from the second boundary video.
In a possible implementation, the first position information is obtained by adding a first value, in the direction indicated by the direction information, to the positioning position information of the target vehicle in the last video frame of the first boundary video; the first value is negatively correlated with the speed information. Because the higher the speed of the target vehicle, the more its position changes within a given time, a large first value easily makes the prediction inaccurate; setting the first value to be negatively correlated with the speed information therefore yields more accurate first position information, so that an accurate spliced video frame can subsequently be selected from the second boundary video.
In a possible implementation, the positioning position information of the target vehicle includes position information whose confirmation is assisted by the driving speed of the target vehicle and the steering-wheel angle of the target vehicle. In this way, the steering-wheel angle helps determine more accurate position information for the target vehicle.
In a possible implementation, the positioning position information of the target vehicle includes position information obtained by performing deviation compensation based on the shooting angle of view of the video frame. In this way, deviation compensation based on the shooting angle of view yields more accurate position information.
In a possible implementation, acquiring multiple video clips related to the target vehicle includes: receiving a video synthesis request, the request including an identification of the target vehicle, trajectory information, and time information; acquiring a source video that is related to the trajectory information and the time information and contains the target vehicle; and determining, in the source video, multiple video clips related to the target vehicle. In this way, the relevant video of the target vehicle can be synthesized based on the user's request, better matching the user's needs.
In a possible implementation, acquiring the source video that is related to the trajectory information and the time information and contains the target vehicle includes: acquiring a source video whose position difference from the trajectory information is within a first threshold range, whose time difference from the time information is within a second threshold range, and which contains the target vehicle. In this way, relatively effective source video data for the target vehicle can be obtained, avoiding the waste of subsequent computing resources caused by useless video in the source data.
In a possible implementation, the first threshold range is negatively correlated with the speed of the target vehicle, and/or the second threshold range is negatively correlated with the speed of the target vehicle.
In a possible implementation, the source video includes videos shot by other vehicles and/or videos shot by road equipment. In this way, multi-angle videos of the target vehicle can be obtained, providing richer splicing material.
In a possible implementation, determining the multiple video clips in the source video includes: filtering the source video to obtain multiple first video clips containing the target vehicle; when, among the multiple first video clips, there are clips whose shooting-time overlap ratio is greater than a third threshold, quality-scoring those clips; and selecting one clip from them according to the quality scores. In this way, a better-quality clip can be used as the splicing clip, yielding a better-quality spliced video.
In a possible implementation, quality-scoring the clips whose shooting-time overlap is greater than the third threshold includes: scoring them according to the proportion and centering degree of the target vehicle in those clips, where the proportion of the target vehicle is the ratio of the size of the target vehicle to the size of the video frame.
In a possible implementation, the multiple video clips are sorted according to their shooting times; when the shooting-time interval between adjacent clips is greater than a fourth threshold, a preset video is inserted between the adjacent clips. In this way, when two clips with a large time gap are spliced, a preset video can be inserted to avoid an excessive jump at the scene transition and increase the coherence of the spliced video.
It should be noted that the methods of the embodiments of this application may be executed locally or in the cloud; this is not specifically limited by the embodiments of this application.
According to a third aspect, an embodiment of this application further provides a video splicing apparatus, which may be configured to perform the operations of the first aspect, the second aspect, or any possible implementation above. For example, the apparatus may include modules or units for performing each of those operations, such as a transceiver module and a processing module.
Exemplarily, the transceiver module is configured to acquire multiple video clips related to a target vehicle, and the processing module is configured to splice a first video clip and a second video clip according to a spliced video frame, where the first video clip and the second video clip are two temporally adjacent clips among the multiple video clips, the spliced video frame is a video frame for video splicing determined in a second boundary video according to the positioning position information of the target vehicle in a first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the starting position of the second video clip. In the embodiments of this application, when multiple video clips are spliced, high-precision splicing can be performed on the boundary videos of the clips according to the position information of the target vehicle, improving the accuracy of video splicing and yielding a better splicing effect.
Exemplarily, the transceiver module is configured to acquire multiple video clips related to a target vehicle, and the processing module is configured to acquire a first video clip and a second video clip that are temporally adjacent among the multiple video clips and to splice them according to a spliced video frame, with the spliced video frame, the first boundary video, and the second boundary video as defined in the first aspect.
In a possible implementation, a video clip includes the shooting times of the video frames in the clip and the positioning position information of the target vehicle in those frames; the spliced video frame is specifically obtained by predicting first position information from the positioning position information of the target vehicle in multiple video frames of the first boundary video, and selecting from the second boundary video the video frame with the smallest distance from the first position information. In this way, the video frame closest to the predicted position of the target vehicle can be selected as the spliced video frame; because the position of the target vehicle usually does not change abruptly, this yields a fairly accurate spliced video frame and thus a good splicing effect.
In a possible implementation, the processing module is specifically configured to determine at least three video frames in the first boundary video, including the last video frame of the first boundary video, and to calculate speed information and direction information of the target vehicle from the shooting times of the at least three frames and the positioning position information of the target vehicle in them; the speed and direction information are used to predict the first position information. In this way, relatively accurate first position information can be predicted based on the speed and direction of the target vehicle, so that an accurate spliced video frame can subsequently be selected from the second boundary video.
In a possible implementation, the first position information is obtained by adding a first value, in the direction indicated by the direction information, to the positioning position information of the target vehicle in the last video frame of the first boundary video; the first value is negatively correlated with the speed information. Because the higher the speed of the target vehicle, the more its position changes within a given time, a large first value easily makes the prediction inaccurate; setting the first value to be negatively correlated with the speed information therefore yields more accurate first position information, so that an accurate spliced video frame can subsequently be selected from the second boundary video.
In a possible implementation, the positioning position information of the target vehicle includes position information whose confirmation is assisted by the driving speed of the target vehicle and the steering-wheel angle of the target vehicle. In this way, the steering-wheel angle helps determine more accurate position information for the target vehicle.
In a possible implementation, the positioning position information of the target vehicle includes position information obtained by performing deviation compensation based on the shooting angle of view of the video frame. In this way, deviation compensation based on the shooting angle of view yields more accurate position information.
In a possible implementation, the transceiver module is specifically configured to receive a video synthesis request, the request including the identification of the target vehicle, trajectory information, and time information; to acquire a source video that is related to the trajectory information and the time information and contains the target vehicle; and to determine, in the source video, multiple video clips related to the target vehicle. In this way, the relevant video of the target vehicle can be synthesized based on the user's request, better matching the user's needs.
In a possible implementation, the transceiver module is specifically configured to acquire a source video whose position difference from the trajectory information is within a first threshold range, whose time difference from the time information is within a second threshold range, and which contains the target vehicle. In this way, relatively effective source video data for the target vehicle can be obtained, avoiding the waste of subsequent computing resources caused by useless video in the source data.
In a possible implementation, the first threshold range is negatively correlated with the speed of the target vehicle, and/or the second threshold range is negatively correlated with the speed of the target vehicle.
In a possible implementation, the source video includes videos shot by other vehicles and/or videos shot by road equipment. In this way, multi-angle videos of the target vehicle can be obtained, providing richer splicing material.
In a possible implementation, the processing module is further configured to filter the source video to obtain multiple first video clips containing the target vehicle; when, among the multiple first video clips, there are clips whose shooting-time overlap ratio is greater than a third threshold, to quality-score those clips; and to select one clip from them according to the quality scores. In this way, a better-quality clip can be used as the splicing clip, yielding a better-quality spliced video.
In a possible implementation, the processing module is specifically configured to quality-score the clips whose shooting-time overlap is greater than the third threshold according to the proportion and centering degree of the target vehicle in those clips, where the proportion of the target vehicle is the ratio of the size of the target vehicle to the size of the video frame.
In a possible implementation, the multiple video clips are sorted according to their shooting times; when the shooting-time interval between adjacent clips is greater than a fourth threshold, a preset video is inserted between the adjacent clips. In this way, when two clips with a large time gap are spliced, a preset video can be inserted to avoid an excessive jump at the scene transition and increase the coherence of the spliced video.
According to a fourth aspect, an embodiment of this application provides a chip system, including a processor and optionally a memory, where the memory is configured to store a computer program and the processor is configured to call and run the computer program from the memory, so that a video splicing apparatus equipped with the chip system performs any method of the first aspect, the second aspect, or any possible implementation of the first aspect.
According to a fifth aspect, an embodiment of this application provides a vehicle, including at least one camera, at least one memory, at least one transceiver, and at least one processor.
The camera is configured to acquire video clips; the memory is configured to store one or more programs and data information, where the one or more programs include instructions; the transceiver is configured for data transmission with communication devices in the vehicle and with the cloud, to acquire multiple video clips related to a target vehicle; and the processor is configured to splice a first video clip and a second video clip according to a spliced video frame, where the first video clip and the second video clip are two temporally adjacent clips among the multiple video clips, the spliced video frame is a video frame for video splicing determined in a second boundary video according to the positioning position information of the target vehicle in a first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the starting position of the second video clip.
In a possible implementation, the vehicle further includes a display screen and a voice broadcast apparatus; the display screen is configured to display the spliced video, and the voice broadcast apparatus is configured to broadcast the audio of the spliced video.
The transceiver and the processor in this embodiment may further perform the steps corresponding to the transceiver module and the processing module in any possible implementation of the third aspect; refer to the description of the third aspect for details, which are not repeated here.
The camera described in the embodiments of this application may be a driver monitoring system camera, a cockpit camera, an infrared camera, a driving recorder (i.e., a video recording terminal), a reversing camera, etc.; this is not limited in the embodiments of this application.
The shooting area of the camera may be the external or internal environment of the vehicle. For example, when the vehicle moves forward, the shooting area is the area ahead of the vehicle; when the vehicle reverses, the shooting area is the area behind it; when the camera is a 360-degree multi-angle camera, the shooting area may be the 360-degree area around the vehicle; and when the camera is installed inside the vehicle, the shooting area may be the in-vehicle area.
According to a sixth aspect, an embodiment of this application provides a computer program product including computer program code which, when run by a communication module, a processing module, a transceiver, or a processor of a video splicing apparatus, causes the apparatus to perform any method of the first aspect, the second aspect, or any possible implementation of the first aspect.
According to a seventh aspect, an embodiment of this application provides a computer-readable storage medium storing a program that causes a video splicing apparatus to perform any method of the first aspect, the second aspect, or any possible implementation of the first aspect.
It should be understood that the second to seventh aspects of the embodiments of this application correspond to the technical solution of the first aspect; the beneficial effects achieved by the aspects and their feasible implementations are similar and are not repeated here.
Brief description of the drawings
FIG. 1 is a schematic diagram of an application scenario provided by an embodiment of this application;
FIG. 2 is a functional block diagram of a vehicle 100 provided by an embodiment of this application;
FIG. 3 is a schematic diagram of data preparation provided by an embodiment of this application;
FIG. 4 is a schematic diagram of video acquisition provided by an embodiment of this application;
FIG. 5 is a schematic diagram of video clip selection provided by an embodiment of this application;
FIG. 6 is a schematic diagram of video scoring provided by an embodiment of this application;
FIG. 7 is a schematic diagram of low-precision splicing provided by an embodiment of this application;
FIG. 8 is a schematic diagram of spliced-video-frame determination provided by an embodiment of this application;
FIG. 9 is a schematic diagram of a vehicle video track provided by an embodiment of this application;
FIG. 10 is a schematic flowchart of a video splicing method provided by an embodiment of this application;
FIG. 11 is a schematic structural diagram of a video splicing apparatus provided by an embodiment of this application;
FIG. 12 is a schematic structural diagram of another video splicing apparatus provided by an embodiment of this application;
FIG. 13 is a schematic structural diagram of a vehicle provided by an embodiment of this application.
Description of embodiments
To describe the technical solutions of the embodiments of this application clearly, words such as "first" and "second" are used to distinguish identical or similar items with substantially the same functions and effects. For example, the first video clip and the second video clip are distinguished merely to tell different clips apart, without limiting their order. Those skilled in the art will understand that words such as "first" and "second" do not limit quantity or execution order, and do not necessarily indicate difference.
It should be noted that in this application, words such as "exemplary" or "for example" indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" should not be construed as preferred or more advantageous than other embodiments or designs; rather, such words are intended to present related concepts in a concrete manner.
In this application, "at least one" means one or more, and "multiple" means two or more. "And/or" describes an association relation between associated objects and indicates three possible relations; for example, "A and/or B" may indicate: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relation between the associated objects. "At least one of the following" or similar expressions refer to any combination of the listed items, including any combination of single or plural items; for example, at least one of a, b, or c may indicate a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple.
The video splicing method, apparatus and system of the embodiments of this application can be applied to scenarios such as multi-source video splicing for automobiles. Exemplarily, they can be applied to the scenario shown in FIG. 1.
As shown in FIG. 1, the video synthesis method of the embodiments of this application can be applied to a video synthesis device, which may be a cloud server, an in-vehicle device, a vehicle, etc.
In a possible application scenario, while a vehicle travels on a road it can record its own driving video and report it to a data center (e.g., a cloud server communicating with the vehicle, a data processing apparatus inside the vehicle, or a device for storing video data). The data center may also receive videos shot by other vehicles around that vehicle (such as the two surrounding vehicles in FIG. 1) and videos shot by video capture devices on the road (such as cameras installed along the road).
For example, a user may send a video synthesis request about a target vehicle to the video synthesis device through a terminal device; the video synthesis device can obtain from the data center the multi-source videos related to the target vehicle (understood as videos from different shooting devices), splice them according to the video splicing method of the embodiments of this application, and thereby obtain the driving video of the target vehicle.
For example, a user may send a video synthesis request about a target vehicle to the video synthesis device through the in-vehicle device of a vehicle; the video synthesis device can obtain the multi-source videos related to the target vehicle from the data center and splice them according to the video splicing method of the embodiments of this application to obtain the driving video of the target vehicle.
In a possible implementation, the spliced driving video of the target vehicle may be stored in a storage device, sent to the terminal device or in-vehicle device that requested it, or displayed on a display interface.
In a possible implementation, there may be one or more data centers, or no data center may be provided; after capturing video data, vehicles or road video capture devices may send it to the video synthesis device in real time, randomly, or periodically.
Of course, the video splicing method, apparatus and system provided by the embodiments of this application can also be applied to other scenarios, which is not limited in the embodiments of this application.
FIG. 2 is a functional block diagram of a vehicle 100 provided by an embodiment of this application. In one embodiment, the vehicle 100 is configured in a fully or partially autonomous driving mode, or as a vehicle with shooting and communication capabilities. For example, when the vehicle 100 is configured in a partially autonomous mode, it may, while in that mode, also determine through human operation the current state of the vehicle and its surroundings, determine the possible behavior of at least one other vehicle in the surroundings and the confidence level corresponding to the likelihood of that behavior, and control the vehicle 100 based on the determined information. When the vehicle 100 is in the autonomous driving mode, it can be set to operate without human interaction.
The vehicle 100 may include various subsystems, such as a travel system 102, a sensor system 104, a control system 106, one or more peripheral devices 108, as well as a power supply 110, a computer system 112, and a user interface 116. Optionally, the vehicle 100 may include more or fewer subsystems, and each subsystem may include multiple elements. In addition, the subsystems and elements of the vehicle 100 may be interconnected by wire or wirelessly.
The travel system 102 may include components that provide powered motion for the vehicle 100. In one embodiment, the travel system 102 may include an engine 118, an energy source 119, a transmission 120, and wheels/tires 121. The engine 118 may be an internal combustion engine, an electric motor, an air-compression engine, or a combination of engine types, such as a hybrid of a gasoline engine and an electric motor, or a hybrid of an internal combustion engine and an air-compression engine. The engine 118 converts the energy source 119 into mechanical energy.
Examples of the energy source 119 include gasoline, diesel, other petroleum-based fuels, propane, other compressed-gas-based fuels, ethanol, solar panels, batteries, and other sources of electric power. The energy source 119 may also provide energy for other systems of the vehicle 100.
The transmission 120 may transmit mechanical power from the engine 118 to the wheels 121. The transmission 120 may include a gearbox, a differential, and a drive shaft. In one embodiment, the transmission 120 may also include other devices, such as a clutch. The drive shaft may include one or more axles that can be coupled to one or more wheels 121.
The sensor system 104 may include several sensors that sense information about the environment around the vehicle 100. For example, the sensor system 104 may include a positioning system 122 (which may be a GPS system, a BeiDou system, or another positioning system), an inertial measurement unit (IMU) 124, radar 126, a laser rangefinder 128, and a camera 130. The sensor system 104 may also include sensors that monitor the internal systems of the vehicle 100 (e.g., an in-vehicle air quality monitor, a fuel gauge, an oil temperature gauge, etc.). Sensor data from one or more of these sensors can be used to detect objects and their corresponding characteristics (position, shape, direction, speed, etc.). Such detection and recognition are key functions for the safe operation of the autonomous vehicle 100.
The positioning system 122 can be used to estimate the geographic position of the vehicle 100. The IMU 124 senses position and orientation changes of the vehicle 100 based on inertial acceleration. In one embodiment, the IMU 124 may be a combination of an accelerometer and a gyroscope.
The radar 126 may use radio signals to sense objects in the surrounding environment of the vehicle 100. In some embodiments, besides sensing objects, the radar 126 may also be used to sense the speed and/or heading of objects.
The laser rangefinder 128 may use laser light to sense objects in the environment in which the vehicle 100 is located. In some embodiments, the laser rangefinder 128 may include one or more laser sources, a laser scanner, one or more detectors, and other system components.
The camera 130 may be used to capture multiple images of the surrounding environment of the vehicle 100. The camera 130 may be a still camera or a video camera.
The control system 106 controls the operation of the vehicle 100 and its components. The control system 106 may include various elements, including a steering system 132, a throttle 134, a braking unit 136, a sensor fusion algorithm 138, a computer vision system 140, a route control system 142, and an obstacle avoidance system 144.
The steering system 132 is operable to adjust the heading of the vehicle 100; for example, in one embodiment it may be a steering-wheel system.
The throttle 134 is used to control the operating speed of the engine 118 and thus the speed of the vehicle 100.
The braking unit 136 is used to control the vehicle 100 to decelerate. The braking unit 136 may use friction to slow the wheels 121. In other embodiments, the braking unit 136 may convert the kinetic energy of the wheels 121 into electric current. The braking unit 136 may also take other forms to slow the rotation of the wheels 121 and thereby control the speed of the vehicle 100.
The computer vision system 140 is operable to process and analyze images captured by the camera 130 in order to identify objects and/or features in the surrounding environment of the vehicle 100. The objects and/or features may include traffic signals, road boundaries, and obstacles. The computer vision system 140 may use object recognition algorithms, structure from motion (SFM) algorithms, video tracking, and other computer vision techniques. In some embodiments, the computer vision system 140 may be used to map the environment, track objects, estimate the speed of objects, and so on.
The route control system 142 is used to determine the driving route of the vehicle 100. In some embodiments, the route control system 142 may combine data from the sensors 138, the global positioning system (GPS) 122, and one or more predetermined maps to determine a driving route for the vehicle 100.
The obstacle avoidance system 144 is used to identify, evaluate, and avoid or otherwise negotiate potential obstacles in the environment of the vehicle 100.
Of course, in one example, the control system 106 may additionally or alternatively include components other than those shown and described, or some of the components shown above may be removed.
The vehicle 100 interacts with external sensors, other vehicles, other computer systems, or users through the peripheral devices 108. The peripheral devices 108 may include a wireless communication system 146, an on-board computer 148, a microphone 150, and/or a speaker 152.
In some embodiments, the peripheral devices 108 provide a means for a user of the vehicle 100 to interact with the user interface 116. For example, the on-board computer 148 may provide information to a user of the vehicle 100, and the user interface 116 may also operate the on-board computer 148 to receive user input; the on-board computer 148 can be operated through a touchscreen. In other cases, the peripheral devices 108 may provide a means for the vehicle 100 to communicate with other devices located in the vehicle. For example, the microphone 150 may receive audio (e.g., voice commands or other audio input) from a user of the vehicle 100; similarly, the speaker 152 may output audio to the user of the vehicle 100.
The wireless communication system 146 may communicate wirelessly with one or more devices directly or via a communication network. For example, the wireless communication system 146 may use 3G cellular communication such as code division multiple access (CDMA), EVD0, or global system for mobile communications (GSM)/general packet radio service (GPRS), 4G cellular communication such as LTE, or 5G cellular communication. The wireless communication system 146 may communicate with a wireless local area network (WLAN) using wireless fidelity (WiFi). In some embodiments, the wireless communication system 146 may communicate directly with devices using an infrared link, Bluetooth, or ZigBee, or use other wireless protocols such as various vehicle communication systems; for example, the wireless communication system 146 may include one or more dedicated short range communications (DSRC) devices, which may include public and/or private data communication between vehicles and/or roadside stations.
The power supply 110 may provide power to various components of the vehicle 100. In one embodiment, the power supply 110 may be a rechargeable lithium-ion or lead-acid battery; one or more battery packs of such batteries may be configured as a power supply to provide power to the components of the vehicle 100. In some embodiments, the power supply 110 and the energy source 119 may be implemented together, as in some all-electric vehicles.
Some or all of the functions of the vehicle 100 are controlled by the computer system 112. The computer system 112 may include at least one processor 113 that executes instructions 115 stored in a non-transitory computer-readable medium such as the data storage device 114. The computer system 112 may also be multiple computing devices that control individual components or subsystems of the vehicle 100 in a distributed manner.
The processor 113 may be any conventional processor, such as a commercially available central processing unit (CPU). Alternatively, the processor may be a dedicated device such as an application-specific integrated circuit (ASIC) or another hardware-based processor. Although FIG. 2 functionally illustrates the processor, the memory, and the other elements of the computer system 112 in the same block, those of ordinary skill in the art will understand that the processor, computer, or memory may in fact comprise multiple processors, computers, or memories that may or may not be stored within the same physical housing. For example, the memory may be a hard drive or another storage medium located in a housing different from that of the computer. Thus, references to a processor or computer are understood to include references to a collection of processors, computers, or memories that may or may not operate in parallel. Rather than using a single processor to perform the steps described here, some components, such as the steering component and the deceleration component, may each have their own processor that performs only computations related to that component's specific function.
In the various aspects described here, the processor may be located remotely from the vehicle and communicate with it wirelessly. In other aspects, some of the processes described here are executed on a processor arranged within the vehicle while others are executed by a remote processor, including taking the steps necessary to perform a single maneuver.
In some embodiments, the data storage device 114 may contain instructions 115 (e.g., program logic) executable by the processor 113 to perform various functions of the vehicle 100, including those described above. The data storage device 114 may also contain additional instructions, including instructions to send data to, receive data from, interact with, and/or control one or more of the propulsion system 102, the sensor system 104, the control system 106, and the peripheral devices 108.
In addition to the instructions 115, the data storage device 114 may also store data, such as road maps, route information, the position, direction, and speed of the vehicle, other such vehicle data, and other information. Such information may be used by the vehicle 100 and the computer system 112 while the vehicle 100 operates in autonomous, semi-autonomous, and/or manual modes.
The user interface 116 is used to provide information to or receive information from a user of the vehicle 100. Optionally, the user interface 116 may include one or more input/output devices within the set of peripheral devices 108, such as the wireless communication system 146, the on-board computer 148, the microphone 150, and the speaker 152.
The computer system 112 may control the functions of the vehicle 100 based on input received from various subsystems (e.g., the travel system 102, the sensor system 104, and the control system 106) and from the user interface 116. For example, the computer system 112 may use input from the control system 106 to control the steering unit 132 to avoid obstacles detected by the sensor system 104 and the obstacle avoidance system 144. In some embodiments, the computer system 112 is operable to provide control over many aspects of the vehicle 100 and its subsystems.
Optionally, one or more of these components may be installed separately from or associated with the vehicle 100. For example, the data storage device 114 may exist partially or completely separate from the vehicle 100. The above components may be communicatively coupled together in a wired and/or wireless manner.
Optionally, the above components are only an example; in practical applications, components in the above modules may be added or removed according to actual needs, and FIG. 2 should not be construed as limiting the embodiments of this application.
Typically, a vehicle's driving video is obtained from a driving recorder in the vehicle; video files shot by the recorder can be stored locally in the vehicle or uploaded to cloud space, and can be used to record important driving events, for example accident playback, event review, or scenery along the road. However, the driving recorder is focused on recording the host vehicle's own driving and yields in-vehicle video for personal use scenarios.
Even though current road equipment, vehicles, terminals, and the like may capture videos containing a target vehicle, the usual splicing method for multiple videos from multiple sources is direct splicing in time order, which demands high time accuracy; otherwise, after repeated splicing the video timeline becomes increasingly disordered, and for a fast-moving car the deviation caused by inaccurate time is greatly magnified.
On this basis, in the video splicing method provided by the embodiments of this application, when multiple video clips are spliced, after a rough time-based ordering, high-precision splicing can be performed on the boundary videos of the clips according to the position information of the target vehicle, improving the accuracy of video splicing and yielding a better splicing effect.
In one possible understanding, the method of the embodiments of this application can, based on videos shot by the devices around the target vehicle and by the target vehicle itself, provide the target vehicle with a "driving recorder"-like review video. Moreover, because the video spliced for the target vehicle can come from multiple shooting devices, compared with a video from a real driving recorder, the spliced video obtained here has broader, multi-source viewing angles; it can provide the target vehicle's user with a more vivid review of the drive, and is also applicable to scenarios such as viewing road conditions, monitoring, tracking, and security.
The vehicle involved in the embodiments of this application may be a car, truck, motorcycle, bus, boat, airplane, helicopter, lawn mower, recreational vehicle, amusement park vehicle, construction equipment, tram, golf cart, train, trolley, etc.; the embodiments of this application impose no particular limitation.
The target vehicle described in the embodiments of this application may be a specific vehicle, a class of vehicles (e.g., model-related classes such as buses, cars, or transport vehicles, or color-related classes such as black, white, or red vehicles), or multiple classes of vehicles. For example, if the target vehicle is buses, a spliced video containing all buses can be generated. The target vehicle is not specifically limited in the embodiments of this application.
The source video described in the embodiments of this application may be video shot by shooting devices (e.g., the vehicle itself, vehicles around it, road equipment, or other devices that film the vehicle). A source video may include video frames, the time information of shooting the frames, the positioning position information of the vehicle in the frames, the identification of the vehicle, and so on.
Exemplarily, the time information of shooting a video frame may be a low-precision time marked by the device that shoots the frame, or a time determined according to the network time protocol (NTP); this is not specifically limited in the embodiments of this application. In a possible implementation, the time information may be set in the video frame in the form of a timestamp.
Exemplarily, the positioning position information of the vehicle in a video frame may be global positioning system (GPS) information of the vehicle, etc.; this is not specifically limited in the embodiments of this application. For ease of description, subsequent embodiments use GPS as an example of the positioning position information; the example does not limit the embodiments of this application.
In a possible implementation, the positioning position information of the target vehicle is position information whose confirmation is assisted by the driving speed of the target vehicle and the steering-wheel angle of the target vehicle.
In the embodiments of this application, when the target vehicle is turning or its speed is low, the driving speed and the steering-wheel angle of the target vehicle can be obtained from devices such as a gyroscope and the steering wheel, which helps determine more accurate position information for the target vehicle.
In a possible implementation, the positioning position information of the target vehicle is position information obtained by performing deviation compensation based on the shooting angle of view of the video frame.
In the embodiments of this application, the shooting angles of different cameras may cause deviations in the positioning position information, and deviation compensation based on the shooting angle of view yields more accurate position information. For example, for a fixed camera, the GPS range corresponding to its viewing-angle range can be obtained through existing video preprocessing; when processing a video, this GPS range is used for matching, and measurement starts from the moment the vehicle enters the viewing angle, so that fairly specific coordinates can be calculated from the vehicle's speed and direction. For a mobile camera, the camera itself has GPS information, and the position information can be calculated from the GPS distance difference between the two vehicles.
Exemplarily, the identification of a vehicle may include one or more of the following: the vehicle's license plate, color, color bar, owner information, shape, or other information that can identify the vehicle; this is not specifically limited in the embodiments of this application.
The stage of obtaining the source video in the embodiments of this application can be understood as a data preparation stage or a video data upload stage.
Exemplarily, as shown in FIG. 3, when uploading source video, vehicle A itself, other vehicles around it, and so on can upload a device identification (ID) and two groups of data: one group containing "video and time" and one group containing "GPS and time"; the video and the GPS data can be associated through time. The embodiments of this application do not require high time precision; for example, it is sufficient that the time measurement deviation between different devices is within about 5-10 s, which lowers the performance requirements on the devices themselves and makes it possible to obtain rich source video.
In a possible implementation, the two groups of data in FIG. 3 may also be uploaded simultaneously as one group, for example a group containing "device identification, video, GPS, and time"; the embodiments of this application do not specifically limit the upload method of the source video.
The video synthesis request described in the embodiments of this application may also be called a user request, a request message, request information, etc., and is used to request synthesis of the videos related to the target vehicle. Exemplarily, the video synthesis request includes one or more of the identification of the target vehicle, trajectory information, or time information.
In the embodiments of this application, the identification of the target vehicle contained in the video synthesis request indicates that a video containing that identification is to be synthesized. The trajectory information contained in the request indicates that the video of the target vehicle along the trajectory corresponding to the trajectory information is to be synthesized. The time information in the request indicates that the video of the target vehicle within that time is to be synthesized.
In a possible implementation, the video synthesis request may contain only one of the trajectory information and the time information; this is not specifically limited in the embodiments of this application.
In a possible implementation, upon receiving a video synthesis request, the device performing video synthesis may also perform permission verification on the request, for example verify whether the request was sent by the owner of the target vehicle or by a legal, compliant user. If the request passes verification, the subsequent video synthesis steps can be performed; if it does not pass verification, the subsequent video synthesis steps can be refused. This is not specifically limited in the embodiments of this application.
In a possible implementation, before the device performing video synthesis synthesizes the video, the source video may also be processed to narrow down the video range.
Exemplarily, as shown in FIG. 4, if the video synthesis request requests a synthesized video of vehicle A, the trajectory of vehicle A within a certain period (e.g., time 1, GPS1; time 2, GPS2, etc. in FIG. 4) can be obtained based on vehicle A's identification information (such as its license plate number). Based on this trajectory, a time offset Δt (time difference) is added to vehicle A's time t and a position offset Δp (position difference) is added to vehicle A's GPS information p, and the videos recorded by other vehicles or devices within t ± Δt and within p ± Δp are computed as the videos for subsequent splicing.
In a possible implementation, to improve precision, ±Δt and ±Δp can be related to the vehicle's speed. For example, both are inversely proportional to speed, e.g., Δt = k1/v and Δp = k2/v, where the coefficients k1 and k2 can each be set according to the actual application scenario and are not specifically limited in the embodiments of this application.
The video clips described in the embodiments of this application may also be called valid clips, etc. The video clips can be obtained by processing the source video (or the screened source video described above); for example, the source video can be divided into slices, the slices filtered, and the clips containing or related to the target vehicle extracted, yielding multiple video clips.
Exemplarily, as shown in FIG. 5, the target vehicle can be identified slice by slice in the source video, and a video covering a total of x seconds before and after the time 51 at which the target vehicle is identified can be extracted as one video clip, where x is a positive number whose value can be set according to the actual application scenario; this is not specifically limited. In a possible implementation, the durations of the multiple clips may also differ; this is not specifically limited in the embodiments of this application.
In a possible implementation, the source video may contain videos of the target vehicle shot by different devices in the same time period, so the extracted clips may contain duplicates. For example, if the shooting-time overlap of multiple clips is greater than a third threshold (a value that can be set according to the actual application scenario and is not specifically limited here), the clips can be considered duplicates, and the clip with better video quality can be selected from them as the clip used for subsequent splicing.
Exemplarily, the clips whose shooting-time overlap is greater than the third threshold (settable according to the actual application scenario) can be quality-scored; according to the quality scores, a clip with a higher or the highest score can be selected from them.
In a possible implementation, quality-scoring the clips whose shooting-time overlap is greater than the third threshold includes: scoring them according to the proportion and centering degree of the target vehicle in those clips, where the proportion of the target vehicle is the ratio of the size of the target vehicle to the size of the video frame. For example, the larger the proportion of the target vehicle in a video frame, or the more centered it is, the higher the quality score of that frame can be considered.
In a possible implementation, the clips may also be scored according to video-frame quality (such as sharpness) or the time and angle at which the vehicle appears in the frame; this is not specifically limited in the embodiments of this application.
Exemplarily, FIG. 6 shows a schematic diagram of scoring a video clip. As shown in FIG. 6, after the video frames in a clip are extracted, vehicle-related features (such as license-plate features, vehicle features, or other features) can be extracted from each frame and combined with an evaluation algorithm (such as BLINDS, BIQI, or another algorithm) to obtain a quality score for each single frame; the scores of all frames in the clip are then comprehensively weighted to obtain the quality score of the clip.
When splicing multiple video clips, the embodiments of this application can combine low-precision splicing with high-precision boundary splicing to achieve a better splicing effect.
In a possible implementation, low-precision splicing is implemented by sorting the multiple clips by shooting time; for example, the clips can be arranged in a queue in order of their time information from earliest to latest. In a possible implementation, if overlap exists among the clips (as above, when judging whether two clips overlap, what is checked is whether the overlap ratio reaches the third threshold, which can be set according to the actual situation, for example to any value between 0.5 and 1; if the ratio does not reach the third threshold the clips are not regarded as overlapping, though in practice they may still partially overlap), the overlapping part can be truncated, and the overlapping boundary video can be reserved during truncation to facilitate the subsequent high-precision boundary splicing.
Exemplarily, FIG. 7 shows a schematic diagram of low-precision splicing. Clips can be spliced in time order; for overlapping clips, the clip with the higher video score can be retained, and the clip with the lower score can be truncated at the splicing boundary, reserving the repeated boundary video during truncation to facilitate the next step of high-precision boundary splicing.
In a possible implementation, high-precision boundary splicing is implemented by achieving high-precision splicing at the clip boundaries based on the positioning position information of the target vehicle in the video frames.
Exemplarily, taking multiple clips sorted in time order from earliest to latest: for a first video clip and a second video clip adjacent in time among the multiple clips, the multi-frame video at the end position of the first clip can be selected as the first boundary video, and the multi-frame video at the starting position of the second clip as the second boundary video; then, according to the position information of the target vehicle in the first boundary video, the position of the target vehicle in the second boundary video is predicted, the video frame corresponding to that position is used as the spliced video frame, and the first and second clips are spliced using the spliced video frame.
In a possible implementation, when predicting the position of the target vehicle in the second boundary video from its position information in the first boundary video, first position information can be predicted from the positioning position information of the target vehicle in multiple frames of the first boundary video, and the frame with the smallest distance from the first position information can be selected from the second boundary video.
Exemplarily, FIG. 8 shows a schematic diagram of predicting the first position information according to an embodiment of this application. As shown in FIG. 8, at least three video frames can be determined in the first boundary video (each circle in FIG. 8 can represent one frame), including the last frame of the first boundary video (the black-filled circle in video A in FIG. 8). According to the shooting times of the at least three frames and the positioning position information of the target vehicle in them, the speed information and direction information of the target vehicle are calculated (for example, the speed can be computed from distance and time, and the direction obtained from the change of the positioning position); the speed and direction information can be used to predict the first position information. For example, the first position information is obtained by adding a first value, in the direction indicated by the direction information, to the positioning position information of the target vehicle in the last frame of the first boundary video; the first value is negatively correlated with the speed information (determination of the first value can also be assisted by a high-precision map, for example according to conditions such as the roads in the map). Then, the frame with the smallest distance from the first position information (the black-filled circle in video B in FIG. 8) can be selected from the second boundary video as the spliced video frame.
In a possible implementation, after the spliced video frame is obtained, the second boundary video can be joined to the first boundary video at the spliced video frame (or possibly at a few frames adjacent to it). Because the position of a vehicle usually does not change abruptly, predicting the spliced video frame based on the positioning position information yields a more accurate splicing position, and thus a better splicing effect.
In the embodiments of this application, the at least three video frames may be several consecutive frames or several non-consecutive frames extracted from the clip. Exemplarily, the frames can be chosen so that the distance between the vehicle positions corresponding to two frames is greater than 30 cm, for example kept at 50 cm to 5 m; this distance can be dynamically adjusted according to the vehicle's speed, so that accurate direction and position information can be obtained from the three frames.
In a possible implementation, when the shooting-time interval between adjacent clips among the multiple clips is greater than a fourth threshold (settable according to the actual situation), a preset video can be inserted between the adjacent clips. The preset video may be, for example, a video recorded by the host vehicle's driving recorder, a scenery video, a map video, a road video, or any other possible video, so that a relatively smooth spliced video is obtained.
In the embodiments of this application, after the video splicing is completed, the trajectory video of the target vehicle can be obtained. Exemplarily, as shown in FIG. 9, based on the GPS information of the clips, the videos can be laid out on a map to form the video track of the target vehicle.
The technical solution of this application and how it solves the above technical problems are described in detail below with specific embodiments. The following specific embodiments may be implemented independently or combined with one another, and identical or similar concepts or processes may not be repeated in some embodiments.
FIG. 10 is a schematic flowchart of a video splicing method provided by an embodiment of this application. As shown in FIG. 10, the method includes:
S101: Acquire multiple video clips related to a target vehicle.
S102: Splice a first video clip and a second video clip according to a spliced video frame, where the first video clip and the second video clip are two temporally adjacent clips among the multiple clips, the spliced video frame is a video frame for video splicing determined in a second boundary video according to the positioning position information of the target vehicle in a first boundary video, the first boundary video is the multi-frame video at the end position of the first clip, and the second boundary video is the multi-frame video at the starting position of the second clip.
In the embodiments of this application, for the specific implementation of S101 and S102, refer to the descriptions of the above embodiments, which are not repeated here. When splicing multiple video clips, the embodiments of this application can perform high-precision splicing of the boundary videos of the clips according to the position information of the target vehicle, improving the accuracy of video splicing and yielding a better splicing effect.
From the above description of the solution of this application, it can be understood that, to implement the above functions, each of the above devices includes a corresponding hardware structure and/or software unit for executing each function. Those skilled in the art will readily appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed here can be implemented by hardware or by a combination of hardware and computer software. Whether a function is executed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
As shown in FIG. 11, an embodiment of this application provides a video splicing apparatus, which includes a processor 1100, a memory 1101, and a transceiver 1102.
The processor 1100 is responsible for managing the bus architecture and general processing, and the memory 1101 may store data used by the processor 1100 when performing operations. The transceiver 1102 is configured to receive and send data under the control of the processor 1100 and to perform data communication with the memory 1101.
The bus architecture may include any number of interconnected buses and bridges, specifically linking together various circuits of one or more processors represented by the processor 1100 and of memory represented by the memory 1101. The bus architecture may also link together various other circuits such as peripheral devices, voltage regulators, and power management circuits, all of which are well known in the art and therefore not further described here. The bus interface provides the interface.
The procedures disclosed in the embodiments of this application may be applied to, or implemented by, the processor 1100. During implementation, each step of the video splicing procedure may be completed by an integrated logic circuit of hardware in the processor 1100 or by instructions in the form of software. The processor 1100 may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this application. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in connection with the embodiments of this application may be directly embodied as being executed by a hardware processor, or executed by a combination of hardware and software modules in the processor. The software modules may be located in a storage medium mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory 1101; the processor 1100 reads the information in the memory 1101 and completes the steps of the signal processing procedure in combination with its hardware.
In an optional manner of this application, the processor 1100 is configured to read the program in the memory 1101 and execute the method flow of S101-S102 shown in FIG. 10.
As shown in FIG. 12, this application provides a video splicing apparatus, which includes a transceiver module 1200 and a processing module 1201.
The transceiver module 1200 is configured to acquire multiple video clips related to a target vehicle.
The processing module 1201 is configured to splice a first video clip and a second video clip according to a spliced video frame, where the first video clip and the second video clip are two temporally adjacent clips among the multiple video clips, the spliced video frame is a video frame for video splicing determined in a second boundary video according to the positioning position information of the target vehicle in a first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the starting position of the second video clip. In the embodiments of this application, when multiple video clips are spliced, high-precision splicing can be performed on the boundary videos of the clips according to the position information of the target vehicle, improving the accuracy of video splicing and yielding a better splicing effect.
In a possible implementation, a video clip includes the shooting times of the video frames in the clip and the positioning position information of the target vehicle in those frames; the spliced video frame is specifically obtained by predicting first position information from the positioning position information of the target vehicle in multiple video frames of the first boundary video, and selecting from the second boundary video the video frame with the smallest distance from the first position information. In this way, the video frame closest to the predicted position of the target vehicle can be selected as the spliced video frame; because the position of the target vehicle usually does not change abruptly, this yields a fairly accurate spliced video frame and thus a good splicing effect.
In a possible implementation, the processing module is specifically configured to determine at least three video frames in the first boundary video, including the last video frame of the first boundary video, and to calculate the speed information and direction information of the target vehicle from the shooting times of the at least three frames and the positioning position information of the target vehicle in them; the speed and direction information are used to predict the first position information. In this way, relatively accurate first position information can be predicted based on the speed and direction of the target vehicle, so that an accurate spliced video frame can subsequently be selected from the second boundary video.
In a possible implementation, the first position information is obtained by adding a first value, in the direction indicated by the direction information, to the positioning position information of the target vehicle in the last video frame of the first boundary video; the first value is negatively correlated with the speed information. Because the higher the speed of the target vehicle, the more its position changes within a given time, a large first value easily makes the prediction inaccurate; setting the first value to be negatively correlated with the speed information therefore yields more accurate first position information, so that an accurate spliced video frame can subsequently be selected from the second boundary video.
In a possible implementation, the positioning position information of the target vehicle includes position information whose confirmation is assisted by the driving speed of the target vehicle and the steering-wheel angle of the target vehicle. In this way, the steering-wheel angle helps determine more accurate position information for the target vehicle.
In a possible implementation, the positioning position information of the target vehicle includes position information obtained by performing deviation compensation based on the shooting angle of view of the video frame. In this way, deviation compensation based on the shooting angle of view yields more accurate position information.
In a possible implementation, the transceiver module is specifically configured to receive a video synthesis request, the request including the identification of the target vehicle, trajectory information, and time information; to acquire a source video that is related to the trajectory information and the time information and contains the target vehicle; and to determine, in the source video, multiple video clips related to the target vehicle. In this way, the relevant video of the target vehicle can be synthesized based on the user's request, better matching the user's needs.
In a possible implementation, the transceiver module is specifically configured to acquire a source video whose position difference from the trajectory information is within a first threshold range, whose time difference from the time information is within a second threshold range, and which contains the target vehicle. In this way, relatively effective source video data for the target vehicle can be obtained, avoiding the waste of subsequent computing resources caused by useless video in the source data.
In a possible implementation, the first threshold range is negatively correlated with the speed of the target vehicle, and/or the second threshold range is negatively correlated with the speed of the target vehicle.
In a possible implementation, the source video is video shot by other vehicles and/or video shot by road equipment. In this way, multi-angle videos of the target vehicle can be obtained, providing richer splicing material.
In a possible implementation, the processing module is further configured to filter the source video to obtain multiple first video clips containing the target vehicle; when, among the multiple first video clips, there are clips whose shooting-time overlap ratio is greater than a third threshold, to quality-score those clips; and to select one clip from them according to the quality scores. In this way, a better-quality clip can be used as the splicing clip, yielding a better-quality spliced video.
In a possible implementation, the processing module is specifically configured to quality-score the clips whose shooting-time overlap is greater than the third threshold according to the proportion and centering degree of the target vehicle in those clips, where the proportion of the target vehicle is the ratio of the size of the target vehicle to the size of the video frame.
In a possible implementation, the multiple video clips are sorted according to their shooting times; when the shooting-time interval between adjacent clips is greater than a fourth threshold, a preset video is inserted between the adjacent clips. In this way, when two clips with a large time gap are spliced, a preset video can be inserted to avoid an excessive jump at the scene transition and increase the coherence of the spliced video.
In a possible implementation, the functions of the transceiver module 1200 and the processing module 1201 shown in FIG. 12 may be executed by the processor 1100 running a program in the memory 1101, or executed by the processor 1100 alone.
As shown in FIG. 13, this application provides a vehicle, which includes at least one camera 1301, at least one memory 1302, at least one transceiver 1303, and at least one processor 1304.
The camera 1301 is configured to acquire video clips.
The memory 1302 is configured to store one or more programs and data information, where the one or more programs include instructions.
The transceiver 1303 is configured for data transmission with communication devices in the vehicle and with the cloud, to acquire multiple video clips related to a target vehicle.
The processor 1304 is configured to splice a first video clip and a second video clip according to a spliced video frame, where the first video clip and the second video clip are two temporally adjacent clips among the multiple video clips, the spliced video frame is a video frame for video splicing determined in a second boundary video according to the positioning position information of the target vehicle in a first boundary video, the first boundary video is the multi-frame video at the end position of the first video clip, and the second boundary video is the multi-frame video at the starting position of the second video clip.
In an implementation, the vehicle further includes a display screen 1305 and a voice broadcast apparatus 1306.
The display screen 1305 is configured to display the spliced video.
The voice broadcast apparatus 1306 is configured to broadcast the audio of the spliced video.
In some possible implementations, the aspects of the video splicing method provided by the embodiments of this application may also be implemented in the form of a program product including program code; when the program code runs on a computer device, the program code causes the computer device to execute the steps of the video splicing methods according to the various exemplary embodiments of this application described in this specification.
The program product may employ any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A program product for video splicing according to an implementation of this application may employ a portable compact disc read-only memory (CD-ROM), include program code, and run on a server device. However, the program product of this application is not limited thereto; in this document, a readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with a communication transmission, apparatus, or device.
A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
The program code contained on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the above.
Program code for performing the operations of this application may be written in any combination of one or more programming languages, including object-oriented languages such as Java and C++ as well as conventional procedural languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device.
For the video splicing method, the embodiments of this application further provide a computing-device-readable storage medium, i.e., one whose content is not lost after power-off. The storage medium stores a software program including program code; when the program code runs on a computing device and the software program is read and executed by one or more processors, any of the video splicing solutions of the above embodiments of this application can be implemented.
An embodiment of this application further provides an electronic device. Where each functional module is divided corresponding to each function, the electronic device includes a processing module configured to support the video splicing apparatus in performing the steps of the above embodiments, for example performing the operations of S101 to S102, or other processes of the techniques described in the embodiments of this application.
All relevant content of the steps involved in the above method embodiments can be cited in the functional descriptions of the corresponding functional modules, and is not repeated here.
Of course, the video splicing apparatus includes but is not limited to the unit modules listed above. Moreover, the functions that the above functional units can specifically implement include but are not limited to the functions corresponding to the method steps described in the above examples; for detailed descriptions of other units of the electronic device, refer to the detailed descriptions of their corresponding method steps, which are not repeated here in the embodiments of this application.
Where integrated units are used, the electronic device involved in the above embodiments may include a processing module, a storage module, and a communication module. The storage module is configured to store the program code and data of the electronic device. The communication module is configured to support communication between the electronic device and other network entities, to implement functions of the electronic device such as calls, data interaction, and Internet access.
The processing module is configured to control and manage the actions of the electronic device. The processing module may be a processor or a controller; the communication module may be a transceiver, an RF circuit, a communication interface, or the like; the storage module may be a memory.
Further, the electronic device may also include an input module and a display module. The display module may be a screen or a display; the input module may be a touchscreen, a voice input apparatus, a fingerprint sensor, or the like.
This application has been described above with reference to block diagrams and/or flowcharts of methods, apparatuses (systems), and/or computer program products according to embodiments of this application. It should be understood that one block of the block diagrams and/or flowcharts, and combinations of blocks, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, and/or other programmable data processing apparatus to produce a machine, such that the instructions executed via the computer processor and/or other programmable data processing apparatus create a method for implementing the functions/acts specified in the block diagram and/or flowchart blocks.
Accordingly, this application may also be implemented in hardware and/or software (including firmware, resident software, microcode, etc.). Furthermore, this application may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium, for use by or in connection with an instruction execution system. In the context of this application, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
This application describes multiple embodiments in conjunction with multiple flowcharts, but it should be understood that these flowcharts and the related descriptions of their corresponding embodiments are merely examples for ease of understanding and should not constitute any limitation on this application. Each step in each flowchart is not necessarily required to be performed; for example, some steps may be skipped. Moreover, the execution order of the steps is not fixed and is not limited to that shown in the figures; the execution order of each step should be determined by its function and internal logic.
The multiple embodiments described in this application may be combined arbitrarily, and steps may be executed interleaved across them; the execution order of the embodiments and of the steps within them is not fixed and is not limited to that shown in the figures, and should be determined by their functions and internal logic.
Although this application has been described in connection with specific features and embodiments thereof, it is evident that various modifications and combinations may be made without departing from the spirit and scope of this application. Accordingly, the specification and drawings are merely exemplary descriptions of this application as defined by the appended claims, and are deemed to cover any and all modifications, variations, combinations, or equivalents within the scope of this application. Obviously, those skilled in the art can make various changes and variations to this application without departing from its scope. Thus, if these modifications and variations of this application fall within the scope of the claims of this application and their equivalent technologies, this application is also intended to include them.

Claims (19)

  1. 一种视频拼接的方法,其特征在于,包括:
    获取与目标车辆相关的多个视频片段;
    根据拼接视频帧拼接第一视频片段和第二视频片段;
    其中,所述第一视频片段和所述第二视频片段为所述多个视频片段中时间相邻的两个视频片段,所述拼接视频帧为根据所述目标车辆在第一边界视频中的定位位置信息,在第二边界视频中确定的用于视频拼接的视频帧,所述第一边界视频为所述第一视频片段的结束位置处的多帧视频,所述第二边界视频为所述第二视频片段的起始位置处的多帧视频。
  2. 根据权利要求1所述的方法,其特征在于,所述视频片段中包括所述视频片段中视频帧的拍摄时间,以及所述目标车辆在所述视频帧中的定位位置信息;
    所述拼接视频帧具体为根据所述第一边界视频中所述目标车辆在多帧视频帧中的定位位置信息预测第一位置信息,从所述第二边界视频中选择的与所述第一位置信息距离最小的视频帧。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述第一边界视频中所述目标车辆在多帧视频帧中的定位位置信息预测第一位置信息,包括:
    在所述第一边界视频中确定至少三个视频帧,所述至少三个视频帧中包括所述第一边界视频的最后一个视频帧;
    根据所述至少三个视频帧的拍摄时间和所述至少三个视频帧中所述目标车辆的定位位置信息,计算所述目标车辆的速度信息和方向信息;其中,所述速度信息和所述方向信息用于预测所述第一位置信息。
  4. 根据权利要求3所述的方法,其特征在于,所述第一位置信息为所述第一边界视频的最后一个视频帧中所述目标车辆的定位位置信息在所述方向信息指示的方向增加第一值得到的,所述第一值与所述速度信息负相关。
  5. 根据权利要求1-4任一项所述的方法,其特征在于,所述目标车辆的定位位置信息包括:基于所述目标车辆的行驶速度与所述目标车辆的方向盘夹角辅助确认的位置信息。
  6. 根据权利要求1-4任一项所述的方法,其特征在于,所述目标车辆的定位位置信息包括:基于所述视频帧的拍摄视角进行偏差补偿得到的位置信息。
  7. The method according to claim 1, characterized in that obtaining the plurality of video clips related to the target vehicle comprises:
    receiving a video composition request, the video composition request including an identifier of the target vehicle, trajectory information, and time information;
    obtaining source videos related to the trajectory information and the time information and containing the target vehicle;
    determining, in the source videos, a plurality of video clips related to the target vehicle.
  8. The method according to claim 7, characterized in that obtaining source videos related to the trajectory information and the time information and containing the target vehicle comprises:
    obtaining source videos whose position difference from the trajectory information is within a first threshold range, whose time difference from the time information is within a second threshold range, and which contain the target vehicle.
  9. The method according to claim 8, characterized in that the first threshold range is negatively correlated with the speed of the target vehicle, and/or the second threshold range is negatively correlated with the speed of the target vehicle.
  10. The method according to claim 8 or 9, characterized in that the source videos include videos captured by other vehicles and/or videos captured by road devices.
  11. The method according to claim 7, characterized in that determining, in the source videos, the plurality of video clips related to the target vehicle comprises:
    filtering the source videos to obtain a plurality of first video clips containing the target vehicle;
    when video clips whose capture-time overlap ratio is greater than a third threshold exist among the plurality of first video clips, performing quality scoring on the video clips whose capture-time overlap ratio is greater than the third threshold;
    selecting, according to the quality scores, one video clip from the video clips whose capture-time overlap ratio is greater than the third threshold.
  12. The method according to claim 11, characterized in that performing quality scoring on the video clips whose capture-time overlap ratio is greater than the third threshold comprises:
    performing quality scoring on those video clips according to the proportion and the degree of centering of the target vehicle in them, the proportion of the target vehicle being the ratio of the size of the target vehicle to the size of the video frame.
  13. The method according to any one of claims 1 to 12, characterized in that the plurality of video clips are video clips sorted according to their capture times;
    when adjacent video clips among the plurality of video clips have a capture-time interval greater than a fourth threshold, a preset video is inserted between the adjacent video clips.
  14. A video stitching apparatus, characterized by comprising a processor and an interface circuit, wherein the interface circuit is configured to receive code instructions and transmit them to the processor, and the processor is configured to run the code instructions to perform the method according to any one of claims 1 to 13.
  15. An electronic device, characterized by comprising: one or more processors, a transceiver, a memory, and an interface circuit, wherein the one or more processors, the transceiver, the memory, and the interface circuit communicate through one or more communication buses; the interface circuit is configured to communicate with other apparatuses; and one or more computer programs are stored in the memory and configured to be executed by the one or more processors or the transceiver, so that the electronic device performs the method according to any one of claims 1 to 13.
  16. A vehicle, characterized by comprising: at least one camera, at least one memory, at least one transceiver, and at least one processor;
    the camera is configured to acquire video clips;
    the memory is configured to store one or more programs and data information, wherein the one or more programs include instructions;
    the transceiver is configured to perform data transmission with communication devices in the vehicle and with a cloud, so as to obtain a plurality of video clips related to a target vehicle;
    the processor is configured to stitch a first video clip and a second video clip according to a splice video frame, wherein the first video clip and the second video clip are two temporally adjacent video clips among the plurality of video clips, the splice video frame is a video frame for video stitching determined in a second boundary video according to localization position information of the target vehicle in a first boundary video, the first boundary video is the multiple video frames at the end position of the first video clip, and the second boundary video is the multiple video frames at the start position of the second video clip.
  17. The vehicle according to claim 16, characterized in that the vehicle further comprises a display screen and a voice broadcast apparatus;
    the display screen is configured to display the stitched video;
    the voice broadcast apparatus is configured to broadcast the audio of the stitched video.
  18. A video stitching system, characterized in that the video stitching system comprises a video clip obtaining unit and a stitching unit;
    wherein the video clip obtaining unit is configured to obtain a plurality of video clips related to a target vehicle;
    the stitching unit is configured to stitch a first video clip and a second video clip according to a splice video frame, wherein the first video clip and the second video clip are two temporally adjacent video clips among the plurality of video clips, the splice video frame is a video frame for video stitching determined in a second boundary video according to localization position information of the target vehicle in a first boundary video, the first boundary video is the multiple video frames at the end position of the first video clip, and the second boundary video is the multiple video frames at the start position of the second video clip.
  19. A readable computer storage product, characterized in that the readable computer storage product is configured to store a computer program, and the computer program is used to implement the method according to any one of claims 1 to 13.
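
The following sketches relate several of the claimed steps to illustrative code. They are hedged examples under stated assumptions, not the reference implementation of the claims. First, claims 2 to 4 select the splice video frame by predicting the target vehicle's position from its localization in the last frames of the first boundary video and then taking the closest frame of the second boundary video. A minimal Python sketch, assuming each frame carries a capture timestamp and a 2-D localization; the FrameInfo structure, the endpoint-based motion estimate, and the inverse-speed step k / speed (one possible realization of the negative correlation stated in claim 4) are all assumptions:

    import math
    from dataclasses import dataclass
    from typing import Tuple

    @dataclass
    class FrameInfo:
        timestamp: float               # capture time in seconds (assumed field)
        position: Tuple[float, float]  # (x, y) localization of the target vehicle (assumed)

    def estimate_motion(frames):
        # Claim 3: use at least three frames of the first boundary video,
        # including its last frame; speed and heading are estimated here
        # from the two endpoint frames for brevity.
        if len(frames) < 3:
            raise ValueError("claim 3 requires at least three frames")
        x0, y0 = frames[0].position
        x1, y1 = frames[-1].position
        dt = frames[-1].timestamp - frames[0].timestamp
        speed = math.hypot(x1 - x0, y1 - y0) / dt
        heading = math.atan2(y1 - y0, x1 - x0)
        return speed, heading

    def predict_first_position(frames, k=1.0):
        # Claim 4: advance the last frame's position along the estimated
        # heading by a first value negatively correlated with speed.
        speed, heading = estimate_motion(frames)
        step = k / max(speed, 1e-6)
        x, y = frames[-1].position
        return x + step * math.cos(heading), y + step * math.sin(heading)

    def select_splice_frame(first_boundary, second_boundary):
        # Claim 2: pick the frame of the second boundary video whose
        # target-vehicle localization is closest to the predicted position.
        px, py = predict_first_position(first_boundary)
        return min(second_boundary,
                   key=lambda f: math.hypot(f.position[0] - px,
                                            f.position[1] - py))

Cutting at the frame closest to the predicted position keeps the target vehicle's apparent motion continuous across the splice, which is the purpose of the boundary-video construction.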
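Claim 5 allows the localization to be refined with the target vehicle's travel speed and steering-wheel angle. One common way to realize such assistance is dead reckoning with a kinematic bicycle model; the model itself and the wheelbase and steering-ratio values below are assumptions for illustration, not taken from the patent:

    import math

    def dead_reckon(x, y, heading, speed, steering_wheel_angle, dt,
                    wheelbase=2.7, steering_ratio=16.0):
        # Convert the steering-wheel angle (radians) to a road-wheel angle
        # via the steering ratio, then advance the pose over dt with a
        # kinematic bicycle model.
        wheel_angle = steering_wheel_angle / steering_ratio
        heading += speed / wheelbase * math.tan(wheel_angle) * dt
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        return x, y, heading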
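Claims 8 and 9 filter the source videos by how far their capture position and time deviate from the requested trajectory and time, with both threshold ranges shrinking as the target vehicle's speed grows. A sketch, assuming inverse proportionality as one possible realization of the stated negative correlation (the constants k_pos and k_time are illustrative):

    import math

    def matches_request(candidate_pos, candidate_time,
                        track_pos, track_time, vehicle_speed,
                        k_pos=50.0, k_time=10.0):
        # Claim 9: both thresholds are negatively correlated with the
        # target vehicle's speed; k / speed is one such choice.
        pos_threshold = k_pos / max(vehicle_speed, 1e-6)
        time_threshold = k_time / max(vehicle_speed, 1e-6)
        # Claim 8: keep a source video only if both the position difference
        # and the time difference fall within their threshold ranges.
        dx = candidate_pos[0] - track_pos[0]
        dy = candidate_pos[1] - track_pos[1]
        return (math.hypot(dx, dy) <= pos_threshold
                and abs(candidate_time - track_time) <= time_threshold)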
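For clips whose capture-time overlap ratio exceeds the third threshold, claims 11 and 12 keep the single clip scoring best on the target vehicle's size proportion and degree of centering. A sketch, assuming a per-clip bounding box of the target vehicle and the frame dimensions are available; the equal weighting of the two factors is an assumption:

    import math

    def quality_score(vehicle_box, frame_w, frame_h, w_size=0.5, w_center=0.5):
        # Claim 12: score by the target vehicle's proportion (vehicle size
        # over frame size) and its degree of centering.
        x, y, w, h = vehicle_box  # pixel box (x, y, width, height), assumed format
        proportion = (w * h) / (frame_w * frame_h)
        # Normalized offset of the box center from the frame center:
        # 0.0 at the exact center, clamped to 1.0 toward the corners.
        off_x = abs(x + w / 2.0 - frame_w / 2.0) / (frame_w / 2.0)
        off_y = abs(y + h / 2.0 - frame_h / 2.0) / (frame_h / 2.0)
        centering = 1.0 - min(1.0, math.hypot(off_x, off_y))
        return w_size * proportion + w_center * centering

    def pick_best_clip(overlapping_clips):
        # Claim 11: among the overlapping clips, keep the best-scoring one.
        return max(overlapping_clips,
                   key=lambda c: quality_score(c["vehicle_box"],
                                               c["width"], c["height"]))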
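Claim 13 orders the clips by capture time and bridges any gap larger than the fourth threshold with a preset video. A sketch, assuming each clip is a dict carrying 'start' and 'end' capture times and that PRESET_CLIP is a placeholder for whatever transition material an implementation chooses:

    PRESET_CLIP = {"source": "preset_transition"}  # placeholder preset video (assumption)

    def assemble_timeline(clips, gap_threshold):
        # Claim 13: sort the clips by capture time, then insert the preset
        # video between adjacent clips whose time interval exceeds the
        # fourth threshold.
        ordered = sorted(clips, key=lambda c: c["start"])
        timeline = []
        for clip in ordered:
            if timeline and clip["start"] - timeline[-1]["end"] > gap_threshold:
                timeline.append(dict(PRESET_CLIP,
                                     start=timeline[-1]["end"],
                                     end=clip["start"]))
            timeline.append(clip)
        return timeline
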
PCT/CN2020/104852 2020-07-27 2020-07-27 Method, apparatus, and system for video stitching WO2022020996A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202080004396.0A CN112544071B (zh) 2020-07-27 2020-07-27 Method, apparatus, and system for video stitching
EP20946683.8A EP4030751A4 (en) 2020-07-27 2020-07-27 METHOD, DEVICE AND SYSTEM FOR VIDEO ASSEMBLY
PCT/CN2020/104852 WO2022020996A1 (zh) 2020-07-27 2020-07-27 Method, apparatus, and system for video stitching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/104852 WO2022020996A1 (zh) 2020-07-27 2020-07-27 Method, apparatus, and system for video stitching

Publications (1)

Publication Number Publication Date
WO2022020996A1 (zh)

Family

ID=75017315

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/104852 WO2022020996A1 (zh) Method, apparatus, and system for video stitching

Country Status (3)

Country Link
EP (1) EP4030751A4 (zh)
CN (1) CN112544071B (zh)
WO (1) WO2022020996A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114401378A (zh) * 2022-03-25 2022-04-26 北京壹体科技有限公司 Automatic multi-segment video stitching method, system, and medium for track-and-field events

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4030751A4 (en) * 2020-07-27 2022-11-23 Huawei Technologies Co., Ltd. METHOD, DEVICE AND SYSTEM FOR VIDEO ASSEMBLY
WO2022204925A1 (zh) * 2021-03-30 2022-10-06 华为技术有限公司 Image acquisition method and related device
CN113099266B (zh) * 2021-04-02 2023-05-26 云从科技集团股份有限公司 Video fusion method, system, medium, and apparatus based on UAV POS data
CN113473224B (zh) * 2021-06-29 2023-05-23 北京达佳互联信息技术有限公司 Video processing method and apparatus, electronic device, and computer-readable storage medium
CN114245033A (zh) * 2021-11-03 2022-03-25 浙江大华技术股份有限公司 Video synthesis method and apparatus
CN115297342A (zh) * 2022-08-03 2022-11-04 广州文远知行科技有限公司 Multi-camera video processing method and apparatus, storage medium, and computer device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101304490A (zh) * 2008-06-20 2008-11-12 北京六维世纪网络技术有限公司 Method and apparatus for stitching videos
EP3314609A1 (en) * 2016-05-04 2018-05-02 Canon Europa N.V. Method and apparatus for generating a composite video stream from a plurality of video segments
CN109788212A (zh) * 2018-12-27 2019-05-21 北京奇艺世纪科技有限公司 Segmented video processing method, apparatus, terminal, and storage medium
CN111385641A (zh) * 2018-12-29 2020-07-07 深圳Tcl新技术有限公司 Video processing method, smart television, and storage medium
CN112544071A (zh) * 2020-07-27 2021-03-23 华为技术有限公司 Method, apparatus, and system for video stitching

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4321357B2 (ja) * 2004-05-27 2009-08-26 株式会社デンソー Parking assistance device
US7688229B2 (en) * 2007-04-30 2010-03-30 Navteq North America, Llc System and method for stitching of video for routes
US9224425B2 (en) * 2008-12-17 2015-12-29 Skyhawke Technologies, Llc Time stamped imagery assembly for course performance video replay
TWI405457B (zh) * 2008-12-18 2013-08-11 Ind Tech Res Inst Multi-target tracking system and method applying camera hand-off technology, and smart nodes thereof
CN103379307A (zh) * 2012-04-13 2013-10-30 何磊 Video trajectory tracking, monitoring, and retrieval playback system based on wireless positioning
JP5804007B2 (ja) * 2013-09-03 2015-11-04 カシオ計算機株式会社 Moving image generation system, moving image generation method, and program
US20150155009A1 (en) * 2013-12-03 2015-06-04 Nokia Corporation Method and apparatus for media capture device position estimate- assisted splicing of media
CN104200651B (zh) * 2014-09-10 2017-05-10 四川九洲电器集团有限责任公司 Vehicle-mounted communication device and method based on DSRC and BeiDou satellites
WO2016138121A1 (en) * 2015-02-24 2016-09-01 Plaay, Llc System and method for creating a sports video
WO2017049612A1 (en) * 2015-09-25 2017-03-30 Intel Corporation Smart tracking video recorder
US10592771B2 (en) * 2016-12-30 2020-03-17 Accenture Global Solutions Limited Multi-camera object tracking
RU2670429C1 (ru) * 2017-11-24 2018-10-23 ООО "Ай Ти Ви групп" Systems and methods for tracking moving objects in a video image
CN111405196B (zh) * 2019-12-31 2022-08-02 智慧互通科技股份有限公司 Vehicle management method and system based on video stitching

Also Published As

Publication number Publication date
CN112544071B (zh) 2021-09-14
EP4030751A1 (en) 2022-07-20
EP4030751A4 (en) 2022-11-23
CN112544071A (zh) 2021-03-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20946683

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020946683

Country of ref document: EP

Effective date: 20220414

NENP Non-entry into the national phase

Ref country code: DE