CN109246415A - Method and device for video processing - Google Patents

Method and device for video processing

Info

Publication number
CN109246415A
CN109246415A (application CN201710347063.8A)
Authority
CN
China
Prior art keywords
video
omnidirectional
omnidirectional video
pixel
image
Prior art date
Legal status
Granted
Application number
CN201710347063.8A
Other languages
Chinese (zh)
Other versions
CN109246415B (en)
Inventor
李炜明
张文波
王再冉
刘洋
汪昊
Current Assignee
Beijing Samsung Telecom R&D Center
Beijing Samsung Telecommunications Technology Research Co Ltd
Samsung Electronics Co Ltd
Original Assignee
Beijing Samsung Telecommunications Technology Research Co Ltd
Samsung Electronics Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Samsung Telecommunications Technology Research Co Ltd and Samsung Electronics Co Ltd
Priority to CN201710347063.8A
Publication of CN109246415A
Application granted
Publication of CN109246415B
Legal status: Active
Anticipated expiration

Abstract

An embodiment of the invention provides a method and device for video processing. The method comprises: obtaining a first omnidirectional video and a second omnidirectional video, the first omnidirectional video and the second omnidirectional video having a stereoscopic parallax in a first direction, the first direction being the column direction of the first omnidirectional video and the second omnidirectional video after they are unfolded according to longitude and latitude; and determining one or two third omnidirectional videos according to the first omnidirectional video and the second omnidirectional video, where if one third omnidirectional video is determined, the second omnidirectional video and the third omnidirectional video have a stereoscopic parallax in a second direction, and if two third omnidirectional videos are determined, the two third omnidirectional videos have a stereoscopic parallax in the second direction; the second direction is the row direction of the first omnidirectional video and the second omnidirectional video after they are unfolded according to longitude and latitude.

Description

Method and device for video processing
Technical field
The present invention relates to the technical field of video processing, and in particular to a method and device for video processing.
Background art
With the development of information technology, multimedia technology has developed alongside it, and three-dimensional (3D) omnidirectional capture technology has developed with it. 3D omnidirectional capture technology has broad application prospects; for example, it can be applied to fields such as virtual reality (VR) conferencing, VR live streaming, wearable devices, navigation systems, robots, and unmanned aerial vehicles.
3D omnidirectional capture technology is applied in 3D omnidirectional video acquisition devices. A current 3D omnidirectional video acquisition device mounts multiple video acquisition devices on a spherical or toroidal surface, as shown in Figs. 1a and 1b; each video acquisition device captures video in its own direction, and the videos collected in all directions are processed to obtain a 3D omnidirectional video. However, the multiple video acquisition devices (e.g., cameras) in an existing 3D omnidirectional video acquisition device are generally arranged in the same horizontal plane, i.e., they can only acquire video in the horizontal direction, so more video acquisition devices must be installed in all directions on the spherical body to capture video in every direction. Existing 3D omnidirectional acquisition equipment is therefore bulky, not portable, and costly, making it very difficult to apply in the daily life of individual users and difficult to use for application scenarios such as personal live streaming, daily-life recording, and motion photography. Its applicable scenarios are consequently narrow and the user experience is poor.
Summary of the invention
To overcome the above technical problems, or at least partially solve them, the following technical solutions are specially proposed:
According to one aspect, an embodiment of the present invention provides a method of video processing, comprising:
obtaining a first omnidirectional video and a second omnidirectional video, the first omnidirectional video and the second omnidirectional video having a stereoscopic parallax in a first direction, the first direction being the column direction of the first omnidirectional video and the second omnidirectional video after they are unfolded according to longitude and latitude;
determining one or two third omnidirectional videos according to the first omnidirectional video and the second omnidirectional video, wherein if one third omnidirectional video is determined, the second omnidirectional video and the third omnidirectional video have a stereoscopic parallax in a second direction, and if two third omnidirectional videos are determined, the two third omnidirectional videos have a stereoscopic parallax in the second direction; the second direction is the row direction of the first omnidirectional video and the second omnidirectional video after they are unfolded according to longitude and latitude.
According to another aspect, an embodiment of the present invention further provides a device for video processing, comprising:
an obtaining module, for obtaining a first omnidirectional video and a second omnidirectional video, the first omnidirectional video and the second omnidirectional video having a stereoscopic parallax in a first direction, the first direction being the column direction of the first omnidirectional video and the second omnidirectional video after they are unfolded according to longitude and latitude;
a determining module, for determining one or two third omnidirectional videos according to the first omnidirectional video and the second omnidirectional video, wherein if one third omnidirectional video is determined, the second omnidirectional video and the third omnidirectional video have a stereoscopic parallax in a second direction, and if two third omnidirectional videos are determined, the two third omnidirectional videos have a stereoscopic parallax in the second direction; the second direction is the row direction of the first omnidirectional video and the second omnidirectional video after they are unfolded according to longitude and latitude.
The present invention provides a method and device for video processing. Compared with existing approaches, the present invention obtains two omnidirectional videos with a stereoscopic parallax in a first direction, namely a first omnidirectional video and a second omnidirectional video, and then determines a third omnidirectional video according to the first omnidirectional video and the second omnidirectional video, where the second omnidirectional video and the third omnidirectional video have a stereoscopic parallax in a second direction. In other words, the present invention only needs to obtain two omnidirectional videos with a stereoscopic parallax in the first direction; by converting the stereoscopic parallax in the first direction into a stereoscopic parallax in the second direction, it obtains either a third omnidirectional video lying in the same row direction as the second omnidirectional video, or two third omnidirectional videos lying in the same row direction. This makes it possible to present a stereoscopic omnidirectional video effect to the user by combining the second omnidirectional video and the third omnidirectional video, or by combining the two third omnidirectional videos. At the same time, video acquisition can be completed with only two omnidirectional video acquisition devices; this structure greatly reduces the volume of the omnidirectional video acquisition equipment and reduces cost. Being portable, compact, and low-cost, it broadens the applicable scenarios of omnidirectional video acquisition equipment and thus improves the user experience.
Additional aspects and advantages of the present invention will be set forth in part in the description that follows; these will become obvious from the following description, or be learned through practice of the invention.
Detailed description of the invention
The above and/or additional aspects and advantages of the invention will become obvious and readily understood from the following description of embodiments with reference to the accompanying drawings, in which:
Fig. 1a is a schematic diagram of an existing 3D omnidirectional acquisition device;
Fig. 1b is a schematic diagram of another existing 3D omnidirectional acquisition device;
Fig. 2 is a flowchart of a method of video processing according to an embodiment of the present invention;
Fig. 3a is a schematic diagram of an omnidirectional video acquisition device in an embodiment of the present invention;
Fig. 3b is a schematic diagram of an omnidirectional video device composed of two video acquisition devices located in the same horizontal direction;
Fig. 3c is a schematic diagram of an omnidirectional video device composed of multiple video acquisition devices in the same vertical direction;
Fig. 3d is a schematic diagram of another omnidirectional video acquisition device in an embodiment of the present invention;
Fig. 4 is a schematic diagram of a method of timestamp synchronization in an embodiment of the present invention;
Fig. 5a is a schematic diagram of a method, in an embodiment of the present invention, of converting two omnidirectional videos located in the same vertical direction into two omnidirectional videos in the same horizontal direction;
Fig. 5b is a schematic diagram of another method, in an embodiment of the present invention, of converting two omnidirectional videos located in the same vertical direction into two omnidirectional videos in the same horizontal direction;
Fig. 6 is a schematic diagram of black-hole regions present in the generated upper virtual omnidirectional video in an embodiment of the present invention;
Fig. 7 is a schematic diagram of the virtual omnidirectional video after hole-filling processing in an embodiment of the present invention;
Fig. 8 is a schematic diagram of a method of training-sample generation in an embodiment of the present invention;
Fig. 9 is a schematic diagram of another method of training-sample generation in an embodiment of the present invention;
Fig. 10 is a schematic diagram of a device for video processing in an embodiment of the present invention.
Specific embodiment
Embodiments of the present invention are described in detail below, examples of which are shown in the accompanying drawings, where the same or similar reference labels throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the invention, and are not to be construed as limiting the claims.
Those skilled in the art will appreciate that, unless expressly stated otherwise, the singular forms "a", "an", "said", and "the" used herein may also include the plural forms. It should be further understood that the word "comprising" used in the specification of the present invention refers to the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intermediate elements may also be present. Furthermore, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The wording "and/or" used herein includes all or any unit and all combinations of one or more of the associated listed items.
Those skilled in the art will appreciate that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by those of ordinary skill in the field to which the present invention belongs. It should also be understood that terms such as those defined in general dictionaries should be understood to have a meaning consistent with their meaning in the context of the prior art, and will not be interpreted in an idealized or overly formal sense unless specifically so defined here.
Those skilled in the art will appreciate that "terminal" and "terminal device" as used herein include both devices with only a wireless signal receiver, which have no transmitting capability, and devices with receiving and transmitting hardware capable of two-way communication over a bidirectional communication link. Such a device may include: a cellular or other communication device, with or without a single-line or multi-line display; a PCS (Personal Communications Service) device, which may combine voice, data processing, fax, and/or data communication capability; a PDA (Personal Digital Assistant), which may include a radio-frequency receiver, pager, Internet/intranet access, web browser, notepad, calendar, and/or GPS (Global Positioning System) receiver; and a conventional laptop and/or palmtop computer or other device having and/or including a radio-frequency receiver. "Terminal" and "terminal device" as used herein may be portable, transportable, mounted in a vehicle (air, sea, and/or land), or suitable for and/or configured to operate locally and/or in distributed form at any location on the earth and/or in space. The "terminal" or "terminal device" used herein may also be a communication terminal, an Internet terminal, or a music/video playback terminal, for example a PDA, an MID (Mobile Internet Device), and/or a mobile phone with a music/video playback function, or a device such as a smart TV or a set-top box.
Embodiment one
An embodiment of the present invention provides a method of video processing, as shown in Fig. 2, comprising:
Step 201: obtain a first omnidirectional video and a second omnidirectional video.
The first omnidirectional video and the second omnidirectional video have a stereoscopic parallax in a first direction.
For the embodiment of the present invention, when the first omnidirectional video and the second omnidirectional video are each unfolded according to longitude and latitude, the selected polar (meridian) axis direction coincides with the direction of the line connecting the optical centers of the two omnidirectional videos (the first omnidirectional video and the second omnidirectional video), and the latitude origins selected for the two videos coincide. After unfolding, the row direction of the video corresponds to the weft (latitude) direction and the column direction corresponds to the warp (longitude) direction, and the first direction is the column direction of the first omnidirectional video and the second omnidirectional video after they are unfolded according to longitude and latitude.
The first omnidirectional video may be the upper-viewpoint omnidirectional video and the second omnidirectional video the lower-viewpoint omnidirectional video; or the first omnidirectional video may be the lower-viewpoint omnidirectional video and the second omnidirectional video the upper-viewpoint omnidirectional video. The embodiment of the present invention places no limitation on this.
In the embodiment of the present invention, the first omnidirectional video and the second omnidirectional video can be obtained by an omnidirectional video acquisition device as shown in Fig. 3a.
The omnidirectional video acquisition device shown in Fig. 3a may include two video acquisition devices located in the same vertical direction, where the two video acquisition devices located in the same vertical direction may be connected by a telescopic rod.
For the embodiment of the present invention, the omnidirectional video acquisition device may also be composed of two video acquisition devices in the same horizontal direction, where the two video acquisition devices located in the same horizontal direction may likewise be connected by a telescopic rod, as shown in Fig. 3b. In the embodiment of the present invention, the orientation of the two video acquisition devices in the horizontal direction can be converted so as to be applicable to the embodiment of the present invention.
For the embodiment of the present invention, the omnidirectional video acquisition device may include multiple video acquisition devices in the same vertical direction, where the video acquisition devices located in the same vertical direction may be connected by a telescopic rod, and any two of the video acquisition devices may be used, so as to be applicable to the embodiment of the present invention, as shown in Fig. 3c.
For the embodiment of the present invention, the omnidirectional video acquisition device may include two video acquisition devices located in the same vertical direction, where the two video acquisition devices located in the same vertical direction are embedded in a telescopic rod, as shown in Fig. 3d.
The above telescopic rod may also be a connecting rod of fixed length; or a set of connecting rods of different lengths, exchanged manually; or a single connecting rod whose length between the omnidirectional video acquisition devices can be adjusted by manual operation; or a single connecting rod whose length can be adjusted automatically by the omnidirectional video acquisition equipment.
For the embodiment of the present invention, the omnidirectional video acquisition devices shown in Figs. 3a, 3b, 3c, and 3d require only two video acquisition devices connected by a telescopic rod, which greatly reduces the volume of the omnidirectional video acquisition equipment and reduces cost. Being portable, compact, and low-cost, this broadens the applicable scenarios of omnidirectional video acquisition equipment and thus improves the user experience.
Optionally, after step 201, the method further includes: correcting the first omnidirectional video and the second omnidirectional video.
The step of correcting the first omnidirectional video and the second omnidirectional video may specifically include: determining, according to the first omnidirectional video and the second omnidirectional video, the position and attitude error parameters of the video acquisition devices corresponding to the first omnidirectional video and the second omnidirectional video; determining correction parameters according to the position and attitude error parameters; and correcting the first omnidirectional video and the second omnidirectional video according to the correction parameters.
Because errors arise during the assembly of the physical device, the two video acquisition devices located in the same vertical direction may unavoidably have errors in attitude and direction, so the calibration parameters corresponding to the video acquisition devices need to be adjusted in order to correct the collected first omnidirectional video and second omnidirectional video.
For the embodiment of the present invention, the first omnidirectional video and the second omnidirectional video are each unfolded into images; a pixel is extracted from the unfolded image of the first omnidirectional video, and the pixel corresponding to it is searched for in the unfolded image of the second omnidirectional video. It is then determined whether the two pixels lie in the same column direction; if they do not, the calibration parameters corresponding to the two video acquisition devices located in the same vertical direction are adjusted, so as to ensure that the two corresponding pixels lie in the same column direction.
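As a hedged illustration of the column-alignment check described above (not taken from the patent itself), the residual misalignment can be quantified from a set of pixel correspondences between the two unfolded images; all names and the list-based data layout are assumptions made for the sketch:

```python
def mean_column_misalignment(matches, width):
    """Estimate how far matched pixels deviate from the same-column ideal.

    matches: list of ((row1, col1), (row2, col2)) pixel correspondences between
    the unfolded first and second omnidirectional images. In a perfectly
    calibrated rig the column coordinates agree; the mean wrapped column
    difference (in pixels) measures the residual misalignment that the
    calibration parameters should remove.
    """
    diffs = []
    for (_, c1), (_, c2) in matches:
        # wrap the difference into [-width/2, width/2): columns are cyclic
        # because longitude wraps around at 360 degrees
        d = (c2 - c1 + width / 2) % width - width / 2
        diffs.append(d)
    return sum(diffs) / len(diffs)
```

A near-zero result indicates the two spherical coordinate frames are aligned; a consistent offset suggests a rotation of one device's frame about the baseline axis.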
The omnidirectional video shot by the omnidirectional video acquisition device can be converted from a 360-degree spherical image into a flat warp-and-weft (longitude/latitude) unfolded image by means of spherical warp-and-weft unfolding. Specifically, a three-dimensional coordinate system O-XYZ is defined at the sphere center, where O is the coordinate-system center and X, Y, Z are three mutually perpendicular directions. Ideally, the XY plane lies in the horizontal plane and Z points upward along the gravity direction. In the flat image obtained by the transformation, the row coordinate of the image corresponds to the angular range of -90 to 90 degrees in the vertical plane of the spherical coordinate system, and the column coordinate of the image corresponds to the angular range of 0 to 360 degrees in the horizontal plane of the spherical coordinate system.
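The coordinate mapping just described (rows spanning -90 to 90 degrees of latitude, columns spanning 0 to 360 degrees of longitude) can be sketched as below. The function names and the pixel-grid convention (row 0 at +90 degrees latitude) are assumptions for illustration, not details fixed by the patent:

```python
def sphere_to_pixel(lat_deg, lon_deg, height, width):
    """Map a spherical direction to (row, col) in the warp-and-weft unfolded
    image: latitude +90 maps to row 0, latitude -90 to the bottom row, and
    longitude 0..360 wraps across the columns."""
    row = (90.0 - lat_deg) / 180.0 * (height - 1)
    col = (lon_deg % 360.0) / 360.0 * width
    return row, col

def pixel_to_sphere(row, col, height, width):
    """Inverse of sphere_to_pixel: recover (latitude, longitude) in degrees."""
    lat_deg = 90.0 - row / (height - 1) * 180.0
    lon_deg = col / width * 360.0
    return lat_deg, lon_deg
```

With this convention, two views whose polar axes are aligned with the baseline place the same scene point in the same column, which is exactly the property the correction step checks for.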
For the embodiment of the present invention, consider the two omnidirectional images shot by the system at a given moment: the upper-viewpoint omnidirectional image and the lower-viewpoint omnidirectional image, whose spherical coordinate systems are O1-X1Y1Z1 and O2-X2Y2Z2, respectively. In the ideal case, Z1 coincides with the direction of the line O1O2, Z2 coincides with the direction of Z1, X1 is parallel to X2, and Y1 is parallel to Y2. In the ideal case, after the two omnidirectional images are transformed into warp-and-weft unfolded images, the same object point in space has the same column coordinate in the two unfolded images.
When it is detected that the same object point in space has different column coordinates in the two warp-and-weft unfolded images, this indicates that the spherical coordinate systems of the two video acquisition devices are not aligned to the ideal case; the spherical coordinate system of at least one of the two video acquisition devices then needs to be rotated about its center to align it to the ideal case.
For example, the rotation about the center is expressed as the angles [Ax, Ay, Az] about the three coordinate-axis directions, and [Ax, Ay, Az] is computed automatically by a self-calibration method.
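A hedged sketch of applying such a correction: the three axis angles [Ax, Ay, Az] are composed into one rotation matrix, which then re-aligns a viewing direction on the sphere. The Rz*Ry*Rx composition order is an assumption — the patent does not specify a convention:

```python
import math

def rotation_xyz(ax, ay, az):
    """Compose rotations about the X, Y, and Z axes (angles in radians) into
    one 3x3 matrix, applied here in the assumed order Rz * Ry * Rx."""
    cx, sx = math.cos(ax), math.sin(ax)
    cy, sy = math.cos(ay), math.sin(ay)
    cz, sz = math.cos(az), math.sin(az)
    rx = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    rz = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]

    return matmul(rz, matmul(ry, rx))

def rotate(m, v):
    """Rotate a 3-vector (e.g. a viewing direction) by the matrix m."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]
```

In a full pipeline, every viewing direction of the misaligned camera would be rotated this way before re-sampling its unfolded image.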
Optionally, the embodiment of the present invention may further include step a (not marked in the figure): synchronizing the timestamps corresponding to the first omnidirectional video and the second omnidirectional video.
Step a may be located after the step of correcting the first omnidirectional video and the second omnidirectional video, or after step 201; the embodiment of the present invention places no limitation on this.
For the embodiment of the present invention, a first feature pixel is obtained in the first omnidirectional video, and a second feature pixel corresponding to the first feature pixel is determined in the second omnidirectional video. The motion trajectories corresponding to the first feature pixel and the second feature pixel are then determined; sampled feature extraction is performed on the motion trajectory corresponding to the first feature pixel (for example, at trajectory turning points where the motion direction changes abruptly) to obtain first sample points, and similar feature-extraction sampling is performed on the motion trajectory corresponding to the second feature pixel to obtain second sample points corresponding to the first sample points. It is then determined, on the same time axis, whether the first sample points are aligned with the second sample points (i.e., located on the same vertical line). If they are not aligned, the second sample points can be adjusted on the time axis according to the times corresponding to the first sample points, or the first sample points can be adjusted on the time axis according to the times corresponding to the second sample points, so as to synchronize the timestamps corresponding to the first omnidirectional video and the second omnidirectional video.
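As an illustrative sketch (function names assumed), once corresponding turning-point times have been extracted from the two trajectories, the clock offset that aligns them can be estimated as the mean pairwise difference and then applied to one video's timestamps:

```python
def estimate_clock_offset(first_samples, second_samples):
    """Estimate the constant timestamp offset between the two videos from
    corresponding turning-point times; for a pure shift, the least-squares
    solution is the mean of the pairwise differences."""
    assert len(first_samples) == len(second_samples) and first_samples
    n = len(first_samples)
    return sum(b - a for a, b in zip(first_samples, second_samples)) / n

def shift_timestamps(timestamps, offset):
    """Shift one video's timestamps so its sample points align with the
    other's on the common time axis."""
    return [t - offset for t in timestamps]
```

This assumes the two clocks differ only by a constant offset; drift between the clocks would require a more elaborate fit.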
Another way is to synchronize according to the time of a third-party terminal or a cloud server, so as to synchronize the timestamps corresponding to the first omnidirectional video and the second omnidirectional video.
The specific process of timestamp synchronization is shown in Fig. 4.
Step 202: determine one or two third omnidirectional videos according to the first omnidirectional video and the second omnidirectional video.
If one third omnidirectional video is determined, the second omnidirectional video and the third omnidirectional video have a stereoscopic parallax in a second direction; if two third omnidirectional videos are determined, the two third omnidirectional videos have a stereoscopic parallax in the second direction. The second direction is the row direction of the first omnidirectional video and the second omnidirectional video after they are unfolded according to longitude and latitude.
For example, if the first direction is the vertical direction, the second direction is the horizontal direction: a first omnidirectional video and a second omnidirectional video having a stereoscopic parallax in the vertical direction are obtained, and one or two third omnidirectional videos are determined according to the first omnidirectional video and the second omnidirectional video, where if one third omnidirectional video is determined, the second omnidirectional video and the third omnidirectional video have a stereoscopic parallax in the horizontal direction, and if two third omnidirectional videos are determined, the two third omnidirectional videos have a stereoscopic parallax in the horizontal direction.
Step 202 includes steps 2021-2022 (not marked in the figure):
Step 2021: determine an omnidirectional depth video according to the first omnidirectional video and the second omnidirectional video.
Specifically, step 2021 includes step 20211 (not marked in the figure):
Step 20211: determine the omnidirectional depth video according to the first omnidirectional video and the second omnidirectional video, through a trained deep learning network.
For the embodiment of the present invention, step 20211 specifically includes steps 20211a, 20211b, 20211c, and 20211d (not marked in the figure), wherein:
Step 20211a: based on the deep learning network, determine, in the second omnidirectional video, the pixel that matches each pixel in the first omnidirectional video.
Step 20211b: determine the depth information corresponding to each pair of matched pixels.
Step 20211c: based on the deep learning network, perform semantic labeling on each pixel in the second omnidirectional video.
Step 20211d: determine the omnidirectional depth video according to the depth information corresponding to each pair of matched pixels and the semantic labeling information corresponding to each pixel in the second omnidirectional video.
For the embodiment of the present invention, the deep learning network for the omnidirectional depth video includes: a stereo matching unit based on a deep neural network (DNN), a depth image estimation unit based on stereo matching, a DNN-based image semantic segmentation unit, an object geometric model unit, a semantic depth image generation unit, and an omnidirectional depth image output unit.
The depth image estimation unit based on stereo matching performs pixel matching and determines the depth information corresponding to each pair of matched pixels. The pixel matching process and the determination of the depth information corresponding to each pair of matched pixels are specifically as follows:
First step: input the first omnidirectional image OImage1 and the second omnidirectional image OImage2, both in warp-and-weft unfolded form.
Second step: perform the following operations for each pixel p1 in OImage1:
(1) For each pixel p2r located in OImage2 in the same column as p1, compare the image similarity of p1 and p2r, expressed numerically as S(p1, p2r); then, among all p2r, find the pixel with the largest value of S(p1, p2r), denoted p2.
Here S(p1, p2r) = D(d1, d2r), where D is the deep neural network obtained by a method based on a deep learning model;
(2) If S(p1, p2) > Ts, where Ts is an image similarity threshold, compute the distance of p1 and p2, mark p1 and p2 as pixels having a depth estimate, and assign the depth to p1; if S(p1, p2) < Ts, mark p1 and p2 as pixels without a depth estimate;
(3) For each pixel p2 in OImage2 marked as having a depth estimate, find its most similar pixel in OImage1 in the same manner as step (1); if the most similar pixel found is not p1, mark it as a pixel without a depth estimate;
(4) Output the omnidirectional depth image OImageD, containing all pixels that have a depth estimate, whose pixel values are the depth of the object from the system.
Following (1)-(4) above, OImageD may still contain pixels without a depth value.
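Steps (1)-(3) above can be sketched on a single pair of image columns as follows. The similarity function here is a toy stand-in for the learned score S(p1, p2r) (the patent uses a DNN for it), and all names are illustrative:

```python
def best_match(value, candidates, sim):
    """Index and score of the most similar candidate, as in step (1)."""
    scores = [sim(value, c) for c in candidates]
    j = max(range(len(candidates)), key=lambda k: scores[k])
    return j, scores[j]

def match_column(col1, col2, sim, ts):
    """Match pixels of one column of OImage1 against the same column of
    OImage2: keep a match only if its similarity exceeds the threshold Ts
    (step (2)) and the reverse lookup agrees (the left-right consistency
    check of step (3))."""
    matches = {}
    for i, v1 in enumerate(col1):
        j, s = best_match(v1, col2, sim)
        if s <= ts:
            continue  # too dissimilar: no depth estimate for this pixel
        k, _ = best_match(col2[j], col1, sim)
        if k == i:  # consistency holds: p2's best match back is p1
            matches[i] = j
    return matches
```

The matched row pair (i, j) gives the vertical disparity from which step (2)'s distance computation would recover depth via the known baseline.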
In the DNN-based stereo matching unit, an image-feature-extraction model best suited to stereo image matching is learned from a large amount of stereo-image training data. Specifically, the DNN model includes multiple layers of neural networks with weighted edge connections between layers. The input layer of the DNN model is two images, corresponding respectively to two image windows of identical size intercepted from the upper-viewpoint omnidirectional image and the lower-viewpoint omnidirectional image; the output layer of the DNN model is a floating-point number between 0 and 1. In the embodiment of the present invention, when this DNN model is trained, the training samples are image pairs with ground-truth label values; the two images of a pair are two image windows of identical size intercepted from the upper-viewpoint omnidirectional image and the lower-viewpoint omnidirectional image. When the two image windows correspond to the same object in space and cover the same position range, the label value is 1; otherwise, the label value is 0.
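For intuition, a classical hand-crafted stand-in for the learned window similarity — normalized cross-correlation rescaled to [0, 1] — is sketched below. This is not the patent's DNN; it merely has the same interface (two equal-size windows in, a score in [0, 1] out):

```python
import math

def ncc_similarity(win1, win2):
    """Normalized cross-correlation of two equal-size image windows
    (flattened to 1-D lists of intensities), rescaled from [-1, 1] to [0, 1]
    so that 1.0 means perfectly correlated and 0.0 anti-correlated."""
    n = len(win1)
    m1 = sum(win1) / n
    m2 = sum(win2) / n
    num = sum((a - m1) * (b - m2) for a, b in zip(win1, win2))
    den = math.sqrt(sum((a - m1) ** 2 for a in win1) *
                    sum((b - m2) ** 2 for b in win2))
    if den == 0.0:
        # flat (textureless) windows: correlation is undefined
        return 1.0 if win1 == win2 else 0.5
    return (num / den + 1.0) / 2.0
```

A learned similarity replaces this with features that are robust to the lighting and perspective differences between the two viewpoints, which is the stated motivation for using a DNN here.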
The DNN-based object image segmentation unit contains a DNN model that segments an image into non-overlapping regions, different regions corresponding to different objects, such as a person, a desk, a road surface, or a bicycle. Specifically, the DNN model comprises a multi-layer neural network with weighted connections between layers; the input layer of the model takes a single image, and the output layer produces an image of the same size as the input, in which each pixel holds an integer value representing an object category, different integer values corresponding to different classes of objects.
The semantic depth image generation unit generates a depth image with semantics. Specifically, in the segmentation result of the DNN-based image segmentation, each segmented region corresponds to an object in the image, and a three-dimensional model of that object can be retrieved from a three-dimensional model database. Using the depth image OImageD obtained from the stereo-matching-based depth estimation unit and the depth information distributed over the object, the three-dimensional pose of the object in the image can be estimated; the three-dimensional model of the object is then projected onto the image according to this pose, which yields the depth information of every pixel in that image region while the object category of every pixel is known at the same time. Such an image is referred to as a semantic depth image.
Further, for regions that are excessively small in area or lack depth estimates, the semantic depth image generation unit may fail to produce a result; nearest-neighbor interpolation is applied to these regions, filling them from neighboring regions that do have depth estimates. This yields a dense omnidirectional depth image in which every pixel has a depth value, which is the output of the output unit; that is, the deep-learning network finally outputs a dense omnidirectional depth image with a depth value for every pixel.
Step 2022: determine one or two third omnidirectional videos according to the second omnidirectional video and the omnidirectional depth video.
Step 2022 specifically includes steps S1-S3 (not shown in the figures):
Step S1: determine the depth information corresponding to a first pixel in the determined omnidirectional depth video, and determine a horizontal epipolar line according to the first pixel.
The first pixel is located in the second omnidirectional video.
Step S2: determine a second pixel according to the depth information corresponding to the first pixel in the determined omnidirectional depth video and the horizontal epipolar line.
Step S3: repeat steps S1-S2 until the third omnidirectional video is obtained.
The third omnidirectional video is composed of all the determined second pixels.
In an embodiment of the present invention, as shown in Figure 5a, for a pixel p2 in the left-viewpoint omnidirectional video, it is known that the object point P corresponding to that pixel lies on the ray determined by the optical center C2 of the left-viewpoint omnidirectional image and pixel p2. Using the depth value in the omnidirectional depth video, the position of point P on that ray can be obtained, i.e. the position of P in three-dimensional space is known. Point P is then projected to the "right-viewpoint video capture device", obtaining a pixel position p3 on the image plane of that device. For example, a virtual "right-viewpoint video capture device" C3 has the same imaging intrinsic parameters as the left-viewpoint video capture device, including focal length, resolution, and principal point. C3 lies on the straight line that passes through C2 and is perpendicular to the plane P-C2-p2; the distance between C3 and C2 is a chosen display stereo baseline length, which can equal the average human interpupillary distance or be adjusted according to the user's interpupillary distance. The pixel color of p3 equals the pixel color of p2.
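The reprojection of p2 through point P onto the virtual right viewpoint can be sketched under a simplified pinhole model. This is an illustration only: the patent works with omnidirectional latitude-longitude images, and the intrinsic matrix K, depth, and baseline below are assumed values.

```python
import numpy as np

def reproject_pixel(p2, depth, K, baseline):
    """Back-project pixel p2 of the left view to the 3-D point P using its
    depth, shift the camera center by the display stereo baseline along x
    (the virtual device C3 shares the intrinsics K of C2), and project P
    into the virtual right view to obtain pixel position p3."""
    u, v = p2
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    P = depth * ray / ray[2]                       # 3-D point at the given depth
    P_right = P - np.array([baseline, 0.0, 0.0])   # express P in the right camera
    q = K @ P_right
    return q[:2] / q[2]                            # pixel position p3
```

The pixel color at p3 is then taken from p2, as stated above.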
Step 2022 may alternatively include steps S4-S8 (not shown in the figures):
Step S4: determine a third pixel and the depth information corresponding to the third pixel in the omnidirectional depth video.
The third pixel is located in the second omnidirectional video.
Step S5: determine a vertical stereoscopic disparity according to the third pixel and the depth information corresponding to the third pixel in the omnidirectional depth video.
Step S6: determine, according to the vertical stereoscopic disparity, the horizontal stereoscopic disparity corresponding to the vertical stereoscopic disparity.
Step S7: obtain a fourth pixel according to the horizontal stereoscopic disparity and the third pixel.
Step S8: repeat steps S4-S7 until the third omnidirectional video is obtained.
The third omnidirectional video is composed of all the determined fourth pixels.
For example, denote the third pixel as p2 and its depth value in the depth image as D2. The vertical stereoscopic disparity of this pixel is computed as D_UD(p2) = f·B_UD / D2, where f is the focal length of the video capture device and B_UD is the baseline length between the upper and lower video capture devices. Based on the vertical stereoscopic disparity D_UD(p2) of the pixel, the horizontal stereoscopic disparity is computed as D_LR(p2) = D_UD(p2)·(B_LR / B_UD), where B_LR denotes the baseline length between the left-right stereo image pair. According to D_LR(p2), the color of pixel p2 is drawn at the corresponding position of the right-viewpoint omnidirectional image. B_LR can be set to the average human interpupillary distance, or adjusted according to the user's interpupillary distance. The above steps are then repeated until the virtual omnidirectional video is obtained, as shown in Figure 5b.
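The disparity conversion of steps S5-S6 follows directly from the two formulas above; note that the baseline ratio cancels B_UD, so the result equals f·B_LR/D2. A minimal sketch, with all parameter values assumed:

```python
def horizontal_disparity(depth, f, B_ud, B_lr):
    """Convert a pixel's depth into the vertical disparity of the up-down
    stereo pair, then rescale by the ratio of baselines to obtain the
    horizontal disparity of the virtual left-right pair."""
    d_ud = f * B_ud / depth          # D_UD(p2) = f * B_UD / D2
    d_lr = d_ud * (B_lr / B_ud)      # D_LR(p2) = D_UD(p2) * (B_LR / B_UD)
    return d_lr
```

With f = 100 px, B_UD = 0.2 m, B_LR = 0.065 m (roughly the average interpupillary distance) and a depth of 10 m, this gives a horizontal disparity of 0.65 px.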
In the virtual omnidirectional video generated by the above method, there may exist some black hole regions onto which no pixel is effectively projected, as shown in Figure 6: the objects corresponding to these regions appear under the observation viewpoint of the virtual omnidirectional video, but are invisible from the observation viewpoint of the second omnidirectional video because they are occluded by foreground objects. To generate a complete virtual omnidirectional video, image completion needs to be performed on these hole regions, obtaining the filled image shown in Figure 7.
Optionally, after step 2022, the method may further include: performing hole filling on the determined third omnidirectional video to obtain a hole-filled third omnidirectional video.
In an embodiment of the present invention, because the determined third omnidirectional video may contain black hole regions onto which no pixel is effectively projected, hole filling is performed on the third omnidirectional video.
The step of performing hole filling on the determined third omnidirectional video to obtain a hole-filled third omnidirectional video includes steps S9-S13 (not shown in the figures):
Step S9: determine a first omnidirectional image and a second omnidirectional image corresponding to the first omnidirectional image.
The first omnidirectional image belongs to the first omnidirectional video, and the second omnidirectional image belongs to the second omnidirectional video.
Step S10: crop image windows of identical size from the first omnidirectional image and the second omnidirectional image, obtaining a first window image and a second window image respectively.
Step S11: generate a third image corresponding to the second window image based on a generative adversarial network, the first window image, and the second window image.
The generative adversarial network includes an encoding network having high-level semantic attributes and a decoding network having low-level image attributes.
Step S12: determine the frame image in the third omnidirectional video that corresponds to the generated third image, and perform hole filling on the determined frame image.
Step S13: repeat steps S9-S12 until hole filling has been completed for every frame image in the third omnidirectional video.
Alternatively, the step of performing hole filling on the determined third omnidirectional video to obtain a hole-filled third omnidirectional video includes: determining a filling strategy corresponding to each frame image to be hole-filled in the determined third omnidirectional video, and performing hole filling according to the filling strategies to obtain the hole-filled third omnidirectional video.
Further, the step of determining the filling strategy corresponding to each frame image to be hole-filled in the determined third omnidirectional video may specifically include: inputting the images of a preset number of frames preceding each frame image to be hole-filled into the generative adversarial network, obtaining the filling strategy corresponding to each frame image to be hole-filled in the third omnidirectional video.
In an embodiment of the present invention, a simplified image completion approach is to select the nearest pixels around a hole and copy their colors directly into the hole.
For example, one specific method may use the following steps:
(1) Select one row of pixels in a hole region, and find the left-boundary pixel and the right-boundary pixel of that row. According to the depth information, determine which of the two boundary pixels is farther from the video capture device, and assign the brightness value of that pixel to all pixels in the row.
(2) Perform the operation of step (1) on every row of every hole region in the image.
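The two-step row filling above can be sketched as follows; this is a minimal illustration under assumed array layouts (single-channel image, boolean hole mask), not the patent's exact implementation:

```python
import numpy as np

def fill_holes_rowwise(image, depth, hole_mask):
    """For each run of hole pixels in each row, compare the depths of the
    left and right boundary pixels and copy the value of the farther
    (background) one across the run, as in steps (1)-(2).
    image, depth: 2-D arrays; hole_mask: True where the hole is."""
    out = image.copy()
    H, W = image.shape
    for y in range(H):
        x = 0
        while x < W:
            if hole_mask[y, x]:
                x0 = x
                while x < W and hole_mask[y, x]:
                    x += 1
                left, right = x0 - 1, x            # boundary pixel columns
                if left < 0:
                    src = right
                elif right >= W:
                    src = left
                else:
                    # farther boundary pixel wins: holes are disocclusions,
                    # so the background, not the foreground, should show
                    src = left if depth[y, left] >= depth[y, right] else right
                out[y, x0:x] = image[y, src]
            else:
                x += 1
    return out
```

Choosing the farther boundary pixel reflects the observation above that holes arise where background was occluded by foreground in the source view.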
An embodiment of the present invention also provides a filling approach using a method based on a deep neural network model; this method uses a network structure similar to a generative adversarial network (English full name: Generative Adversarial Net, English abbreviation: GAN).
The GAN model comprises a multi-layer neural network with weighted connections between layers. The first half of the network, close to the input layer, in which the number of neural units per layer gradually decreases, is called the encoding network; it can learn features of an image with high-level semantic attributes (e.g., object category and properties). The second half of the network, close to the output layer, in which the number of neural units per layer gradually increases, is called the decoding network; it can learn features of an image with low-level image attributes (e.g., image color and texture).
The input layer of the model takes two images, corresponding respectively to two image windows of identical size cropped from the upper-viewpoint omnidirectional image and the lower-viewpoint omnidirectional image. The output layer produces an image of the same size as the input, namely the right-viewpoint omnidirectional image corresponding to the image window in the lower-viewpoint omnidirectional image. In use, the image region of the generated right-viewpoint omnidirectional image corresponding to the hole region is filled into the hole region, where the upper-viewpoint omnidirectional image belongs to the upper-viewpoint omnidirectional video and the lower-viewpoint omnidirectional image belongs to the lower-viewpoint omnidirectional video.
When training the model, the input of each group of training samples is an upper-viewpoint omnidirectional image and a lower-viewpoint omnidirectional image, and the output is a right-viewpoint omnidirectional image. There are two methods for generating training samples:
Method 1: shoot training images with three video capture devices. Specifically, the three video capture devices lie in the same vertical direction, arranged at the upper, lower, and right positions and fixed by a mechanical rig, as shown in Figure 8. The upper and lower video capture devices form an up-down stereo image pair, and the lower and right video capture devices form a left-right stereo image pair. The rig is placed in various real-world environments to shoot training images.
Method 2: generate training images by computer-graphics simulation. Specifically, three virtual video capture devices are set up in a computer three-dimensional model world. The three virtual video capture devices lie in the same vertical direction, arranged at the upper, lower, and right positions. The upper and lower video capture devices form an up-down stereo image pair, and the lower and right video capture devices form a left-right stereo image pair, as shown in Figure 9.
In an embodiment of the present invention, when training the generative adversarial network, video training data are generated in a computer-graphics environment with a setup similar to the "image hole filling unit" described above; each group of video training data includes an upper-viewpoint omnidirectional video, a lower-viewpoint omnidirectional video, and a right-viewpoint omnidirectional video.
The method includes a set of image filling methods, which contains multiple image filling methods, such as filling based on image neighborhoods and filling based on a GAN.
The image-neighborhood-based filling method may have many variants, for example: filling row by row and/or column by column, and filling by color copying and/or texture copying.
The GAN-based filling method may also have many variants, for example: GAN models trained on data for different scenes and depth distributions exhibit different filling behaviors.
An embodiment of the present invention provides a video hole filling method that, in a manner similar to reinforcement learning, learns a strategy for filling holes in video images: when filling the holes in each frame of a video sequence, an optimal filling method is selected from the set of image filling methods according to the features of the hole-region images of several preceding frames, ensuring that the completed video is visually continuous in the temporal domain.
Specifically, let S denote the features of the hole-region images of several frames preceding the current frame, let a denote one filling method from the set of image filling methods, let Q(S, a) denote the estimated value of the video continuity obtained by applying filling method a under feature S, and let r(S, a) denote the immediate reward for taking that action. For example, r(S, a) may be computed as the image difference between the image at time t filled with method a and the image at time t-1: the two images are registered using the image regions that are farther away in depth, and the color differences over the filled region are then computed pixel by pixel and averaged. Let v be a discount factor, 0 < v < 1.
The steps of the learning process are as follows:
(1) Initialize Q(S, a) = 0 for every combination of S and a;
(2) Compute the feature S at the current time;
(3) Repeat the following steps a)-e) until the training video ends:
a) Select a method a0 that maximizes Q(S, a);
b) Fill the image hole region with method a0, and compute r(S, a0);
c) Obtain the feature S' at the next time after filling;
d) Update Q(S, a0) = r(S, a0) + v·max_a{Q(S', a)};
e) Set S = S'.
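The learning steps (1)-(3) amount to a greedy tabular Q-learning loop. The sketch below assumes a discrete feature S per frame and a stand-in reward function; the feature extraction and actual filling are abstracted away, and all names are illustrative:

```python
from collections import defaultdict

def train_fill_policy(frame_features, methods, reward, v=0.9):
    """Tabular Q-learning over the per-frame hole-region feature S,
    choosing the fill method a that maximizes Q(S, a).
    `reward(S, a)` stands in for the frame-difference score r(S, a)."""
    Q = defaultdict(float)                      # (1) Q(S, a) = 0 for all S, a
    for t in range(len(frame_features) - 1):
        S = frame_features[t]                   # (2) feature at current time
        a0 = max(methods, key=lambda a: Q[(S, a)])   # a) greedy action
        r = reward(S, a0)                       # b) fill and score
        S_next = frame_features[t + 1]          # c) feature after filling
        Q[(S, a0)] = r + v * max(Q[(S_next, a)] for a in methods)  # d)
        # e) S = S' happens implicitly through the loop index
    return Q
```

The learned table Q(S, a) is then consulted at fill time, as described in the following paragraph of the embodiment.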
In an embodiment of the present invention, the strategy Q(S, a) obtained by the above learning is used to fill the holes in the video.
In an embodiment of the present invention, under certain application scenarios, for example when the user shoots video while moving, in order to guarantee that a smoother omnidirectional video is output, the first omnidirectional video and the second omnidirectional video obtained by shooting need to be processed; the specific processing manner is step 301 (not shown in the figures):
Step 301: perform stabilization on the second omnidirectional video and/or the determined third omnidirectional video.
In an embodiment of the present invention, step 301 may cover two cases:
Case 1: if only one third omnidirectional video is generated, perform stabilization on the second omnidirectional video and that third omnidirectional video.
Case 2: if two third omnidirectional videos are generated, perform stabilization on the generated third omnidirectional videos.
Step 301 may specifically include step 3011 (not shown in the figures):
Step 3011: render the second omnidirectional video and/or the determined third omnidirectional video onto a video stabilization target trajectory, obtaining a stabilized second omnidirectional video and/or a stabilized third omnidirectional video.
The video stabilization target trajectory is determined as follows: according to the omnidirectional depth video, determine the position information of the three-dimensional environment model corresponding to each moment of the video capture device during its motion; according to the position information of the three-dimensional environment model corresponding to each moment, determine the three-dimensional motion trajectory of the video capture device in the world coordinate system; and filter the three-dimensional motion trajectory to obtain the video stabilization target trajectory.
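The final filtering step can be sketched with a moving-average low-pass filter over the camera positions. The patent does not fix a particular filter, so the filter choice and window size here are assumptions:

```python
import numpy as np

def stabilization_target(trajectory, window=9):
    """Low-pass filter the camera's 3-D motion trajectory with a simple
    moving average to obtain the stabilization target trajectory.
    trajectory: (N, 3) camera positions in the world coordinate system."""
    traj = np.asarray(trajectory, dtype=float)
    pad = window // 2
    padded = np.pad(traj, ((pad, pad), (0, 0)), mode='edge')  # repeat endpoints
    kernel = np.ones(window) / window
    return np.stack(
        [np.convolve(padded[:, k], kernel, mode='valid') for k in range(3)],
        axis=1)
```

A smoother alternative would be a Gaussian or spline fit; any low-pass filter of the raw trajectory serves as the rendering target of step 3011.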
In an embodiment of the present invention, hole filling of the third omnidirectional video may be performed either before or after step 3011; the embodiments of the present invention place no limitation on this.
The hole filling is performed in the same manner as in the above embodiments and is not repeated here.
An embodiment of the present invention provides a method of video processing. Compared with existing approaches, the embodiment obtains two omnidirectional videos having stereoscopic disparity in a first direction, namely a first omnidirectional video and a second omnidirectional video, and then determines a third omnidirectional video according to the first omnidirectional video and the second omnidirectional video, where the second omnidirectional video and the third omnidirectional video have stereoscopic disparity in a second direction. That is, the embodiment only needs to obtain two omnidirectional videos with stereoscopic disparity in the first direction; by converting the stereoscopic disparity in the first direction into stereoscopic disparity in the second direction, a third omnidirectional video lying on the same row direction as the second omnidirectional video can be obtained, or two third omnidirectional videos lying on the same row direction can be obtained. The second omnidirectional video and the third omnidirectional video, or the two third omnidirectional videos, can then be combined to present a stereoscopic omnidirectional video effect to the user, which provides the possibility and premise for doing so. Meanwhile, only two omnidirectional video capture devices are needed to complete video acquisition; this device structure can greatly reduce the volume of the omnidirectional video capture device and lower its cost. Being portable, compact, and low-cost, it can broaden the application scenarios of omnidirectional video capture devices, thereby improving the user experience.
Embodiment two
An embodiment of the present invention provides a device of video processing, as shown in Figure 10, comprising an obtaining module 1001 and a determining module 1002, wherein
the obtaining module 1001 is configured to obtain a first omnidirectional video and a second omnidirectional video,
wherein the first omnidirectional video and the second omnidirectional video have stereoscopic disparity in a first direction, the first direction being the column direction after the first omnidirectional video and the second omnidirectional video are unfolded according to longitude and latitude; and
the determining module 1002 is configured to determine one or two third omnidirectional videos according to the first omnidirectional video and the second omnidirectional video,
wherein, if one third omnidirectional video is determined, the second omnidirectional video and the third omnidirectional video have stereoscopic disparity in a second direction; if two third omnidirectional videos are determined, the two third omnidirectional videos have stereoscopic disparity in the second direction; and the second direction is the row direction after the first omnidirectional video and the second omnidirectional video are unfolded according to longitude and latitude.
An embodiment of the present invention provides a device of video processing. Compared with existing approaches, the embodiment obtains two omnidirectional videos having stereoscopic disparity in a first direction, namely a first omnidirectional video and a second omnidirectional video, and then determines a third omnidirectional video according to the first omnidirectional video and the second omnidirectional video, where the second omnidirectional video and the third omnidirectional video have stereoscopic disparity in a second direction. That is, the embodiment only needs to obtain two omnidirectional videos with stereoscopic disparity in the first direction; by converting the stereoscopic disparity in the first direction into stereoscopic disparity in the second direction, a third omnidirectional video lying on the same row direction as the second omnidirectional video can be obtained, or two third omnidirectional videos lying on the same row direction can be obtained. The second omnidirectional video and the third omnidirectional video, or the two third omnidirectional videos, can then be combined to present a stereoscopic omnidirectional video effect to the user, which provides the possibility and premise for doing so. Meanwhile, only two omnidirectional video capture devices are needed to complete video acquisition; this device structure can greatly reduce the volume of the omnidirectional video capture device and lower its cost. Being portable, compact, and low-cost, it can broaden the application scenarios of omnidirectional video capture devices, thereby improving the user experience.
An embodiment of the present invention provides a device of video processing that can implement the method embodiments provided above; for the specific functional implementation, reference is made to the descriptions in the method embodiments, which are not repeated here.
Those skilled in the art will appreciate that the present invention covers apparatuses for performing one or more of the operations described in this application. These apparatuses may be specially designed and manufactured for the required purposes, or may comprise known devices in general-purpose computers. These apparatuses have computer programs stored therein that are selectively activated or reconfigured. Such computer programs may be stored in a device-readable (e.g., computer-readable) medium or in any type of medium suitable for storing electronic instructions and respectively coupled to a bus. The computer-readable media include, but are not limited to, any type of disk (including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards. That is, a readable medium includes any medium in which information is stored or transmitted by a device (e.g., a computer) in a readable form.
Those skilled in the art will appreciate that each block of these structural diagrams and/or block diagrams and/or flow diagrams, and combinations of blocks therein, can be implemented by computer program instructions. Those skilled in the art will appreciate that these computer program instructions can be supplied to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing method for implementation, so that the solutions specified in a block or blocks of the structural diagrams and/or block diagrams and/or flow diagrams disclosed by the present invention are executed by the computer or the other programmable data processing method.
Those skilled in the art will appreciate that the various operations, methods, and steps, measures, and schemes in the processes discussed in the present invention can be alternated, changed, combined, or deleted. Further, other steps, measures, and schemes in the various operations, methods, and processes discussed in the present invention can also be alternated, changed, rearranged, decomposed, combined, or deleted. Further, steps, measures, and schemes in the prior art corresponding to the various operations, methods, and processes disclosed in the present invention can also be alternated, changed, rearranged, decomposed, combined, or deleted.
The above are only some embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (18)

1. A method of video processing, characterized by comprising:
obtaining a first omnidirectional video and a second omnidirectional video, wherein the first omnidirectional video and the second omnidirectional video have stereoscopic disparity in a first direction, and the first direction is the column direction after the first omnidirectional video and the second omnidirectional video are unfolded according to longitude and latitude;
determining one or two third omnidirectional videos according to the first omnidirectional video and the second omnidirectional video, wherein, if one third omnidirectional video is determined, the second omnidirectional video and the third omnidirectional video have stereoscopic disparity in a second direction; if two third omnidirectional videos are determined, the two third omnidirectional videos have stereoscopic disparity in the second direction; and the second direction is the row direction after the first omnidirectional video and the second omnidirectional video are unfolded according to longitude and latitude.
2. The method according to claim 1, wherein the step of determining one or two third omnidirectional videos according to the first omnidirectional video and the second omnidirectional video comprises:
determining an omnidirectional depth video according to the first omnidirectional video and the second omnidirectional video; and
determining the one or two third omnidirectional videos according to the second omnidirectional video and the omnidirectional depth video.
3. The method according to claim 1, wherein, after the step of determining one or two third omnidirectional videos according to the first omnidirectional video and the second omnidirectional video, the method further comprises:
performing hole filling on the determined third omnidirectional video to obtain a hole-filled third omnidirectional video.
4. The method according to any one of claims 1-3, wherein, after the step of obtaining the first omnidirectional video and the second omnidirectional video, the method further comprises:
correcting the first omnidirectional video and the second omnidirectional video.
5. The method according to claim 4, wherein the step of correcting the first omnidirectional video and the second omnidirectional video comprises:
determining, according to the first omnidirectional video and the second omnidirectional video, position and attitude error parameters of the video capture devices corresponding to the first omnidirectional video and the second omnidirectional video;
determining correction parameters according to the position and attitude error parameters; and
correcting the first omnidirectional video and the second omnidirectional video according to the correction parameters.
6. The method according to claim 5, wherein the method further comprises:
synchronizing the timestamps corresponding to the first omnidirectional video and the second omnidirectional video.
7. The method according to claim 1, wherein, after the step of determining one or two third omnidirectional videos according to the first omnidirectional video and the second omnidirectional video, the method further comprises:
enhancing the resolution corresponding to the second omnidirectional video and/or the determined third omnidirectional video.
8. The method according to claim 2, wherein the step of determining the omnidirectional depth video according to the first omnidirectional video and the second omnidirectional video comprises:
determining the omnidirectional depth video according to the first omnidirectional video and the second omnidirectional video and through a trained deep-learning network.
9. The method according to claim 8, wherein the step of determining the omnidirectional depth video according to the first omnidirectional video and the second omnidirectional video and through the trained deep-learning network comprises:
determining, based on the deep-learning network, the pixel in the second omnidirectional video that matches each pixel in the first omnidirectional video;
determining the depth information corresponding to each pair of matched pixels;
performing, based on the deep-learning network, semantic annotation on each pixel in the second omnidirectional video; and
determining the omnidirectional depth video according to the depth information corresponding to each pair of matched pixels and the semantic annotation information corresponding to each pixel in the second omnidirectional video.
10. The method according to claim 2, wherein the step of determining the third omnidirectional video according to the second omnidirectional video and the omnidirectional depth video comprises:
step S1: determining the depth information corresponding to a first pixel in the determined omnidirectional depth video, and determining a horizontal epipolar line according to the first pixel, the first pixel being located in the second omnidirectional video;
step S2: determining a second pixel according to the depth information corresponding to the first pixel in the determined omnidirectional depth video and the horizontal epipolar line;
step S3: repeating steps S1-S2 until the third omnidirectional video is obtained, wherein the third omnidirectional video is composed of all the determined second pixels.
11. The method according to claim 2, wherein the step of determining the third omnidirectional video according to the second omnidirectional video and the omnidirectional depth video comprises:
step S4: determining a third pixel and the depth information corresponding to the third pixel in the omnidirectional depth video, the third pixel being located in the second omnidirectional video;
step S5: determining a vertical stereoscopic parallax according to the third pixel and the depth information corresponding to the third pixel in the omnidirectional depth video;
step S6: determining, according to the vertical stereoscopic parallax, the horizontal stereoscopic parallax corresponding to the vertical stereoscopic parallax;
step S7: obtaining a fourth pixel according to the horizontal stereoscopic parallax and the third pixel;
step S8: repeating steps S4-S7 until the third omnidirectional video is obtained, wherein the third omnidirectional video is composed of all the determined fourth pixels.
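Steps S5-S7 of claim 11 can be sketched as follows, under assumptions the claim does not fix: angular disparity is approximately baseline over depth for a small baseline, so a vertical disparity of d_v maps to a horizontal disparity of d_v * (b_horiz / b_vert), and the rows of a 2:1 equirectangular image have the same angular pitch as its columns. The function name and wrapping behaviour are illustrative, not the patent's.

```python
import numpy as np

def synthesize_horizontal_view(frame, vertical_disp, b_vert, b_horiz):
    """Shift each pixel of an equirectangular frame along its row
    (the horizontal epipolar line) by the horizontal disparity derived
    from the measured vertical disparity.

    frame:         (H, W) equirectangular image from the second video.
    vertical_disp: (H, W) per-pixel vertical disparity in pixels.
    Returns the synthesized view and a mask of filled pixels; the
    unfilled entries are the disocclusion holes of claim 3.
    """
    h, w = frame.shape[:2]
    out = np.zeros_like(frame)
    filled = np.zeros((h, w), dtype=bool)
    # Small-baseline approximation: disparity scales with the baseline.
    horiz_disp = vertical_disp * (b_horiz / b_vert)
    cols = np.arange(w)
    for r in range(h):
        # Longitude wraps around, so shifts are taken modulo the width.
        target = np.round(cols + horiz_disp[r]).astype(int) % w
        out[r, target] = frame[r, cols]
        filled[r, target] = True
    return out, filled
```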
12. The method according to claim 3, wherein the step of performing hole-filling processing on the determined third omnidirectional video to obtain the hole-filled third omnidirectional video comprises:
step S9: determining a first omnidirectional image and a second omnidirectional image corresponding to the first omnidirectional image, wherein the first omnidirectional image belongs to the first omnidirectional video and the second omnidirectional image belongs to the second omnidirectional video;
step S10: cropping image windows of identical size from the first omnidirectional image and the second omnidirectional image, obtaining a first window image and a second window image respectively;
step S11: generating a third image corresponding to the second window image based on a generative adversarial network, the first window image, and the second window image, the generative adversarial network comprising an encoding network with high-level semantic attributes and a decoding network with low-level image attributes;
step S12: determining, in the determined third omnidirectional video, the frame image corresponding to the generated third image, and performing hole-filling processing on the determined frame image;
step S13: repeating steps S9-S12 until hole-filling processing has been completed for every frame image in the determined third omnidirectional video.
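Claim 12 fills disocclusion holes with a generative adversarial network. The sketch below is not that network: it is a deliberately simple stand-in (nearest-valid-neighbour propagation along each row, a common fallback in depth-image-based rendering) that shows where the hole-filling step slots into the pipeline; the function name and strategy are my own.

```python
import numpy as np

def fill_holes_rowwise(frame, filled):
    """Fill disocclusion holes in a synthesized equirectangular frame
    by copying the nearest valid pixel along each row, wrapping in
    longitude. A placeholder for the learned filling strategy of the
    claim, not a substitute for it.
    """
    out = frame.copy()
    h, w = frame.shape[:2]
    for r in range(h):
        valid = np.flatnonzero(filled[r])
        if valid.size == 0 or valid.size == w:
            continue  # nothing to copy from, or nothing to fill
        for c in np.flatnonzero(~filled[r]):
            # Nearest valid column, accounting for longitude wrap-around.
            d = np.abs(valid - c)
            d = np.minimum(d, w - d)
            out[r, c] = frame[r, valid[np.argmin(d)]]
    return out
```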
13. The method according to claim 3, wherein the step of performing hole-filling processing on the determined third omnidirectional video to obtain the hole-filled third omnidirectional video comprises:
determining the filling strategy corresponding to each frame image to be hole-filled in the determined third omnidirectional video;
performing hole-filling processing according to the filling strategy, obtaining the hole-filled third omnidirectional video.
14. The method according to claim 13, wherein the step of determining the filling strategy corresponding to each frame image to be hole-filled in the determined third omnidirectional video comprises:
inputting a preset number of frame images preceding each frame image to be hole-filled in the determined third omnidirectional video into the generative adversarial network, obtaining the filling strategy corresponding to each frame image to be hole-filled in the third omnidirectional video.
15. The method according to any one of claims 1-3 and 5-14, further comprising:
performing stabilization processing on the second omnidirectional video and/or the determined third omnidirectional video.
16. The method according to claim 15, wherein the step of performing stabilization processing on the second omnidirectional video and/or the determined third omnidirectional video comprises:
rendering the second omnidirectional video and/or the determined third omnidirectional video onto a video stabilization target trajectory, obtaining a stabilized second omnidirectional video and/or a stabilized third omnidirectional video.
17. The method according to claim 16, wherein the manner of determining the video stabilization target trajectory comprises:
determining, according to the omnidirectional depth video, the position information of the three-dimensional environment model corresponding to the video capture device at each moment during its motion;
determining the three-dimensional motion trajectory of the video capture device in the world coordinate system according to the position information of the three-dimensional environment model corresponding to the video capture device at each moment during its motion;
filtering the three-dimensional motion trajectory to obtain the video stabilization target trajectory.
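Claim 17's final step filters the recovered camera trajectory but does not fix the filter. A minimal sketch, assuming the simplest choice (a centred moving average with edge padding); the function name and window size are illustrative.

```python
import numpy as np

def stabilization_target_trajectory(track, window=9):
    """Low-pass filter a recovered 3-D camera trajectory to obtain a
    stabilization target trajectory.

    track: (N, 3) array, one world-coordinate camera position per frame.
    Returns an (N, 3) smoothed trajectory of the same length.
    """
    kernel = np.ones(window) / window
    pad = window // 2
    # Edge-pad so the smoothed trajectory keeps one sample per frame.
    padded = np.pad(track, ((pad, pad), (0, 0)), mode="edge")
    return np.stack(
        [np.convolve(padded[:, k], kernel, mode="valid") for k in range(3)],
        axis=1,
    )
```

Rendering the omnidirectional video from the smoothed positions instead of the jittery recovered ones is the stabilization of claim 16.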
18. A device for video processing, comprising:
an obtaining module, configured to obtain a first omnidirectional video and a second omnidirectional video, the first omnidirectional video and the second omnidirectional video having a stereoscopic parallax in a first direction, the first direction being the column direction corresponding to the first omnidirectional video and the second omnidirectional video after expansion according to longitude and latitude;
a determining module, configured to determine one or two third omnidirectional videos according to the first omnidirectional video and the second omnidirectional video, wherein if one third omnidirectional video is determined, the second omnidirectional video and the third omnidirectional video have a stereoscopic parallax in a second direction; if two third omnidirectional videos are determined, the two third omnidirectional videos have a stereoscopic parallax in the second direction; the second direction being the row direction corresponding to the first omnidirectional video and the second omnidirectional video after expansion according to longitude and latitude.
CN201710347063.8A 2017-05-16 2017-05-16 Video processing method and device Active CN109246415B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710347063.8A CN109246415B (en) 2017-05-16 2017-05-16 Video processing method and device


Publications (2)

Publication Number Publication Date
CN109246415A true CN109246415A (en) 2019-01-18
CN109246415B CN109246415B (en) 2021-12-03

Family

ID=65082943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710347063.8A Active CN109246415B (en) 2017-05-16 2017-05-16 Video processing method and device

Country Status (1)

Country Link
CN (1) CN109246415B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113409374A (en) * 2021-07-12 2021-09-17 东南大学 Character video alignment method based on motion registration

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160042244A1 (en) * 2014-08-07 2016-02-11 Ricoh Company, Ltd. Image feature extraction method and system
CN105869201A (en) * 2016-03-25 2016-08-17 北京全景思维科技有限公司 Method and device for achieving smooth switching of panoramic views in panoramic roaming
CN106534832A (en) * 2016-11-21 2017-03-22 深圳岚锋创视网络科技有限公司 Stereoscopic image processing method and system



Also Published As

Publication number Publication date
CN109246415B (en) 2021-12-03

Similar Documents

Publication Publication Date Title
US11227435B2 (en) Cross reality system
EP3234806B1 (en) Scalable 3d mapping system
US11900547B2 (en) Cross reality system for large scale environments
US11257294B2 (en) Cross reality system supporting multiple device types
US20210256767A1 (en) Cross reality system with accurate shared maps
CN108446310B (en) Virtual street view map generation method and device and client device
US11551430B2 (en) Cross reality system with fast localization
US20210112427A1 (en) Cross reality system with wireless fingerprints
US11562525B2 (en) Cross reality system with map processing using multi-resolution frame descriptors
CN105631861B (en) Restore the method for 3 D human body posture from unmarked monocular image in conjunction with height map
WO2015188684A1 (en) Three-dimensional model reconstruction method and system
CN106204443A (en) A kind of panorama UAS based on the multiplexing of many mesh
EP4107702A1 (en) Cross reality system with wifi/gps based map merge
Pylvanainen et al. Automatic alignment and multi-view segmentation of street view data using 3d shape priors
EP4111293A1 (en) Cross reality system for large scale environment reconstruction
CN113657357B (en) Image processing method, image processing device, electronic equipment and storage medium
WO2020072972A1 (en) A cross reality system
CN108564654B (en) Picture entering mode of three-dimensional large scene
CN109246415A (en) The method and device of video processing
US9240055B1 (en) Symmetry-based interpolation in images
CN116843867A (en) Augmented reality virtual-real fusion method, electronic device and storage medium
US11223815B2 (en) Method and device for processing video
CN113739797A (en) Visual positioning method and device
CN117440140B (en) Multi-person remote festival service system based on virtual reality technology
US20240135656A1 (en) Cross reality system for large scale environments

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant