CN1134175C - Multi-camera video object extraction video image communication system and realization method thereof - Google Patents
Multi-camera video object extraction video image communication system and realization method thereof
- Publication number: CN1134175C
- Application number: CN00121441A (CNB001214411A)
- Authority: CN (China)
- Prior art keywords: line segment, sub-segment, video, threshold, matching
- Legal status: Expired - Fee Related (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The present invention relates to a video image communication system with multi-camera video object extraction and a method of realizing it, belonging to the technical field of content-based video image communication. The system comprises a transmitting end and a receiving end connected by a communication channel; the transmitting end is composed of a multi-view, multi-feature video object extraction unit connected to multiple cameras and a video object coding unit. The invention can acquire the three-dimensional spatial information of physical objects and realizes the key technique of extracting a physical object's depth information from multiple video streams in real time, so that video object extraction can be performed rapidly.
Description
The invention belongs to the technical field of content-based video image communication, and in particular to video object extraction and content-based video image coding methods.
Content-based video image communication built on video objects is a major part of ISO/IEC MPEG-4. In that standard a video object is delimited by a binary image sequence, but how this binary image sequence is obtained is outside the scope of the international standard; that is, the extraction of video objects is left open. Video object extraction remains an open research problem in image analysis and understanding: it is bound up with how images are captured, represented and processed, and also with the properties of human vision and the differing interests of different viewers. Existing video object extraction methods in image communication systems fall into the following classes:
(1) Texture-based video object extraction, which segments the image along texture discontinuities. A representative document is M. Kunt, A. Ikonomopoulos, and M. Kocher, "Second generation image coding techniques", Proceedings of the IEEE, Vol. 73, No. 4, pp. 549-575, 1985.
(2) Motion-based video object extraction, which segments video objects by fitting motion models. A representative document is Hans George Musmann, Michael Hotter, and Jorn Ostermann, "Object-oriented analysis-synthesis coding for moving images", Image Communication, Vol. 1, pp. 117-138, 1989.
(3) Color-based video object extraction, which segments video objects along color discontinuities. A representative document is Li H. B. and Forchheimer R., "Location of Face Using Color Cue", Picture Coding Symposium, P.2.4, 1993.
(4) Multi-feature video object extraction, for which the literature is larger. Examples include segmenting video objects with motion and edge features: Reinders, M., van Beek, P., Sankur, B., and van der Lubbe, J., "Facial feature localisation and adaptation of a generic face model for model-based coding", Signal Processing: Image Communication, Vol. 7, No. 1, pp. 57-74, 1995; and segmenting face objects with motion and color features: T. Xie, Y. He, and C. Weng, "A layered video coding scheme for very low bit rate videophone", Picture Coding Symposium, pp. 343-347, Berlin, 1997.
Video communication systems based on the above methods all use a single camera to acquire the video image, and are called single-camera content-based video communication systems. A single-camera system extracts video objects using motion, texture, color and certain prior knowledge, encodes the video objects one by one, and sends the result to the communication channel. After the receiving end receives the signal, it decodes the codewords, reconstructs the video objects, and displays them on a video display. The structure of such a communication system is shown in Fig. 1: the transmitting end consists of two units, a "single-view video object extraction unit" and an "object-based video coding unit"; the receiving end likewise consists of two units, a "video object decoding unit" and a "video object display unit".
Another class is the multi-view video communication system, hereinafter "multi-view system". Existing multi-view systems comprise the "multi-view to multi-view" type and the "multi-view to single-view" type:
Multi-view to multi-view systems include automatic surveillance systems, multi-position live broadcasting systems and camera array systems. As shown in Fig. 2, the structure mainly comprises, at the transmitting end, two or more (1, ..., n) "single-view signal encoding units", each connected to one camera. The n single-view bitstreams enter a "multi-channel video bitstream multiplexing unit", are mixed, and are sent to the communication channel. At the receiving end a "multi-channel video bitstream demultiplexer" separates the composite stream into n independent bitstreams, n "single-view signal decoding units" restore them into n video images, and n video displays show them respectively. The characteristic of this type of system is that there is no essential connection between the views; it merely combines several single-view communication systems at the system level into a whole with a certain function. Multi-position live broadcasting is characterized by multiple video channels of the same scene, with no special requirements on the concrete imaging parameters, whereas a camera array not only covers the same scene but also imposes rather strict requirements on the relative positions of the cameras and on the camera parameters of each unit; concrete applications include stereoscopic video communication.
Multi-view to single-view systems include view-selection systems and scene reconstruction systems.
The view-selection system, shown in Fig. 3, mainly comprises a position judgment module, a plurality of view acquisition modules, a multiplexer, a single-channel video encoding module and a single-channel video decoding module. Its general workflow is: the position judgment module first determines the observer's current position and passes the position information back to the multiplexer's control; according to that information the multiplexer selects a suitable view (or generates an intermediate view by simple interpolation) and sends the resulting image to the video encoder; the encoder encodes the input image and the bitstream reaches the decoding end over the channel; the decoder decodes the bitstream, produces the decoded image and delivers it to the end user.
The scene reconstruction system, shown in Fig. 4, mainly comprises a plurality of view acquisition modules, a scene reconstruction module, a virtual scene projection module, a position judgment module and corresponding encoder/decoder modules. Its general workflow is: the multi-channel video input modules feed the acquired videos to the scene reconstruction module; the scene reconstruction module reconstructs a virtual 2D or 3D scene from the input views; the position judgment module determines the observer's position within the virtual scene and sends it to the virtual scene projection module; the projection module generates a virtual view from the observer's position and the reconstructed scene and sends it to the video encoder; the encoder encodes, the bitstream reaches the decoder over the channel, and the decoder decodes and delivers the image to the end user. This type of system does not analyze the content of the images; unlike the system of Fig. 3, the displayed view is not simply chosen among several available views but is synthesized into the corresponding view.
The above methods and systems have the following shortcomings. The single-camera video communication system loses the three-dimensional information of the physical object during image capture and uses the projected two-dimensional image as the sole source for video image analysis and coding, so its results carry great uncertainty: since the purpose of video object segmentation is to separate foreground from background in the video image, segmenting from two-dimensional information alone is the main cause of this uncertainty. As for multi-view systems, because the computational load of matching information between views is very large, multi-view applications such as depth matching algorithms have not yet been developed for content-based communication; the key problem for applying them is real-time extraction of depth information.
The object of the invention is to overcome the shortcomings of the prior art by proposing a video image communication system with multi-camera video object extraction. Multiple cameras input the video images, so the three-dimensional spatial information of the physical object can be obtained; this depth information provides an important basis for separating foreground from background. The proposed realization method simultaneously solves the key technical problem of an algorithm for extracting the depth information of physical objects from multiple video streams in real time, so that depth extraction can be performed fast.
The invention proposes a video image communication system with multi-camera video object extraction, comprising a transmitting end composed of a video object extraction unit and an object-based video coding unit, and a receiving end composed of a video object decoding unit and a video object display unit, said transmitting end and receiving end being connected by a communication channel; characterized in that said video object extraction unit is a multi-view video object extraction unit connected to a plurality of cameras, which performs matching operations on a plurality of video streams simultaneously, extracts the depth information of the video objects, and, on the basis of the depth information, segments the video objects in combination with their motion, color and shape features.
The system of the invention is a two-way communication system: each communication end contains both a transmitting unit and a receiving unit, working simultaneously.
The invention further proposes a method of realizing said system, comprising the following steps:
(1) at the transmitting end, video images are input by a plurality of cameras; one video stream is the target image and the remaining streams are auxiliary images;
(2) with the help of the auxiliary images, the analysis and extraction of said depth information is performed on the target image, and a comprehensive video object extraction judgment based on depth, motion, color and shape features is carried out; the pixel-position correspondences between the video streams are analyzed, the depth matching result of the scene is computed, and a 3D object segmentation based on this matching result extracts the video object, the result being expressed as the binary image sequence of the video object;
(3) the object-based video coding unit encodes the source target image per video object according to the binary image sequence, forming a video object bitstream that is sent to the communication channel;
(4) at the receiving end, the video object decoding unit restores the video object bitstream into video object images;
(5) the video object display unit displays each video object independently.
The definitions used in the method of the invention are as follows:
Target image: a given frame of the video to be segmented.
Reference image: the corresponding frame of the reference video.
Target segment: the intersection of an epipolar plane with the target image; if the line joining the optical centres of the two optical systems is parallel to the horizontal scan direction, it is part (or all) of a scan line.
Reference segment: the intersection of the reference image with the same epipolar plane. In fact, for the reason described below, the pixel-wise matching problem between reference image and target image can, under specific assumptions, be converted into a matching problem between reference segments and target segments.
Segment matching: segment A is defined to match segment B when the start and end points of target segment A coincide with those of reference segment B.
Sub-segment: a contiguous subset of the points of a target or reference segment; a segment is divided into non-overlapping sub-segments.
Matching degree: determined by the magnitude of the matching measure function value.
Histogram of a segment: the brightness statistics of the pixels on a segment, i.e. the correspondence between each brightness value and the number of pixels having that brightness.
Histogram operation: an image transform that maps pixels of a given brightness to one or more other brightness values.
The fast analysis and extraction of depth information between the multiple video streams obtained by the cameras adopts a multi-pass, layer-by-layer refinement algorithm, each layer comprising the following steps:
(1) input the target segment and each reference segment;
(2) apply histogram adjustment to said target and reference segments respectively;
(3) establish feature thresholds on the adjusted segments;
(4) coarsely segment the segments with these thresholds to obtain sub-segments, then extract features of the sub-segments from the histogram;
(5) perform feature matching between target sub-segments and reference sub-segments;
(6) judge from the matching result whether further splitting is required;
(7) if the condition is not satisfied, descend one layer and repeat steps (1) to (7).
Finally the matching results of all layers are fed into the segmentation module, completing segmentation and matching at the specified accuracy; a code sketch of this loop is given below.
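Expressed in code, one layer of this loop has the following shape. This is a structural sketch only: the helper functions are hypothetical names for steps (2)-(6), not part of the patent, and possible implementations of the first three are sketched later in this description.

```c
/* Structural sketch of one layer of the coarse-to-fine loop (steps 1-7).
 * All helper functions are hypothetical placeholders. */
typedef struct {
    int start;                /* first pixel index within the segment */
    int mmax, mmin, mlength;  /* sub-segment feature values           */
    double maverage;
} SubSegment;

void histogram_adjust(unsigned char *pix, int len, int th1, int vmax);
int  find_feature_threshold(const unsigned char *a, int na,
                            const unsigned char *b, int nb);
int  extract_sub_segments(const unsigned char *pix, int len, int th,
                          SubSegment *out, int max_out);
/* step (5): match target sub-segments against reference sub-segments */
void match_sub_segments(const SubSegment *t, int nt,
                        const SubSegment *r, int nr, int *match);
/* step (6): returns 1 if some sub-segments still need splitting */
int  needs_refinement(const int *match, int nt);

void layered_match(unsigned char *target, int tlen,
                   unsigned char *ref, int rlen,
                   int th1, int vmax, int max_layers)
{
    for (int layer = 0; layer < max_layers; ++layer) {
        histogram_adjust(target, tlen, th1, vmax);                /* (2) */
        histogram_adjust(ref, rlen, th1, vmax);
        int th = find_feature_threshold(target, tlen, ref, rlen); /* (3) */
        SubSegment ts[64], rs[64];
        int nt = extract_sub_segments(target, tlen, th, ts, 64);  /* (4) */
        int nr = extract_sub_segments(ref, rlen, th, rs, 64);
        int match[64];
        match_sub_segments(ts, nt, rs, nr, match);                /* (5) */
        if (!needs_refinement(match, nt))                         /* (6) */
            break;  /* accuracy reached: results go to the segmentation
                       module */
        /* step (7): descend one layer and repeat on the sub-segments
           still in doubt */
    }
}
```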
The above histogram adjustment is applied separately to the target segment and the reference segments coming from the two fields of view; the concrete method is:
(1) find the maximum brightness Max and minimum brightness Min of the whole target segment;
(2) if the difference between Max and Min is less than a threshold Th1, set the brightness of every point on the segment to the segment's mean brightness; otherwise apply the following luminance transformation to every point on the segment:
g(x) = (f(x) - Min) × Vmax / (Max - Min)
where f(x) is the value to be transformed, g(x) is the transformed result, and Vmax is the brightness range of the system.
The concrete steps of the above feature-threshold establishment are as follows. After histogram adjustment, the whole segment is divided into different regions by thresholds, so that segment matching can seek the correspondence of each sub-segment: (1) set a threshold Th2 to a value slightly below 50%; (2) if Th2 < 30%, apply histogram equalization to the histogram-adjusted segment; (3) find the brightness value DU such that the proportion of pixels in the two segments brighter than DU just exceeds Th2; (4) find the brightness value DD such that the proportion of pixels darker than DD just exceeds Th2; (5) count the pixels between brightness DD and DU and look for a local valley in their histogram; (6) if no local valley appears, decrease Th2 and repeat (2)-(5); (7) if several valleys appear, increase Th2 and repeat (2)-(5); (8) take the valley as the threshold.
The above sub-segment feature extraction may adopt the following concrete steps: (1) segment the target and reference segments with the above threshold; (2) join consecutive points of the same attribute into sections; (3) extract as feature values of each sub-segment its maximum Mmax, its minimum Mmin, its length Mlength, and the average brightness Maverage of its pixels.
The above sub-segment feature matching may adopt the following concrete steps: (1) suppose the target segment is split into m non-overlapping sub-segments, denoted C[1] ... C[m], and the reference segment into n non-overlapping sub-segments, denoted R[1] ... R[n]; the feature value of each is the pixel mean of the corresponding sub-segment; (2) let the weight of each sub-segment, KC[i] or KR[j], equal the length of that sub-segment; (3) take a part (i ... i+4, j ... j+4) of the m × n space; (4) determine the matching degree: a one-to-one sub-segment pair, where target sub-segment C[i] corresponds to reference sub-segment R[j], produces a matching measure FV[i, j]; a one-to-many match, where target sub-segments C[i] and C[i+1] both correspond to reference sub-segment R[j], produces a corresponding measure for that part; for an unmatched sub-segment C[i] or R[j], the matching measure is separately defined as
FV[i, 0] = KC[i] × OcP
FV[0, j] = KR[j] × OcP
where OcP is the occlusion penalty factor; (5) for each candidate matching path, compute the FV values on each of its sub-sections; the final matching measure SFV of the whole matching path is the sum of all FV values along the path; (6) compute the candidate path with the smallest matching measure.
The judgment on whether to continue splitting a matched sub-segment may adopt the following concrete steps. Since the purpose of the whole algorithm is 3D object segmentation, sub-segments that can already be assigned wholly to object or background need not be matched further. Excluded from further matching are: (1) sub-segments without brightness variation, i.e. those with Mmax - Mmin < a threshold Th3; (2) sub-segments that are too short, i.e. those with Mlength < a threshold Th4; (3) sub-segments whose entire counterpart satisfies the above conditions; (4) matched sub-segment pairs that, after being brought to equal length by interpolation, give a whole-segment SAD value below a threshold Th5; (5) sub-segments without any match, which are treated as occluded regions and not matched further.
On the basis of the depth information, the invention extracts video objects with multiple features, comprising the following steps: (1) supplement the result of the depth analysis with a color-based judgment; (2) supplement the depth information with a motion-based judgment; (3) further extensions using other information may also be adopted; (4) segment the video object with a split-merge method.
Supplementing the depth analysis with color information may adopt the following concrete steps: (1) divide the target image into spatial subregions based on color, using threshold division of a directional neighbourhood minimum-difference map; (2) merge the color subregions with a region-flooding algorithm; (3) combine with the depth information: perform subregion depth thresholding according to the maximum-likelihood mean depth of each color subregion.
Supplementing the depth information with motion information may adopt the following concrete steps: (1) use different motion patterns as the criterion for dividing subregions; (2) use identical motion patterns of different subregions as the basis for merging; (3) use motion vectors for inter-frame inheritance of the object segmentation.
The above extensions based on other information may include edge information, higher-level processing information, and so on.
The split-merge segmentation of the video object may comprise:
Splitting first, with the following concrete steps: (1) define a split decision function F_seg(A|I), where I is the target image to be segmented and A is a connected subregion of it; (2) when the split decision function at subregion A exceeds a set split threshold, i.e. F_seg(A|I) > Th_seg, subregion A is further divided into m subregions; (3) the division is chosen to minimize the sum of a metric function over A, that is
{A_1, ..., A_m} = arg min Σ_{i=1..m} D(A_i)
where D(·) is the adopted subregion division metric function.
Then merging, with the following concrete steps: (1) define a merge decision function F_merge(A_1, A_2, ..., A_n | I), where A_i (i = 1, 2, ..., n) are any n connected subregions of I; (2) when the merge decision function is below a set threshold, the n subregions are merged into one subregion A. The above splitting and merging alternate iteratively.
Applying the above split-merge method to video object segmentation with multiple information features may adopt the following concrete steps: (1) adopt N features (F_1, F_2, ..., F_N)^T and first divide them into two not mutually exclusive groups
U_seg = (F_i1, F_i2, ..., F_iK)^T
U_merge = (F_j1, F_j2, ..., F_jL)^T
where U_seg is the feature set used for splitting and U_merge the feature set used for merging; (2) design F_seg(A|I) and F_merge(A_1, A_2, ..., A_n | I), and the division metric function D(·), according to U_seg and U_merge respectively; (3) substitute the obtained F_seg(A|I), F_merge(A_1, A_2, ..., A_n | I) and D(·) into the split-merge formulas above, yielding a split-merge algorithm combining multiple features; (4) use the subregion maximum-likelihood depth as the merge decision of the multi-feature split-merge algorithm.
The above maximum-likelihood depth decision may adopt the following concrete steps: (1) define the maximum-likelihood depth of subregion A as the x maximizing the posterior probability
P(d(z) = x | z ∈ A, I, Dd(I))
where d(z) is the depth of pixel z, A is the subregion to be judged, I is the target image to be segmented, and Dd(I) is the disparity field; (2) simplify the subregion maximum-likelihood depth into the two-valued criterion
F_dis = P(d(z) < Th_d | z ∈ A, I, Dd(I)),
the proportion of points in the subregion whose depth is below a given threshold; (3) incorporate the depth information into the steps of the split-merge algorithm.
The 3D object segmentation based on the matching results may adopt the following steps: (1) according to the sub-segment matching results, assign to the object those sub-segments whose disparity at both the matching start point and the matching end point exceeds a threshold Th6; (2) assign to the background those sub-segments whose disparity at both points does not exceed Th6; (3) continue the split-and-match iteration for the remaining regions; (4) until the overall segmentation result satisfies the required accuracy.
Features and effects of the invention:
In the video image communication system with multi-camera video object extraction proposed by the invention, multiple cameras provide the image input, forming the concept and system realization of content-based video coding and communication. The video object extraction unit performs matching operations on the plurality of video streams and thereby segments the video object information according to depth, color, motion and other information related to the physical video object. The object-based video coding unit encodes the segmented video objects and sends them to the transmission channel. At the receiving end, the video decoding unit decodes the bitstream into separate video objects, and the video display finally displays the different video objects independently.
Because the invention adopts a fast algorithm for extracting depth information between the multiple video streams obtained by the cameras, depth extraction can be performed fast, so that the video image communication system with multi-camera video object extraction becomes realizable.
Because the invention's multi-feature video object extraction algorithm is based on depth information, object extraction achieves better results, providing a better object source for content-based video communication; segmenting video objects with multiple features has high efficiency and accuracy.
Brief description of the drawings: Fig. 1 is a block diagram of the existing single-camera content-based video communication system. Fig. 2 is a block diagram of the existing multi-view to multi-view video communication system. Fig. 3 is a block diagram of the existing view-selection multi-view to single-view video communication system. Fig. 4 is a block diagram of the existing scene-reconstruction multi-view to single-view video communication system. Fig. 5 is a block diagram of the video image communication system with multi-camera video object extraction of the invention. Fig. 6 illustrates the parallel optical axis condition and the reduction of the search to one dimension. Fig. 7 illustrates the geometric projection on the epipolar plane. Fig. 8 is a flow chart of the fast depth information extraction method of the invention. Fig. 9 illustrates the optimal matching of split sub-sections. Fig. 10 illustrates the candidate paths of the smallest matching measure. Fig. 11 shows simulation results of the fast depth extraction of the invention, wherein:
Fig. 11(a) is the left-frame video input of the ball_letter sequence (500 × 500);
Fig. 11(b) is the right-frame video input of the ball_letter sequence (500 × 500);
Fig. 11(c) is the ball_letter segmentation result;
Fig. 11(d) is the left-frame video input of the man sequence (384 × 384);
Fig. 11(e) is the right-frame video input of the man sequence (384 × 384);
Fig. 11(f) is the man sequence segmentation result.
The working principle and embodiments of the invention are described in detail below with reference to the drawings.
The architecture of the video image communication system with multi-camera video object extraction of the invention is shown in Fig. 5. It comprises a transmitting end composed of a multi-view, multi-feature video object extraction unit and an object-based video coding unit, and a receiving end composed of a video object decoding unit and a video object display unit; transmitting end and receiving end are connected by a communication channel. The video object extraction unit is connected to a plurality of cameras, simultaneously performs matching operations on the depth information between the target image and the video streams formed by the auxiliary images, segments the video object information, and expresses the result as the binary image sequence of the video object. The object-based video coding unit encodes the source target image per video object according to that binary image sequence, forming a video object bitstream that is sent to the communication channel. At the receiving end the video object decoding unit restores the bitstream into video object images, and the video object display unit displays the video objects independently.
Analysis of the principle of the invention's fast extraction of depth information between the multiple video streams obtained by the cameras:
Take two cameras as an example. If the geometric placement of the two cameras satisfies the parallel optical axis condition, the matching problem between the two video images reduces to a one-dimensional search, as shown in Fig. 6. Suppose the stereo projection system satisfies the parallel optical axis (epipolar) condition, i.e. the optical axes of projection systems O1 and O2 are parallel (assume, without loss of generality, along the Z direction). The projections of a spatial point P in the two fields of view must then lie in the plane determined by P and the two projection centres; this plane is the epipolar plane, and P1 and P2 lie on the epipolar plane P-O1-O2. The projections of spatial points on an epipolar plane X must also lie on the intersections of X with the corresponding projection planes: if F1 is the intersection of X with the image plane S1 of system O1, and F2 is the intersection of X with the image plane S2 of system O2, then the projection in O2 of the spatial point corresponding to any point on F1 must fall on F2, and vice versa. The search for spatial correspondences therefore reduces to matching points on two straight lines, which greatly lowers the complexity of the problem. Moreover, if O1O2 is parallel to the horizontal scan lines, every epipolar plane is parallel to the scan lines, so the data on corresponding scan lines of the two final images necessarily come from the same epipolar plane; that is, the search for matching pairs in the two fields of view reduces to searching for matching pairs on corresponding scan lines.
The relation between the position of a spatial point in the stereo projection system and its spatial depth is shown in Fig. 7. Without loss of generality, suppose the distance between the image plane and the lens centre of each camera is l (in most cases l is approximately equal to the focal length f), and the spacing between the optical centres of the two cameras is 2d.
From the relative positions py1 and py2 of the projections P1 and P2 of the object point P on the two image planes, the spatial coordinates of P can be obtained. P lies on the line O1P1, so xp and yp satisfy
py1 / l = (yp + d) / xp.
P simultaneously lies on the line O2P2, so xp and yp also satisfy
py2 / l = (yp - d) / xp.
Solving the two equations together gives
xp = 2 d l / (py1 - py2).
The depth xp of the object point therefore depends only on the difference py1 - py2 of the relative positions of its projections on the two image planes, not on the individual values of py1 and py2; it suffices to obtain the disparity of the object point in the stereo image pair.
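In code the depth computation reduces to a single division. A minimal sketch in C (the implementation language reported for the experiments below); the function name and the sign convention for py1 and py2 are the ones assumed above:

```c
#include <math.h>

/* Depth of a scene point from its two projections.
 * 2*d = optical-centre spacing, l = image-plane distance
 * (approximately the focal length f); py1, py2 are the projection
 * ordinates measured relative to each camera's own optical axis.
 * Returns xp = 2*d*l / (py1 - py2). */
double depth_from_disparity(double d, double l, double py1, double py2)
{
    double disparity = py1 - py2;
    if (fabs(disparity) < 1e-12)
        return INFINITY;        /* zero disparity: point at infinity */
    return 2.0 * d * l / disparity;
}
```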
The fast analysis and extraction of depth information between the multiple video streams obtained by the cameras may adopt the multi-pass, layer-by-layer refinement algorithm shown in Fig. 8, each layer comprising the following steps: (1) input the target segment and each reference segment; (2) apply histogram adjustment to the target and reference segments respectively; (3) establish feature thresholds on the adjusted segments; (4) coarsely segment the segments with these thresholds to obtain sub-segments, then extract sub-segment features from the histogram; (5) perform feature matching between target sub-segments and reference sub-segments; (6) judge from the matching result whether further splitting is required; (7) if the condition is not satisfied, descend one layer and repeat steps (1) to (7). Finally the matching results of all layers are fed into the segmentation module, completing segmentation and matching at the specified accuracy.
The histogram adjustment method is applied separately to the target and reference segments from the two fields of view, and may comprise the following steps: (1) find the maximum brightness Max and minimum brightness Min of the whole target segment; (2) if the difference between Max and Min is less than a threshold Th1, set the brightness of every point on the segment to the segment's mean; otherwise apply to every point the luminance transformation
g(x) = (f(x) - Min) × Vmax / (Max - Min)
where f(x) is the value to be transformed, g(x) is the transformed result, and Vmax is the brightness range of the system. A code sketch follows.
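A minimal C sketch of this adjustment, with the stretch formula as reconstructed above; the function name and the 8-bit brightness representation are illustrative assumptions:

```c
/* Histogram adjustment of one line segment (steps 1-2 above):
 * if the brightness span is below Th1, the segment is flattened to
 * its mean; otherwise it is linearly stretched to the full range
 * Vmax, i.e. g(x) = (f(x) - Min) * Vmax / (Max - Min). */
void histogram_adjust(unsigned char *pix, int len, int th1, int vmax)
{
    int min = 255, max = 0;
    long sum = 0;
    for (int i = 0; i < len; ++i) {
        if (pix[i] < min) min = pix[i];
        if (pix[i] > max) max = pix[i];
        sum += pix[i];
    }
    if (len == 0) return;
    if (max - min < th1) {                 /* too flat: use the mean */
        unsigned char mean = (unsigned char)(sum / len);
        for (int i = 0; i < len; ++i) pix[i] = mean;
    } else {                               /* linear contrast stretch */
        for (int i = 0; i < len; ++i)
            pix[i] = (unsigned char)((pix[i] - min) * (long)vmax
                                     / (max - min));
    }
}
```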
The feature-threshold establishment may adopt the following concrete steps. After histogram adjustment the whole segment is divided into different regions by thresholds, so that segment matching can seek the correspondence of each sub-segment: (1) set a threshold Th2 slightly below 50%; (2) if Th2 < 30%, apply histogram equalization to the adjusted segment; (3) find the brightness DU such that the proportion of pixels in the two segments brighter than DU just exceeds Th2; (4) find the brightness DD such that the proportion of pixels darker than DD just exceeds Th2; (5) count the pixels between DD and DU and look for a local valley in their histogram; (6) if no valley appears, decrease Th2 and repeat (2)-(5); (7) if several valleys appear, increase Th2 and repeat (2)-(5); (8) take the valley as the threshold. A sketch of this search is given below.
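A simplified C sketch of the search. The bounded re-adjustment loop, the step size for Th2, and the valley test (a bin strictly below both neighbours) are illustrative assumptions, and the equalization sub-step (2) is omitted:

```c
/* Feature-threshold search (steps 1-8 above) on the combined
 * histogram of the target and reference segments. */
int find_feature_threshold(const unsigned char *a, int na,
                           const unsigned char *b, int nb)
{
    int hist[256] = {0};
    int total = na + nb;
    for (int i = 0; i < na; ++i) hist[a[i]]++;
    for (int i = 0; i < nb; ++i) hist[b[i]]++;

    double th2 = 0.45;                       /* slightly below 50% */
    for (int iter = 0; iter < 16; ++iter) {
        /* DU: brightness where fraction(brighter) just exceeds th2 */
        int du = 255, acc = 0;
        for (int v = 255; v >= 0; --v) {
            acc += hist[v];
            if (acc > th2 * total) { du = v; break; }
        }
        /* DD: brightness where fraction(darker) just exceeds th2 */
        int dd = 0; acc = 0;
        for (int v = 0; v <= 255; ++v) {
            acc += hist[v];
            if (acc > th2 * total) { dd = v; break; }
        }
        /* count local valleys of the histogram strictly inside (DD, DU) */
        int valleys = 0, valley_pos = -1;
        for (int v = dd + 1; v < du; ++v)
            if (hist[v] < hist[v - 1] && hist[v] < hist[v + 1]) {
                valleys++; valley_pos = v;
            }
        if (valleys == 1) return valley_pos; /* step (8) */
        if (valleys == 0) th2 -= 0.05;       /* step (6) */
        else              th2 += 0.05;       /* step (7) */
    }
    return 128;                              /* fallback */
}
```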
The sub-segment feature extraction may adopt the following concrete steps: (1) segment the target and reference segments with the above threshold; (2) join consecutive points of the same attribute into sections; (3) extract as feature values of each sub-segment its maximum Mmax, its minimum Mmin, its length Mlength, and the average brightness Maverage of its pixels; see the sketch below.
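A C sketch of steps (1)-(3); the SubSegment structure (as in the earlier sketch) and the function name are illustrative:

```c
/* Features of one sub-segment: maximum Mmax, minimum Mmin, length
 * Mlength and mean brightness Maverage, as listed in the text. */
typedef struct {
    int start;          /* first pixel index within the line segment */
    int mmax, mmin;     /* brightness extremes  */
    int mlength;        /* number of pixels     */
    double maverage;    /* mean brightness      */
} SubSegment;

/* Split one segment at threshold th: consecutive pixels on the same
 * side of th form one sub-segment (steps 1-2), and the four feature
 * values are extracted per sub-segment (step 3).  Returns the number
 * of sub-segments written. */
int extract_sub_segments(const unsigned char *pix, int len, int th,
                         SubSegment *out, int max_out)
{
    int n = 0, i = 0;
    while (i < len && n < max_out) {
        int side = pix[i] >= th;
        SubSegment s = { i, pix[i], pix[i], 0, 0.0 };
        long sum = 0;
        while (i < len && (pix[i] >= th) == side) {
            if (pix[i] > s.mmax) s.mmax = pix[i];
            if (pix[i] < s.mmin) s.mmin = pix[i];
            sum += pix[i];
            s.mlength++; i++;
        }
        s.maverage = (double)sum / s.mlength;
        out[n++] = s;
    }
    return n;
}
```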
The principle of the sub-segment feature matching is as follows. Taking the mean of each sub-segment as the weight of its corresponding points, the matching correspondence between target sub-segments and reference sub-segments can be obtained. A sub-segment can have several kinds of correspondence: one-to-one, one-to-many, or none. If a one-to-many case occurs, the sub-segments are merged so that the case converts to one-to-one. Sub-segment matching can thus be posed as searching an m × n space for the optimal path minimizing the matching measure FV, as shown in Fig. 9. Quantifying matching accuracy as the measure FV is the difficult point of the algorithm: the match of a whole segment is the combined effect of the matches of its sub-segments, so the total measure of each candidate matching path is the sum of the measures of the sub-segment matches along that path. The measure of each sub-segment match should have the following properties: (1) it is roughly proportional to the lengths of the sub-segments involved; (2) the more similar the corresponding sub-segments, the smaller its value.
The sub-segment feature matching may adopt the following concrete steps: (1) suppose the target segment is split into m non-overlapping sub-segments C[1] ... C[m] and the reference segment into n non-overlapping sub-segments R[1] ... R[n], the feature value of each being the pixel mean of the sub-segment; (2) let each sub-segment's weight KC[i] or KR[j] equal its length; (3) take a part (i ... i+4, j ... j+4) of the m × n space; (4) determine the matching degree: a one-to-one pair, where target sub-segment C[i] corresponds to reference sub-segment R[j], produces a measure FV[i, j]; a one-to-many match, where C[i] and C[i+1] both correspond to R[j], produces a corresponding measure for that part; for an unmatched sub-segment C[i] or R[j] the measure is separately defined as
FV[i, 0] = KC[i] × OcP
FV[0, j] = KR[j] × OcP
where OcP is the occlusion penalty factor; (5) for each candidate matching path, compute the FV values on each of its sub-sections; the final matching measure SFV of the whole path is the sum of all FV values on the path; (6) compute the candidate path with the smallest matching measure.
The concrete steps of the candidate-path search for the smallest matching measure are as follows, as shown in Fig. 10:
Proceed row by row with j from 1 to n and, within each row, point by point with i from 1 to m, computing the minimum FV over all matching paths from (0, 0) to the current point. Only three search directions into the current point (i, j) are allowed; in the figure a matching path may enter (i, j) only from directions 1, 2 and 3. When computing the minimum total measure SFV over all matching paths to (i, j), paths 1, 2 and 3 (thick dashed lines) are the three candidates. Since one-to-many matches are allowed during the process, some further candidate paths must be added according to earlier matching results. For example, for a path entering (i, j) along direction 3: if the optimal path selected at the judgment of (i, j-1) was 1 (thick solid line), then, taken together, that path is (i-1, j-2)-(i, j), so path 4 (thin dashed line) must also be a candidate. Likewise, for a path entering (i, j) along direction 1: if the optimal path selected at (i-1, j) was 1 (thick solid line), then that path is (i-2, j-1)-(i, j), so path 5 (thin dashed line) must also be a candidate. The total number of candidate paths can thus exceed six. If a direction coincides with the optimal-path entry direction of its corresponding start point, no candidate search path is added. For each candidate matching path, the total matching measure on advancing to (i, j) equals the total measure of the start point of this match plus the measure of the corresponding sub-segment match; for candidate path 3, for example, the measure on advancing to (i, j) is
SFV(i, j) = SFV(i, j-1) + FV(i, j-1).
Among all candidate paths entering (i, j), the one with the smallest SFV is selected as the optimal path entering (i, j). The computation then continues to the next point, up to (m, n). At that moment one only needs to step backwards from (m, n) along each point's entry direction to (0, 0) to recover the whole optimal matching path. Analyzing each sub-section of the optimal path by the method described above yields the correspondence between the sub-segments. The final step merges the multiple segments corresponding to the same sub-segment, giving the final matching result. A simplified code sketch follows.
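A simplified C sketch of the dynamic-programming search. It keeps only the three basic entry directions (the extra one-to-many candidate paths 4 and 5 of Fig. 10 are omitted), and the one-to-one measure FV[i][j] = (KC[i] + KR[j]) × |avgC[i] - avgR[j]| is an assumed form chosen to satisfy the two properties stated above; the unmatched measure KC[i] × OcP / KR[j] × OcP is as given. Backtracking to recover the path is noted but not shown:

```c
#include <stdlib.h>
#include <math.h>

/* Minimum total matching measure SFV over all paths from (0,0) to
 * (m,n), three entry directions only (match / skip C[i] / skip R[j]). */
double min_match_path(const double *avgC, const double *kC, int m,
                      const double *avgR, const double *kR, int n,
                      double ocp)
{
    double *sfv = malloc((size_t)(m + 1) * (n + 1) * sizeof *sfv);
    if (!sfv) return -1.0;
#define S(i, j) sfv[(i) * (n + 1) + (j)]
    S(0, 0) = 0.0;
    for (int i = 1; i <= m; ++i) S(i, 0) = S(i - 1, 0) + kC[i - 1] * ocp;
    for (int j = 1; j <= n; ++j) S(0, j) = S(0, j - 1) + kR[j - 1] * ocp;
    for (int i = 1; i <= m; ++i)
        for (int j = 1; j <= n; ++j) {
            double match = S(i - 1, j - 1) +
                (kC[i - 1] + kR[j - 1]) * fabs(avgC[i - 1] - avgR[j - 1]);
            double skipC = S(i - 1, j) + kC[i - 1] * ocp; /* C[i] occluded */
            double skipR = S(i, j - 1) + kR[j - 1] * ocp; /* R[j] occluded */
            double best = match < skipC ? match : skipC;
            S(i, j) = best < skipR ? best : skipR;
        }
    double result = S(m, n); /* backtracking from (m,n) recovers the path */
#undef S
    free(sfv);
    return result;
}
```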
The judgment on whether to continue splitting a matched sub-segment may adopt the following concrete steps. Since the purpose of the whole algorithm is 3D object segmentation, sub-segments that can already be assigned wholly to object or background need not be matched further. Excluded from further matching are: (1) sub-segments without brightness variation, i.e. those with Mmax - Mmin < a threshold Th3; (2) sub-segments that are too short, i.e. those with Mlength < a threshold Th4; (3) sub-segments whose entire counterpart satisfies the above conditions; (4) matched sub-segment pairs that, after being brought to equal length by interpolation, give a whole-segment SAD value below a threshold Th5; (5) sub-segments without any match, which are treated as occluded regions and not matched further.
On the basis of the depth information, the invention extracts video objects with multiple features, comprising the following steps: (1) supplement the result of the depth analysis with a color-based judgment; (2) supplement the depth information with a motion-based judgment; (3) further extensions using other information may also be adopted; (4) segment the video object with a split-merge method.
Supplementing the depth analysis with color information may adopt the following concrete steps: (1) divide the target image into spatial subregions based on color, using threshold division of a directional neighbourhood minimum-difference map; (2) merge the color subregions with a region-flooding algorithm; (3) combine with the depth information: perform subregion depth thresholding according to the maximum-likelihood mean depth of each color subregion.
Supplementing the depth information with motion information may adopt the following concrete steps: (1) use different motion patterns as the criterion for dividing subregions; (2) use identical motion patterns of different subregions as the basis for merging; (3) use motion vectors for inter-frame inheritance of the object segmentation.
The extensions based on other information include edge information, higher-level processing information, and so on.
The split-merge segmentation of the video object may adopt the following concrete steps.
Splitting first: (1) define a split decision function F_seg(A|I), where I is the target image to be segmented and A is a connected subregion of it; (2) when the split decision function at subregion A exceeds a set split threshold, i.e. F_seg(A|I) > Th_seg, subregion A is further divided into m subregions; (3) the division is chosen to minimize the sum of a metric function over A, that is
{A_1, ..., A_m} = arg min Σ_{i=1..m} D(A_i)
where D(·) is the adopted subregion division metric function.
Then merging: (1) define a merge decision function F_merge(A_1, A_2, ..., A_n | I), where A_i (i = 1, 2, ..., n) are any n connected subregions of I; (2) when the merge decision function is below a set threshold, the n subregions are merged into one subregion A. The above splitting and merging alternate iteratively; the alternation is sketched in code below.
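A skeleton of the alternating iteration in C. Because the patent leaves F_seg, F_merge and D(·) to be designed from the chosen feature sets, they appear here as callback parameters; the Region type and all names and signatures are illustrative assumptions:

```c
/* Skeleton of the alternating split-merge iteration. */
typedef struct Region Region;    /* opaque connected subregion */

typedef double (*SplitDecision)(const Region *A);           /* F_seg(A|I)   */
typedef double (*MergeDecision)(Region *const *As, int n);  /* F_merge(...) */
typedef int    (*SplitOp)(Region *A, Region **children);    /* min sum D(.) */
typedef int    (*MergePass)(Region **regions, int *count,
                            MergeDecision fm, double th_merge);

void split_merge(Region **regions, int *count,
                 SplitDecision fseg, double th_seg, SplitOp split,
                 MergePass merge_pass, MergeDecision fmerge,
                 double th_merge, int max_iter)
{
    for (int it = 0; it < max_iter; ++it) {
        int changed = 0;
        for (int i = 0; i < *count; ++i)
            if (fseg(regions[i]) > th_seg) {       /* F_seg(A|I) > Th_seg */
                Region *children[16];
                int m = split(regions[i], children); /* argmin sum of D(.) */
                /* replace regions[i] by its m children (bookkeeping
                   omitted in this sketch) */
                changed |= (m > 1);
            }
        changed |= merge_pass(regions, count, fmerge, th_merge);
        if (!changed) break;     /* fixed point: segmentation is stable */
    }
}
```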
Applying the above split-merge method to video object segmentation with multiple information features may adopt the following concrete steps: (1) adopt N features (F_1, F_2, ..., F_N)^T and first divide them into two not mutually exclusive groups
U_seg = (F_i1, F_i2, ..., F_iK)^T
U_merge = (F_j1, F_j2, ..., F_jL)^T
where U_seg is the feature set used for splitting and U_merge the feature set used for merging; (2) design F_seg(A|I), F_merge(A_1, A_2, ..., A_n | I) and the division metric function D(·) according to U_seg and U_merge respectively; (3) substitute the obtained F_seg(A|I), F_merge(A_1, A_2, ..., A_n | I) and D(·) into the split-merge formulas above, yielding a split-merge algorithm combining multiple features; (4) use the subregion maximum-likelihood depth as the merge decision of the multi-feature split-merge algorithm.
The maximum-likelihood depth decision may adopt the following concrete steps: (1) define the maximum-likelihood depth of subregion A as the x maximizing the posterior probability
P(d(z) = x | z ∈ A, I, Dd(I)),
where d(z) is the depth of pixel z, A is the subregion to be judged, I is the target image to be segmented, and Dd(I) is the disparity field; (2) simplify the subregion maximum-likelihood depth into the two-valued criterion
F_dis = P(d(z) < Th_d | z ∈ A, I, Dd(I)),
the proportion of points in the subregion whose depth is below a given threshold (a code sketch is given below); (3) incorporate the depth information into the steps of the split-merge algorithm.
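The two-valued criterion is straightforward to compute. A minimal C sketch, with the region represented simply as an array of per-pixel depths (an illustrative assumption):

```c
/* Two-valued simplification of the maximum-likelihood depth
 * criterion: F_dis is the fraction of pixels in subregion A whose
 * depth lies below the threshold Th_d. */
double f_dis(const double *depth, int npixels, double th_d)
{
    int below = 0;
    for (int i = 0; i < npixels; ++i)
        if (depth[i] < th_d) below++;
    return npixels > 0 ? (double)below / npixels : 0.0;
}
/* The merge decision then compares f_dis() of candidate subregions;
 * the exact decision threshold is a design choice. */
```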
The 3D object segmentation based on the matching results may comprise the steps of: (1) according to the sub-segment matching results, assigning to the object those sub-segments whose disparity at both the matching start point and the matching end point exceeds a threshold Th6; (2) assigning to the background those sub-segments whose disparity at both points does not exceed Th6; (3) continuing the split-and-match iteration for the remaining regions; (4) until the overall segmentation result satisfies the required accuracy.
Simulation results of the fast depth extraction of the invention are shown in Fig. 11. For the ball_letter sequence, Fig. 11(a) is the left-frame video input (500 × 500), Fig. 11(b) is the right-frame video input (500 × 500), and Fig. 11(c) is the left-frame segmentation result with 1 iteration layer; the running time was 31 ms, computed on a PII-400 PC in the C language.
For the man sequence, Fig. 11(d) is the left-frame video input (384 × 384), Fig. 11(e) is the right-frame video input (384 × 384), and Fig. 11(f) is the segmentation result with 3 iteration layers; the running time was 8.74 s for 50 frames, computed on a PII-400 PC in the C language.
Embodiments of the video image communication system with multi-camera video object extraction of the invention are described as follows:
Embodiment 1:
A P-II 400 PC is equipped with two or more USB CMOS OV6620 cameras, and multiple USB add-in cards input the video signals to the PC. With the parallel optical axis condition satisfied, the invention's fast method for extracting depth information between the multiple camera video streams and the invention's fast depth-based multi-feature video object extraction algorithm analyze the multiple video streams. The depth is used to divide the scene into different foregrounds and background, yielding binary sequences of the different video objects, which can then be encoded with an object-based coding method (e.g. MPEG-4). Network transmission may use an IP-based protocol with a hardware add-in card.
Embodiment 2:
A hardware-accelerated PC add-in card scheme: the card performs the multi-stream video input, the fast extraction of depth information between the camera video streams, and the computation of the depth-based multi-feature video object extraction, while the PC completes the multi-feature video object extraction in parallel. The rest is as in Embodiment 1. The add-in card mainly consists of a multi-stream video input unit and a programmable video arithmetic unit, for example using the programmable chip Trimedia as the hardware core device.
Embodiment 3:
A hardware embodiment completely independent of a computer. The hardware system consists of a multi-stream video input unit, a programmable video arithmetic unit and a network transmission interface unit, for example using the programmable chip Trimedia as the hardware core device.
Claims (14)
1. A video image communication system with multi-camera video object extraction, comprising a transmitting end composed of a video object extraction unit and an object-based video coding unit, and a receiving end composed of a video object decoding unit and a video object display unit, said transmitting end and receiving end being connected by a communication channel; characterized in that said video object extraction unit is a multi-view video object extraction unit connected to a plurality of cameras, which performs matching operations on a plurality of video streams simultaneously, extracts the depth information of the video objects, and, on the basis of the depth information, segments the video objects in combination with their motion, color and shape features.
2. realize the method for system according to claim 1 for one kind, may further comprise the steps:
(1) at the transmitting terminal, video images are input by a plurality of cameras; one of the video streams is the target image and the remaining video streams are auxiliary images;
(2) with the help of the auxiliary images, the analysis and extraction of said depth information is performed on the target image, and a comprehensive judgment for video object extraction is made from the depth information together with motion features, color features, and shape features; 3D object segmentation is then performed based on the matching result of the pixel-position correspondences between the multiple video streams, from which the depth information of the subject is computed; the video object is thereby extracted, and the result is expressed as a binary image sequence of the video object;
(3) the object-oriented video coding unit encodes the source target image on a video-object basis according to the binary image sequence of the video object, forming a video-object-based code stream that is sent to the communication channel;
(4) at the receiving terminal, the video object decoding unit restores the video-object-based code stream into video-object-based images;
(5) the video object display unit displays each video object independently.
3. The implementation method of claim 2, characterized in that the rapid analysis and extraction of the depth information between the multiple video streams obtained by the multiple cameras adopts a multi-layer iterative, layer-by-layer refinement algorithm, each layer comprising the following steps:
(1) input the target line segment and the reference line segments;
(2) perform histogram adjustment on said target and reference line segments respectively;
(3) establish feature thresholds for the adjusted line segments;
(4) coarsely segment the line segments with the above thresholds to obtain sub-line segments, then extract features of the sub-line segments according to the histogram;
(5) perform feature matching between the target sub-line segments and the reference sub-line segments;
(6) judge whether the matching result needs to be segmented again;
(7) if the condition is not satisfied, enter the next layer and repeat steps (1) to (7);
Finally, the matching results of all layers are fed into a unified segmentation module, completing the segmentation and matching at the specified accuracy (a sketch of this per-layer loop follows this claim).
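The following is a structural sketch, in Python, of the per-layer loop described in claim 3. The step functions are trivial stand-ins invented for illustration; only the control flow (adjust, threshold, coarse-segment, match, then decide whether to descend a layer) reflects the claim.

```python
# Structural sketch of the layer-by-layer refinement loop of claim 3; the step
# functions are trivial stand-ins so the control flow runs end to end.

def histogram_adjust(seg):                       # step (2) placeholder
    return seg

def establish_thresholds(seg):                   # step (3) placeholder
    return [(min(seg) + max(seg)) / 2.0]

def coarse_segment(seg, thresholds):             # step (4) placeholder
    th = thresholds[0]
    return [[v for v in seg if v <= th], [v for v in seg if v > th]]

def feature_match(sub_target, sub_reference):    # step (5) placeholder
    return list(zip(sub_target, sub_reference))

def needs_another_layer(matches, layer, max_layers):  # steps (6)-(7) placeholder
    return layer < max_layers

def refine(target, reference, max_layers=3):
    results, layer = [], 1
    while True:
        t = histogram_adjust(target)             # steps (1)-(2)
        r = histogram_adjust(reference)
        ths = establish_thresholds(t)            # step (3)
        matches = feature_match(coarse_segment(t, ths),
                                coarse_segment(r, ths))  # steps (4)-(5)
        results.append(matches)
        if not needs_another_layer(matches, layer, max_layers):
            break
        layer += 1
    return results  # all layers' matches go to the unified segmentation module

print(len(refine([10, 200, 30, 180], [12, 190, 35, 175])))  # -> 3
```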
4. The implementation method of claim 3, characterized in that said histogram adjustment is performed separately on the target line segment and the reference line segment of the two views, and specifically comprises the following steps:
(1) find the highest brightness value Max and the lowest brightness value Min of the whole target line segment;
(2) if the difference between Max and Min is less than a certain threshold Th1, set the brightness of every point on the line segment to its mean brightness; otherwise apply the luminance transformation to every point on the line segment, where f(x) is the value to be transformed, g(x) is the transformed result, and Vmax is the full brightness range of the system (a sketch follows this claim).
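A minimal Python sketch of this histogram adjustment follows. The exact transform is a formula image in the original patent and is not reproduced above, so a linear stretch of [Min, Max] onto [0, Vmax] is assumed here; the Th1 and Vmax values are illustrative.

```python
# Sketch of the histogram adjustment of claim 4. The exact transform formula
# is a figure in the original patent; a linear stretch of [Min, Max] onto the
# system brightness range [0, Vmax] is assumed here.

def histogram_adjust(segment, th1=10, vmax=255):
    mx, mn = max(segment), min(segment)
    if mx - mn < th1:
        # flat segment: set every point to the mean brightness
        mean = sum(segment) / len(segment)
        return [mean] * len(segment)
    # assumed linear transform: g(x) = (f(x) - Min) * Vmax / (Max - Min)
    return [(f - mn) * vmax / (mx - mn) for f in segment]

print(histogram_adjust([100, 102, 101]))      # flat -> mean everywhere
print(histogram_adjust([50, 100, 150, 200]))  # stretched onto [0, 255]
```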
5. The implementation method of claim 3, characterized in that the concrete steps of said feature-threshold establishment are as follows:
(1) set a threshold Th2 to a value slightly less than 50%;
(2) if Th2 < 30%, perform histogram equalization on the histogram-adjusted line segment;
(3) find the brightness value DU such that the proportion of pixels in the two line segments whose brightness is greater than DU just exceeds Th2;
(4) find the brightness value DD such that the proportion of pixels whose brightness is less than DD just exceeds Th2;
(5) count the pixels with brightness between DD and DU and search for a local valley in their numbers;
(6) if no local valley appears, reduce Th2 and repeat (2)-(5);
(7) if there are multiple valleys, increase Th2 and repeat (2)-(5);
(8) take the valley as the threshold (a sketch follows this claim).
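The following Python sketch illustrates the threshold-establishment procedure of claim 5. The tail fractions, the retry step sizes for Th2, and the fallback when no single valley emerges are illustrative assumptions.

```python
# Sketch of feature-threshold establishment (claim 5): find DU/DD so that the
# tails each hold just over a fraction Th2 of the pixels, then look for a local
# valley of the histogram between DD and DU.

def find_threshold(pixels, th2=0.45, vmax=255):
    hist = [0] * (vmax + 1)
    for p in pixels:
        hist[p] += 1
    total = len(pixels)

    acc, du = 0, vmax            # DU: tail above DU just exceeds Th2
    for v in range(vmax, -1, -1):
        acc += hist[v]
        if acc / total > th2:
            du = v
            break
    acc, dd = 0, 0               # DD: tail below DD just exceeds Th2
    for v in range(vmax + 1):
        acc += hist[v]
        if acc / total > th2:
            dd = v
            break

    # local valleys of the histogram between DD and DU (non-empty bins only)
    bins = [v for v in range(dd, du + 1) if hist[v] > 0]
    valleys = [bins[i] for i in range(1, len(bins) - 1)
               if hist[bins[i]] < hist[bins[i - 1]]
               and hist[bins[i]] < hist[bins[i + 1]]]
    if len(valleys) == 1:
        return valleys[0]
    if not valleys and th2 > 0.05:           # step (6): no valley, reduce Th2
        return find_threshold(pixels, th2 - 0.05, vmax)
    if len(valleys) > 1 and th2 < 0.49:      # step (7): many valleys, raise Th2
        return find_threshold(pixels, th2 + 0.02, vmax)
    return valleys[0] if valleys else (dd + du) // 2   # illustrative fallback

pixels = [40] * 30 + [50] * 10 + [60] * 3 + [70] * 10 + [80] * 30
print(find_threshold(pixels))   # -> 60
```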
6. The implementation method of claim 3, characterized in that the concrete steps of said sub-line-segment feature extraction are as follows:
(1) segment the target line segment and the reference line segment with the above thresholds;
(2) join consecutive points of the same attribute into sub-line segments;
(3) extract the feature values of each sub-line segment: the maximum Mmax within the sub-segment, the minimum Mmin within the sub-segment, the length Mlength of the sub-segment, and the mean brightness Maverage of the sub-segment's pixels (a sketch follows this claim).
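A minimal Python sketch of the sub-line-segment feature extraction follows, assuming the "attribute" of a point is simply whether its brightness is above or below the threshold.

```python
# Sketch of sub-line-segment feature extraction (claim 6): split a scan line at
# the threshold, join consecutive same-attribute points into sub-segments, and
# record (Mmax, Mmin, Mlength, Maverage) per sub-segment.

def extract_sub_segments(line, threshold):
    subs, start = [], 0
    for i in range(1, len(line) + 1):
        # attribute = above/below threshold; close the run when it changes
        if i == len(line) or (line[i] > threshold) != (line[start] > threshold):
            run = line[start:i]
            subs.append({
                "Mmax": max(run),
                "Mmin": min(run),
                "Mlength": len(run),
                "Maverage": sum(run) / len(run),
            })
            start = i
    return subs

line = [30, 32, 31, 180, 185, 40, 38]
for s in extract_sub_segments(line, threshold=100):
    print(s)
```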
7. The implementation method of claim 3, characterized in that the concrete steps of said sub-line-segment feature matching are as follows:
(1) suppose the target line segment is split into m non-overlapping sub-line segments, denoted C[1] … C[m], and the reference line segment is split into n non-overlapping sub-line segments, denoted R[1] … R[n]; the feature value of each is the pixel mean of the corresponding sub-segment;
(2) let the corresponding weights of the sub-line segments be KC[i] and KR[j], each equal to the length of the corresponding sub-segment;
(3) take a portion (i … i+4, j … j+4) of the m × n space;
(4) determine its matching degree:
(5) for a one-to-one sub-segment match, where target sub-segment C[i] corresponds to reference sub-segment R[j], the correspondence produces a matching degree for that pair;
for a one-to-many sub-segment match, where target sub-segments C[i+1] and C[i] both correspond to reference sub-segment R[j], a matching degree is produced for that part;
for an unmatched sub-segment C[i] or R[j], the matching degree is separately specified as:
FV[i,0] = KC[i] × OcP
FV[0,j] = KR[j] × OcP
where OcP is the occlusion penalty factor;
(6) for each candidate matching path, compute the FV[·,·] of each of its sub-segments; the final matching measure SFV of the whole matching path is the sum of all FV[·,·] along the path;
(7) select the candidate path with the smallest matching measure (a sketch of this search follows this claim).
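The following Python sketch casts the path search of claim 7 as a dynamic program. The one-to-one and one-to-many matching-degree formulas are images in the original patent and are not reproduced above, so a weighted absolute difference of sub-segment means is assumed for the pair cost, and one-to-many moves are omitted for brevity; the occlusion terms follow the FV[i,0] and FV[0,j] definitions given in the claim.

```python
# Sketch of sub-line-segment matching (claim 7) as a dynamic program over the
# (target, reference) index space. Pair cost: assumed weighted mean difference.

def match_sub_segments(C, KC, R, KR, ocp=2.0):
    """C/R: sub-segment mean values; KC/KR: weights (sub-segment lengths)."""
    m, n = len(C), len(R)
    INF = float("inf")
    # SFV[i][j]: best accumulated matching measure using C[:i] and R[:j]
    SFV = [[INF] * (n + 1) for _ in range(m + 1)]
    SFV[0][0] = 0.0
    for i in range(m + 1):
        for j in range(n + 1):
            if SFV[i][j] == INF:
                continue
            if i < m and j < n:   # one-to-one pair (assumed cost form)
                fv = (KC[i] + KR[j]) * abs(C[i] - R[j]) / 2.0
                SFV[i + 1][j + 1] = min(SFV[i + 1][j + 1], SFV[i][j] + fv)
            if i < m:             # C[i] unmatched: FV[i,0] = KC[i] * OcP
                SFV[i + 1][j] = min(SFV[i + 1][j], SFV[i][j] + KC[i] * ocp)
            if j < n:             # R[j] unmatched: FV[0,j] = KR[j] * OcP
                SFV[i][j + 1] = min(SFV[i][j + 1], SFV[i][j] + KR[j] * ocp)
    return SFV[m][n]   # smallest matching measure over all candidate paths

print(match_sub_segments(C=[30, 180, 40], KC=[3, 2, 2],
                         R=[31, 178, 39], KR=[3, 2, 2]))   # -> 9.0
```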
8. The implementation method of claim 3, characterized in that the concrete steps of judging whether matched sub-line segments are to be split further are as follows (a sketch follows this claim):
(1) since the purpose of the whole algorithm is 3D object segmentation, sub-line segments already assigned to the object or the background need no further matching over the whole line segment;
(2) likewise sub-line segments whose brightness does not fluctuate, i.e. those with Mmax - Mmin < a certain threshold Th3;
(3) likewise sub-line segments that are too short, i.e. those with Mlength < a certain threshold Th4;
(4) likewise whole sub-line segments whose corresponding sub-segments meet the above three conditions;
(5) matched sub-line segments are interpolated to equal length and the SAD of the whole segment is computed; sub-segment pairs for which this value is below a certain threshold Th5 need no further matching;
(6) handling of unmatched segments: sub-line segments without a match are regarded as occluded areas and are not matched further.
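A Python sketch of these "no further matching" tests follows. The sub-segment representation matches the claim-6 sketch above; the threshold values and the linear interpolation used before the SAD are illustrative assumptions.

```python
# Sketch of the continue-splitting judgment of claim 8. Threshold values are
# illustrative; sub-segments are dicts as in the claim-6 sketch.

def sad_after_interpolation(a, b):
    """Interpolate both runs to equal length, then sum absolute differences."""
    n = max(len(a), len(b))
    stretch = lambda s: [s[int(k * len(s) / n)] for k in range(n)]
    return sum(abs(x - y) for x, y in zip(stretch(a), stretch(b)))

def needs_further_matching(ct, cr, th3=5, th4=3, th5=20):
    if ct.get("label") in ("object", "background"):
        return False                       # (1) already classified
    if ct["Mmax"] - ct["Mmin"] < th3:
        return False                       # (2) no brightness fluctuation
    if ct["Mlength"] < th4:
        return False                       # (3) too short
    if cr is None:
        return False                       # (6) no match: treat as occluded
    if sad_after_interpolation(ct["pixels"], cr["pixels"]) < th5:
        return False                       # (5) SAD below Th5: good enough
    return True

ct = {"Mmax": 180, "Mmin": 30, "Mlength": 6,
      "pixels": [30, 60, 120, 180, 90, 40]}
cr = {"Mmax": 178, "Mmin": 31, "Mlength": 5,
      "pixels": [31, 62, 178, 88, 39]}
print(needs_further_matching(ct, cr))   # -> True
```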
9. The implementation method of claim 2, characterized in that said multi-feature video object extraction method comprises the following steps:
(1) use color information to supplement the judgment of the depth-information analysis result;
(2) use motion information to supplement the judgment of the depth information;
(3) other information can also be adopted as a further extension;
(4) use a split-merge method to segment the video object;
(5) the above-mentioned supplementing of the depth-analysis result with color information comprises the following concrete steps:
(6) divide the target image into color-based spatial subregions by threshold division of the directional neighborhood minimum-difference map;
(7) merge the color spatial subregions with a region-flooding algorithm;
(8) combine with the depth information by performing subregion depth thresholding according to the maximum-likelihood mean depth of each color subregion (a sketch follows this claim).
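The following Python sketch illustrates step (8): thresholding each color subregion by its mean depth. The directional-neighborhood split and the region-flooding merge are represented only by a precomputed label map; all names and values are illustrative.

```python
# Sketch of claim 9 step (8): color subregions are given as a label map, and
# each subregion is classified by thresholding its mean depth value.

def depth_threshold_by_region(labels, depth, th_d):
    """labels/depth: same-length flat lists; returns a per-label decision."""
    sums, counts = {}, {}
    for lab, d in zip(labels, depth):
        sums[lab] = sums.get(lab, 0.0) + d
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: ("object" if sums[lab] / counts[lab] > th_d else "background")
            for lab in sums}

labels = [0, 0, 0, 1, 1, 2, 2, 2]
depth  = [9, 8, 9, 2, 3, 7, 8, 9]   # larger = nearer here (disparity-like)
print(depth_threshold_by_region(labels, depth, th_d=5.0))
# -> {0: 'object', 1: 'background', 2: 'object'}
```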
10. The implementation method of claim 9, characterized in that the concrete steps of supplementing the depth information with motion information are as follows:
(1) use different motion patterns as the criterion for subregion division;
(2) use the identical motion pattern of different subregions as the basis for merging;
(3) use motion vectors as the basis for inter-frame inheritance of the object segmentation (a sketch follows this claim).
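A small Python sketch of using motion patterns as a merge basis follows; the mean-motion-vector similarity test is an illustrative assumption.

```python
# Sketch of claim 10: identical motion patterns (here, similar mean motion
# vectors) of different subregions serve as the basis for merging.

def same_motion_pattern(mv_a, mv_b, tol=1.0):
    """Two subregions share a motion pattern if their mean motion vectors
    differ by no more than tol in each component (merge basis, step (2))."""
    return abs(mv_a[0] - mv_b[0]) <= tol and abs(mv_a[1] - mv_b[1]) <= tol

def merge_by_motion(region_mvs, tol=1.0):
    """Greedy grouping of subregions whose motion patterns agree."""
    groups = []
    for rid, mv in region_mvs.items():
        for g in groups:
            if same_motion_pattern(mv, g["mv"], tol):
                g["members"].append(rid)
                break
        else:
            groups.append({"mv": mv, "members": [rid]})
    return [g["members"] for g in groups]

print(merge_by_motion({"A": (4.0, 0.0), "B": (4.2, 0.1), "C": (0.0, 0.0)}))
# -> [['A', 'B'], ['C']]
```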
11. The implementation method of claim 9, characterized in that the concrete steps of the above split-merge segmentation of the video object are as follows:
First split, specifically comprising:
(1) define a split decision function F_seg(A|I), where I is the target image to be segmented and A is a connected subregion of it;
(2) when the split decision function at subregion A exceeds a set splitting threshold, i.e. F_seg(A|I) > Th_seg, subregion A is further split into m subregions;
(3) the split is based on minimizing the sum of a certain metric function over A, i.e. the subregions A_1, …, A_m of A are chosen to minimize D(A_1) + D(A_2) + … + D(A_m), where D(·) is the adopted subregion split metric function;
Then merge, specifically comprising:
(1) define a merge decision function F_merge(A_1, A_2, …, A_n|I), where A_i (i = 1, 2, …, n) are any n connected subregions of I;
(2) when the merge decision function is less than a set threshold, the n subregions are merged into one subregion A;
The above splitting and merging steps are iterated alternately (a sketch follows this claim).
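The following Python sketch shows the alternating split-merge iteration on a one-dimensional "image". F_seg is stood in for by the value range of a region and F_merge by the difference of region means; the halving split (in place of the D(·)-minimizing split) and the iteration bound are illustrative simplifications.

```python
# Structural sketch of the alternating split-merge iteration of claim 11 on a
# 1-D "image" of brightness values; the decision functions are stand-ins.

def split_merge(values, th_seg=40, th_merge=10, max_iter=20):
    regions = [values]                  # start from one connected region
    for _ in range(max_iter):           # claim 11: alternate split and merge
        changed = False
        nxt = []                        # split: F_seg(A|I) > Th_seg -> split A
        for r in regions:
            if len(r) > 1 and max(r) - min(r) > th_seg:
                mid = len(r) // 2       # stand-in for the D(.)-minimizing split
                nxt += [r[:mid], r[mid:]]
                changed = True
            else:
                nxt.append(r)
        regions = nxt
        merged = [regions[0]]           # merge: F_merge below threshold -> join
        for r in regions[1:]:
            a = merged[-1]
            if abs(sum(a) / len(a) - sum(r) / len(r)) < th_merge:
                merged[-1] = a + r
                changed = True
            else:
                merged.append(r)
        regions = merged
        if not changed:
            break
    return regions

print(split_merge([10, 12, 11, 90, 95, 92, 13, 11]))
# -> [[10, 12, 11], [90, 95, 92], [13, 11]]
```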
12. The implementation method of claim 9 or 11, characterized in that the concrete steps for applying the split-merge method to video object segmentation with multiple information features are as follows:
(1) take N features (F_1, F_2, …, F_N)^T and first divide them into two groups that are not mutually exclusive:
U_seg = (F_i1, F_i2, …, F_iK)^T
U_merge = (F_j1, F_j2, …, F_jL)^T
(2) where U_seg is the feature set used for splitting and U_merge is the feature set used for merging;
(3) design F_seg(A|I) and F_merge(A_1, A_2, …, A_n|I) according to U_seg and U_merge respectively, together with the split metric function D(·);
(4) substitute the obtained F_seg(A|I), F_merge(A_1, A_2, …, A_n|I), and D(·) into the split-merge formulas of the above method, which yields a split-merge algorithm combining the various features;
(5) the split-merge algorithm combining the various features takes the subregion maximum-likelihood depth as its merge decision (a sketch follows this claim).
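A Python sketch of combining feature groups into the two decision functions follows. Modeling F_seg as "any splitting feature over its threshold" and F_merge as "all merging features under their thresholds" is an illustrative assumption, with a depth-difference feature playing the claim-13 role.

```python
# Sketch of claim 12: build the split and merge decision functions from two
# (possibly overlapping) feature groups U_seg and U_merge. Each group entry is
# a (feature_function, threshold) pair; the any/all combination is assumed.

def make_split_decision(u_seg):
    """F_seg(A|I) > Th_seg modeled as: any splitting feature over threshold."""
    def f_seg(region):
        return any(feat(region) > th for feat, th in u_seg)
    return f_seg

def make_merge_decision(u_merge):
    """F_merge(A1,A2|I) below threshold modeled as: all merge features agree."""
    def f_merge(r1, r2):
        return all(feat(r1, r2) < th for feat, th in u_merge)
    return f_merge

# illustrative features: brightness range for splitting; mean-brightness and
# mean-depth differences for merging (depth plays the claim-13 role)
value_range = lambda r: max(p["y"] for p in r) - min(p["y"] for p in r)
mean = lambda r, k: sum(p[k] for p in r) / len(r)
dy = lambda a, b: abs(mean(a, "y") - mean(b, "y"))
dd = lambda a, b: abs(mean(a, "d") - mean(b, "d"))

f_seg = make_split_decision([(value_range, 40)])
f_merge = make_merge_decision([(dy, 10), (dd, 2)])

A = [{"y": 10, "d": 8}, {"y": 12, "d": 8}]
B = [{"y": 14, "d": 7}, {"y": 11, "d": 8}]
print(f_seg(A + B), f_merge(A, B))   # -> False True
```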
13. The implementation method of claim 12, characterized in that the concrete steps of the maximum-likelihood depth decision are as follows:
(1) define the maximum-likelihood depth of subregion A as the value x that maximizes the posterior probability P(d(z) = x | z ∈ A, I, Dd(I)), where d(z) is the depth of pixel z, A is the subregion to be judged, I is the target image to be segmented, and Dd(I) is the disparity field;
(2) the subregion maximum-likelihood depth is simplified to a binary criterion F_dis = P(d(z) < Th_d | z ∈ A, I, Dd(I)), i.e. the proportion of points in the subregion whose depth is less than a certain threshold;
(3) the depth information is thereby incorporated into the steps of the split-merge algorithm (a sketch follows this claim).
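A minimal Python sketch of the binary criterion F_dis follows; the majority-vote classification built on it and all values are illustrative assumptions.

```python
# Sketch of the binary depth criterion of claim 13: F_dis is the fraction of
# points in subregion A whose depth lies below the threshold Th_d, estimated
# from the disparity field Dd(I).

def f_dis(region_depths, th_d):
    """Proportion of the subregion's points with depth below Th_d."""
    below = sum(1 for d in region_depths if d < th_d)
    return below / len(region_depths)

def classify(region_depths, th_d, ratio=0.5):
    # small depth = near the cameras; a majority of near points -> object
    return "object" if f_dis(region_depths, th_d) > ratio else "background"

print(f_dis([1.0, 1.2, 4.0, 0.8], th_d=2.0))     # -> 0.75
print(classify([1.0, 1.2, 4.0, 0.8], th_d=2.0))  # -> object
```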
14. The implementation method of claim 2, characterized in that the steps of said 3D object segmentation based on the matching result are as follows:
(1) according to the matching results of the sub-line segments, sub-segments whose matching start point and matching end point disparities both exceed a certain threshold Th6 are classified as object;
(2) sub-segments whose matching start and end point disparities do not exceed the threshold Th6 are classified as background;
(3) the remaining regions continue the segmentation-matching iteration;
(4) until the overall segmentation result satisfies the required precision.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB001214411A CN1134175C (en) | 2000-07-21 | 2000-07-21 | Multi-camera video object took video-image communication system and realizing method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1275871A CN1275871A (en) | 2000-12-06 |
CN1134175C (en) | 2004-01-07
Family
ID=4588797
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB001214411A Expired - Fee Related CN1134175C (en) | 2000-07-21 | 2000-07-21 | Multi-camera video object took video-image communication system and realizing method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN1134175C (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101268685B (en) * | 2005-07-27 | 2013-10-30 | 米迪尔波得股份有限公司 | System, apparatus, and method for capturing and screening visual images for multi-dimensional display |
JP5059788B2 (en) * | 2006-03-01 | 2012-10-31 | エージェンシー フォー サイエンス,テクノロジー アンド リサーチ | Method and system for obtaining multiple views of an object with real-time video output |
CN101453662B (en) * | 2007-12-03 | 2012-04-04 | 华为技术有限公司 | Stereo video communication terminal, system and method |
CN101540916B (en) * | 2008-03-20 | 2010-12-08 | 华为技术有限公司 | Method and device for coding/decoding |
CN101662694B (en) * | 2008-08-29 | 2013-01-30 | 华为终端有限公司 | Method and device for presenting, sending and receiving video and communication system |
EP2299726B1 (en) * | 2008-06-17 | 2012-07-18 | Huawei Device Co., Ltd. | Video communication method, apparatus and system |
JP2011029905A (en) * | 2009-07-24 | 2011-02-10 | Fujifilm Corp | Imaging device, method and program |
CN102195894B (en) * | 2010-03-12 | 2015-11-25 | 腾讯科技(深圳)有限公司 | The system and method for three-dimensional video-frequency communication is realized in instant messaging |
KR20140011481A (en) * | 2011-06-15 | 2014-01-28 | 미디어텍 인크. | Method and apparatus of motion and disparity vector prediction and compensation for 3d video coding |
EP2792149A4 (en) * | 2011-12-12 | 2016-04-27 | Intel Corp | Scene segmentation using pre-capture image motion |
CN102722080B (en) * | 2012-06-27 | 2015-11-18 | 杭州南湾科技有限公司 | A kind of multi purpose spatial image capture method based on many lens shootings |
CN107547889B (en) * | 2017-09-06 | 2019-08-27 | 新疆讯达中天信息科技有限公司 | A kind of method and device carrying out three-dimensional video-frequency based on instant messaging |
CN111915740A (en) * | 2020-08-13 | 2020-11-10 | 广东申义实业投资有限公司 | Rapid three-dimensional image acquisition method |
Also Published As
Publication number | Publication date |
---|---|
CN1275871A (en) | 2000-12-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C06 | Publication | ||
PB01 | Publication | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C17 | Cessation of patent right | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20040107; Termination date: 20110721