CN103942795B - A structured synthesis method for image objects - Google Patents

A structured synthesis method for image objects

Info

Publication number
CN103942795B
Authority
CN
China
Prior art keywords
image
dimensional
parts
proxy
image object
Prior art date
Legal status
Active
Application number
CN201410163775.0A
Other languages
Chinese (zh)
Other versions
CN103942795A (en)
Inventor
周昆
许威威
陈翔
杨世杰
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201410163775.0A priority Critical patent/CN103942795B/en
Publication of CN103942795A publication Critical patent/CN103942795A/en
Application granted granted Critical
Publication of CN103942795B publication Critical patent/CN103942795B/en

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a structured synthesis method for image objects, comprising the following steps: camera parameters are calibrated from simple user interaction; the camera parameters are combined with the segmentation information of the image object to generate structured three-dimensional proxies; guided by the three-dimensional proxies and contact-point information, image parts drawn from different sources and different viewpoints are connected into a novel image object, and the result image is obtained after intelligent color adjustment. Based on consistently segmented image parts, statistical learning is carried out on an image data set of a particular object type to obtain a probabilistic graphical model. Probabilistic inference on the learned Bayesian graphical model samples part types and styles to obtain high-probability composition schemes and viewpoint attributes, and the viewpoint-aware image object synthesis method then generates the result images. The method can synthesize a large number of new image objects with photographic quality and rich variation in shape and structure, and at the same time provides a good basis and guidance for three-dimensional shape modeling.

Description

A structured synthesis method for image objects
Technical field
The invention relates mainly to the field of digital media, and in particular to applications such as image creation/editing, industrial/art design, virtual object/character creation, and three-dimensional modeling.
Background technology
The technical background related to the present invention is summarized as follows:
1. Image compositing and synthesis
The main purpose of image compositing and synthesis is to create plausible and believable new images from multiple image sources.
For image compositing, the focus is usually on novel blending methods that seamlessly stitch the selected image content together. Prior efforts include multiresolution spline techniques (BURT, P.J., AND ADELSON, E.H. 1983. A multiresolution spline with application to image mosaics. ACM Trans. Graph. 2, 4 (Oct.), 217–236.; OGDEN, J.M., ADELSON, E.H., BERGEN, J.R., AND BURT, P.J. 1985. Pyramid-based computer graphics. RCA Engineer 30, 5, 4–15.) and compositing operators (PORTER, T., AND DUFF, T. 1984. Compositing digital images. SIGGRAPH Comput. Graph. 18, 3 (Jan.), 253–259.). Since the appearance of Poisson image editing (PÉREZ, P., GANGNET, M., AND BLAKE, A. 2003. Poisson image editing. ACM Transactions on Graphics (TOG) 22, 3, 313–318.), gradient-domain compositing methods (JIA, J., SUN, J., TANG, C.-K., AND SHUM, H.-Y. 2006. Drag-and-drop pasting. In ACM Transactions on Graphics (TOG), vol. 25, ACM, 631–637.; FARBMAN, Z., HOFFER, G., LIPMAN, Y., COHEN-OR, D., AND LISCHINSKI, D. 2009. Coordinates for instant image cloning. In ACM Transactions on Graphics (TOG), vol. 28, ACM, 67.; TAO, M.W., JOHNSON, M.K., AND PARIS, S. 2010. Error tolerant image compositing. In Computer Vision - ECCV 2010. Springer, 31–44.; SUNKAVALLI, K., JOHNSON, M.K., MATUSIK, W., AND PFISTER, H. 2010. Multi-scale image harmonization. ACM Transactions on Graphics (TOG) 29, 4, 125.; SZELISKI, R., UYTTENDAELE, M., AND STEEDLY, D. 2011. Fast poisson blending using multi-splines. In Computational Photography (ICCP), 2011 IEEE International Conference on, IEEE, 1–8.) have over the years become the standard technique for seamless stitching. Recently, Xue et al. (XUE, S., AGARWALA, A., DORSEY, J., AND RUSHMEIER, H. 2012. Understanding and improving the realism of image composites. ACM Transactions on Graphics (TOG) 31, 4, 84.) improved the visual plausibility of composites by adjusting the appearance of the composited objects.
Image synthesis work (DIAKOPOULOS, N., ESSA, I., AND JAIN, R. 2004. Content based image synthesis. In Image and Video Retrieval. Springer, 299–307.; JOHNSON, M., BROSTOW, G.J., SHOTTON, J., ARANDJELOVIC, O., KWATRA, V., AND CIPOLLA, R. 2006. Semantic photo synthesis. In Computer Graphics Forum, vol. 25, Wiley Online Library, 407–413.; LALONDE, J.-F., HOIEM, D., EFROS, A.A., ROTHER, C., WINN, J., AND CRIMINISI, A. 2007. Photo clip art. In ACM Transactions on Graphics (TOG), vol. 26, ACM, 3.), on the other hand, is primarily concerned with the selection and arrangement of visual content. One representative line of work is image collage, which composes multiple images into one image under certain constraints. The pioneer of this kind of work is interactive digital photomontage (AGARWALA, A., DONTCHEVA, M., AGRAWALA, M., DRUCKER, S., COLBURN, A., CURLESS, B., SALESIN, D., AND COHEN, M. 2004. Interactive digital photomontage. In ACM Transactions on Graphics (TOG), vol. 23, ACM, 294–302.), and many follow-up works have since emerged, such as digital tapestry (ROTHER, C., KUMAR, S., KOLMOGOROV, V., AND BLAKE, A. 2005. Digital tapestry [automatic image synthesis]. In Computer Vision and Pattern Recognition, 2005. IEEE Computer Society Conference on, vol. 1, IEEE, 589–596.), AutoCollage (ROTHER, C., BORDEAUX, L., HAMADI, Y., AND BLAKE, A. 2006. Autocollage. In ACM Transactions on Graphics (TOG), vol. 25, ACM, 847–852.), picture collage (WANG, J., QUAN, L., SUN, J., TANG, X., AND SHUM, H.-Y. 2006. Picture collage. In Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, vol. 1, IEEE, 347–354.), puzzle-like collage (GOFERMAN, S., TAL, A., AND ZELNIK-MANOR, L. 2010. Puzzle-like collage. In Computer Graphics Forum, vol. 29, Wiley Online Library, 459–468.), Sketch2Photo (CHEN, T., CHENG, M.-M., TAN, P., SHAMIR, A., AND HU, S.-M. 2009. Sketch2photo: Internet image montage. ACM Transactions on Graphics 28, 5, 124:1–10.), PhotoSketcher (EITZ, M., RICHTER, R., HILDEBRAND, K., BOUBEKEUR, T., AND ALEXA, M. 2011. Photosketcher: interactive sketch-based image synthesis. Computer Graphics and Applications, IEEE 31, 6, 56–66.), Arcimboldo-like collage (HUANG, H., ZHANG, L., AND ZHANG, H.-C. 2011. Arcimboldo-like collage using internet images. ACM Transactions on Graphics (TOG) 30, 6, 155.), and the recent circle-packing collage (YU, Z., LU, L., GUO, Y., FAN, R., LIU, M., AND WANG, W. 2013. Content-aware photo collage using circle packing. IEEE Transactions on Visualization and Computer Graphics 99, PrePrints.).
Most of the above image compositing and synthesis algorithms make an implicit assumption: the composited content and the source images share the same viewpoint, so they do not handle camera parameter information. In the Photo Clip Art technique (LALONDE, J.-F., HOIEM, D., EFROS, A.A., ROTHER, C., WINN, J., AND CRIMINISI, A. 2007. Photo clip art. In ACM Transactions on Graphics (TOG), vol. 26, ACM, 3.), the authors attempt to infer the camera pose from object heights. However, this approach cannot handle true three-dimensional relationships and therefore has difficulty with complex rotational transformations. In a recent work, Zheng et al. (ZHENG, Y., CHEN, X., CHENG, M.-M., ZHOU, K., HU, S.-M., AND MITRA, N.J. 2012. Interactive images: cuboid proxies for smart image manipulation. ACM Trans. Graph. 31, 4 (July), 99:1–99:11.) represent image objects as three-dimensional cuboid proxies and explicitly optimize camera and geometric parameters. The method of the present invention also uses a three-dimensional proxy representation, but must additionally handle the more challenging spatial relationships between non-cuboid parts and their structure.
2. Data-driven three-dimensional model synthesis
Data-driven three-dimensional model synthesis has recently attracted a great deal of research interest in the graphics field. Its purpose is to automatically synthesize, by combining parts from a set of input three-dimensional shapes, a large number of novel three-dimensional shapes that satisfy the structural constraints of the input shape set. Data-driven three-dimensional modeling was first proposed by Funkhouser et al. (FUNKHOUSER, T., KAZHDAN, M., SHILANE, P., MIN, P., KIEFER, W., TAL, A., RUSINKIEWICZ, S., AND DOBKIN, D. 2004. Modeling by example. ACM Trans. Graph. 23, 3 (Aug.), 652–663.); their modeling-by-example system lets the user search a library of segmented three-dimensional parts and then interactively assemble these parts into new shapes. In follow-up work, some approaches use user-drawn sketches to search for parts (SHIN, H., AND IGARASHI, T. 2007. Magic canvas: interactive design of a 3-d scene prototype from freehand sketches. In Graphics Interface, 63–70.; LEE, J., AND FUNKHOUSER, T.A. 2008. Sketch-based search and composition of 3d models. In SBM, 97–104.), while others allow the user to exchange components within a small group of matched shapes (KREAVOY, V., JULIUS, D., AND SHEFFER, A. 2007. Model composition from interchangeable components. In Proceedings of the 15th Pacific Conference on Computer Graphics and Applications, IEEE Computer Society, Washington, DC, USA, PG'07, 129–138.). Chaudhuri et al. (CHAUDHURI, S., AND KOLTUN, V. 2010. Data-driven suggestions for creativity support in 3d modeling. ACM Trans. Graph. 29, 6 (Dec.), 183:1–183:10.) proposed a data-driven method to recommend suitable parts for an incomplete shape being designed, and later devised a probabilistic representation of shape and structure that can recommend parts better matched in semantics and style (CHAUDHURI, S., KALOGERAKIS, E., GUIBAS, L., AND KOLTUN, V. 2011. Probabilistic reasoning for assembly-based 3d modeling. ACM Trans. Graph. 30, 4 (July), 35:1–35:10.). Kalogerakis et al. continued this line of probabilistic reasoning and used it for whole-shape synthesis (KALOGERAKIS, E., CHAUDHURI, S., KOLLER, D., AND KOLTUN, V. 2012. A probabilistic model for component-based shape synthesis. ACM Trans. Graph. 31, 4 (July), 55:1–55:11.).
Summary of the invention
The purpose of the present invention is to address the deficiencies of the prior art by providing a structured synthesis method for image objects. Given a set of image objects of the same type with different viewpoints, the method synthesizes visually realistic new image objects by combining image parts.
The object of the invention is achieved through the following technical solution: a structured synthesis method for image objects, comprising the following steps:
(1) Preprocessing of image object data: collect an image set of a particular object type with a digital device or from the network, requiring the object structure to be fully visible, and use image segmentation and labeling tools to obtain consistently segmented regions of the object's constituent parts;
(2) Viewpoint-aware image object synthesis: calibrate the camera parameters of a single image from simple user interaction, combine the camera parameters with the segmentation information of the image object to generate structured three-dimensional proxies, then use the three-dimensional proxies and contact-point information to connect image parts into a novel image object, and finally obtain the result image through intelligent color adjustment;
(3) Training and synthesis with a Bayesian probabilistic graphical model: based on the consistently segmented image parts, perform statistical learning on the image data set of the particular object type to obtain a probabilistic graphical model that expresses the complex dependencies among shape style, object structure, part categories, and camera parameters; then, by probabilistic inference on the learned Bayesian graphical model, sample part types, styles, and so on to obtain high-probability composition schemes and viewpoint attributes of image objects, and finally synthesize the result images with the method of step 2;
(4) Export of image object synthesis results: export and store, in a general-purpose format, the result images obtained in steps 2 and 3, together with the camera parameters and three-dimensional proxy data obtained in step 2.
The beneficial effects of the invention are as follows: viewpoint-aware, part-level synthesis of image objects can connect parts taken from images with different viewpoints and synthesize novel image objects with a consistent viewpoint and correct structure. At the same time, the present invention first proposes a coordinate-frame-based single-view camera calibration method, applicable to the calibration of general single images without obvious or complete geometric cues; proposes a structure-aware three-dimensional proxy construction method, applicable to building cuboid proxies at the part level of image objects; proposes a structured synthesis method for image parts guided by three-dimensional proxies; proposes an application that synthesizes, from given sample image objects, large quantities of image objects with rich variation in shape and style; and proposes a Bayesian probabilistic graphical model that integrates image viewpoint information, applicable to characterizing the viewpoint, structure, and shape variation of an image object collection. Compared with existing three-dimensional shape synthesis techniques, the method can make full use of the advantages of ordinary image data (huge quantity, easy acquisition, and rich color and appearance information) to synthesize a large number of new image objects with photographic quality and rich variation in shape and structure, meeting the requirements of many image-editing applications while providing a good basis and guidance for three-dimensional shape modeling.
Brief description of the drawings
Fig. 1 is a flow diagram of the viewpoint-aware image object synthesis method of the present invention;
Fig. 2 illustrates single-view camera calibration and structured three-dimensional proxy construction in the present invention, where (a) shows the input image and the user interaction for camera calibration, (b) shows the initial three-dimensional proxies of the object parts obtained from the camera parameters, (c) illustrates the optimization of the three-dimensional proxies under object structure constraints, and (d) shows the result after structure optimization;
Fig. 3 shows the elements used in the image object synthesis process of the present invention, where (a) shows the three-dimensional proxies and connection slots, (b) shows the segmented parts of the image object, (c) shows the two-dimensional reference points on the image boundary used for image warping, and (d) shows the two-dimensional contact points used for connecting image parts;
Fig. 4 is a schematic diagram of the part color optimization process for image objects in the present invention;
Fig. 5 is a schematic diagram of the main workflow for training on and synthesizing image objects with the probabilistic graphical model in the present invention;
Fig. 6 shows structured synthesis results obtained from the chair image collection in the present invention;
Fig. 7 shows structured synthesis results obtained from the cup image collection in the present invention;
Fig. 8 shows structured synthesis results obtained from the desk lamp image collection in the present invention;
Fig. 9 shows structured synthesis results obtained from the toy airplane image collection in the present invention;
Fig. 10 shows structured synthesis results obtained from the robot image collection in the present invention;
Fig. 11 shows the results of a user study evaluating the novelty of the synthesized image objects obtained in the experiments of the present invention;
Fig. 12 compares direct compositing of chairs with viewpoint-aware image object synthesis in the present invention;
Fig. 13 compares direct compositing of toy airplanes with viewpoint-aware image object synthesis in the present invention.
Detailed description of the invention
The core of the present invention is a structure-aware, part-level image object synthesis method that operates on an image object collection and produces rich variation in shape and style. The core method of the present invention is broadly divided into the following four parts: preprocessing of image object data, viewpoint-aware image object synthesis, training and synthesis with the Bayesian probabilistic graphical model, and export of image object synthesis results.
1. Preprocessing of image object data: collect an image set of a particular object type with a digital device or from the network, requiring the part structure of the objects to be fully visible, and use image segmentation tools to obtain the region of each part in each image while performing semantic labeling.
1.1 Acquisition of the image object set
This method works on ordinary digital images. As input data, it requires a number of images of objects of the same class, scaled and cropped to a uniform size. Because the method performs structured synthesis at the part level, this step requires that the part structure of the collected image objects be fully visible.
1.2 User-assisted interaction
Because the content and boundaries of the object parts in an image exhibit complex appearance characteristics, they are difficult to identify and segment automatically in a robust way. This method therefore relies on a modest amount of user interaction to preprocess the image object set for the subsequent steps. Lazy Snapping (LI, Y., SUN, J., TANG, C.-K., AND SHUM, H.-Y. 2004. Lazy snapping. ACM Transactions on Graphics (ToG) 23, 3, 303–308.) is used to segment the whole object region, and LabelMe (RUSSELL, B.C., TORRALBA, A., MURPHY, K.P., AND FREEMAN, W.T. 2008. LabelMe: a database and web-based tool for image annotation. International Journal of Computer Vision 77, 1-3, 157–173.) is then used to segment and label each part region of the object. For occluded image parts, PatchMatch (BARNES, C., SHECHTMAN, E., FINKELSTEIN, A., AND GOLDMAN, D. 2009. PatchMatch: a randomized correspondence algorithm for structural image editing. ACM Transactions on Graphics (TOG) 28, 3, 24:1–11.) is used to complete the occluded regions.
2. Viewpoint-aware image object synthesis
This image object synthesis method takes as input a set of image objects of the same class (for example, chairs), semi-automatically analyzes their structure, and extracts camera parameters. Each image object is then represented by structure-fitted three-dimensional cuboid proxies. Under the guidance of the three-dimensional proxies, connected image object parts can be synthesized into novel, perspective-correct complete image objects.
2.1 Image object representation
Based on the results of steps 1.1 and 1.2, a structured representation is built to characterize the relations between the semantic parts of an image object. Each image object is represented by a graph G = {V, E}, where V is the set of nodes in the graph and E is the set of edges. Each part C_i is a node in V; when two nodes C_i and C_j are connected, an edge e_ij exists in E. Here C_i = {P_i, S_i, cl, B_i}, where P_i denotes the region pixels belonging to part C_i, S_i is its segmentation boundary, cl is the set of dominant colors extracted from the part pixels by k-means (k = 2), and B_i is its corresponding three-dimensional cuboid proxy (obtained in the subsequent step 2.2). This structured representation is used frequently in the subsequent steps.
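The structured representation above can be illustrated with a minimal data-structure sketch; the field names and types below are assumptions for illustration only and are not prescribed by the method.

```python
# Illustrative sketch of the part graph G = {V, E}; field names/types are assumed.
from dataclasses import dataclass, field

@dataclass
class Part:                          # one part C_i
    pixels: "np.ndarray"             # P_i: region pixels of the part
    boundary: "np.ndarray"           # S_i: segmentation boundary
    dominant_colors: list            # cl: k-means dominant colors (k = 2)
    proxy: object = None             # B_i: 3D cuboid proxy, fitted in step 2.2

@dataclass
class ImageObject:
    parts: dict = field(default_factory=dict)   # node set V: part id -> Part
    edges: set = field(default_factory=set)     # edge set E: pairs (i, j)
```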
2.2 Generation of three-dimensional image proxies: the camera parameters of each image are calibrated from simple user interaction, and the camera parameters are combined with the part segmentation information of the image object to generate structured three-dimensional proxies.
2.2.1 Camera calibration based on a coordinate frame
This method takes as input one two-dimensional vertex and three two-dimensional vectors (the image projections of the origin and coordinate axes of a three-dimensional coordinate frame). Compared with existing single-view camera calibration methods, it is better suited to generic images, in which geometric cues on the object are scarce.
The camera projection matrix M_{3×4} can be expressed as

\[ M_{3\times 4} = K\,[R \mid t], \qquad K = \begin{pmatrix} f & 0 & u \\ 0 & f & v \\ 0 & 0 & 1 \end{pmatrix}, \qquad t = (t_x, t_y, t_z) \]

where K is the camera intrinsic matrix, (u, v) is set to the image center, the focal length f is a variable, R is an orthogonal matrix parameterized by Euler angles, and t is the translation vector, giving 7 variable parameters in total.
The image projections P_o and P_up of the origin of the three-dimensional coordinate frame and of the point (0, 0, 1) are expressed in homogeneous coordinates as

\[ P_o = (o_1, o_2, 1)^\top, \qquad P_{up} = (z_1, z_2, 1)^\top \]

and the projections \{l_x, l_y, l_z\} of the three coordinate axes \{x, y, z\} are expressed as

\[ l_x = (l_{x1}, l_{x2}, 1)^\top, \qquad l_y = (l_{y1}, l_{y2}, 1)^\top, \qquad l_z = (l_{z1}, l_{z2}, 1)^\top \]

According to projective geometry, a system of constraints relating these projections to the camera parameters can be set up; expanding it yields the following 7 equations:
\[
\begin{aligned}
(f c_1 c_2 - u s_2)\,l_{x1} + (f c_2 s_1 - v s_2)\,l_{x2} - s_2 &= 0\\
(f c_1 s_2 s_3 - f c_3 s_1 + u c_2 s_3)\,l_{y1} + (f c_1 c_3 + f s_1 s_2 s_3 + v c_2 s_3)\,l_{y2} + c_2 s_3 &= 0\\
(f s_1 s_3 + f c_1 c_3 s_2 + u c_2 c_3)\,l_{z1} + (f c_3 s_1 s_2 - f c_1 s_3 + v c_2 c_3)\,l_{z2} + c_2 c_3 &= 0\\
f t_1 + u t_3 - o_1 t_3 &= 0\\
f t_2 + v t_3 - o_2 t_3 &= 0\\
f s_1 s_3 + f c_1 c_3 s_2 + u c_2 c_3 + f t_1 + u t_3 - z_1 c_2 c_3 - z_1 t_3 &= 0\\
f c_3 s_1 s_2 - f c_1 s_3 + v c_2 c_3 + f t_2 + v t_3 - z_2 c_2 c_3 - z_2 t_3 &= 0
\end{aligned}
\]

where c_i = cos θ_i and s_i = sin θ_i denote the cosines and sines of the Euler angles parameterizing R, and t_1, t_2, t_3 are the components of the translation vector t.
This method solves the above system of equations by nonlinear optimization to obtain the camera parameters.
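As an illustration, the seven equations above can be fed directly to an off-the-shelf nonlinear least-squares routine. The sketch below is only indicative: the initial guess, the placeholder observations, and the choice of scipy.optimize.least_squares are assumptions, not the specific solver of the invention.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(p, u, v, o, z, lx, ly, lz):
    # p packs the 7 unknowns: focal length, 3 Euler angles, translation t1..t3
    f, a1, a2, a3, t1, t2, t3 = p
    c1, c2, c3 = np.cos([a1, a2, a3])
    s1, s2, s3 = np.sin([a1, a2, a3])
    o1, o2 = o; z1, z2 = z
    lx1, lx2 = lx; ly1, ly2 = ly; lz1, lz2 = lz
    return [
        (f*c1*c2 - u*s2)*lx1 + (f*c2*s1 - v*s2)*lx2 - s2,
        (f*c1*s2*s3 - f*c3*s1 + u*c2*s3)*ly1 + (f*c1*c3 + f*s1*s2*s3 + v*c2*s3)*ly2 + c2*s3,
        (f*s1*s3 + f*c1*c3*s2 + u*c2*c3)*lz1 + (f*c3*s1*s2 - f*c1*s3 + v*c2*c3)*lz2 + c2*c3,
        f*t1 + u*t3 - o1*t3,
        f*t2 + v*t3 - o2*t3,
        f*s1*s3 + f*c1*c3*s2 + u*c2*c3 + f*t1 + u*t3 - z1*c2*c3 - z1*t3,
        f*c3*s1*s2 - f*c1*s3 + v*c2*c3 + f*t2 + v*t3 - z2*c2*c3 - z2*t3,
    ]

# u, v: image center; o, z: projections of the origin and of (0, 0, 1);
# lx, ly, lz: axis projections from the user annotation.  Values are placeholders.
p0 = np.array([800.0, 0.1, 0.1, 0.1, 0.0, 0.0, 5.0])   # rough initial guess
sol = least_squares(residuals, p0,
                    args=(320.0, 240.0, (300.0, 400.0), (310.0, 200.0),
                          (0.9, 0.1), (-0.8, 0.2), (0.05, -0.95)))
```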
2.2.2 Structure-aware three-dimensional proxy fitting
Based on the camera projection matrix M obtained in step 2.2.1, this method uses the Interactive Images technique (ZHENG, Y., CHEN, X., CHENG, M.-M., ZHOU, K., HU, S.-M., AND MITRA, N.J. 2012. Interactive images: cuboid proxies for smart image manipulation. ACM Trans. Graph. 31, 4 (July), 99:1–99:11.) to initialize axis-aligned cuboids. Lacking structural relations, these independently initialized cuboids are scattered in space, so this method performs a global optimization to recover the structural relations between the parts. The goal of the optimization is to make the parts satisfy the geometric relations in three-dimensional space while still matching the image boundaries. The energy equation is as follows:
\[ E(B_1, B_2, \dots, B_N) = E_{fitting} + E_{unary} + E_{pair} \]

The first term E_{fitting} penalizes the deviation of the optimized result from the initial cuboids, expressed as the accumulated distance between the two-dimensional projections of the vertices of the optimized cuboids and those of the initial cuboids:

\[ E_{fitting} = \sum_{i}^{N} \sum_{k} \| M v_k - M \bar{v}_k \|^2 \]

where N is the number of parts, and v_k and \bar{v}_k are the vertices of the optimized cuboid B_i and of the initial cuboid \bar{B}_i, respectively; normalized homogeneous coordinates are used in the computation.
The unary constraint term E_{unary} penalizes violations of structural constraints on a single proxy. It mainly includes two kinds of structural constraints, {GlobReflection, OnGround}, which guarantee the correct relation between the parts and the camera parameters. GlobReflection states that a cuboid is reflectively symmetric about a certain world-coordinate plane, and OnGround states that a cuboid must rest on the ground. The term is defined over the sets of cuboids subject to the GlobReflection and OnGround constraints, where Dist is the point-to-plane distance function and the OnGround term is evaluated at the cuboid vertex with the smallest z value.
The pairwise constraint term E_{pair} penalizes violations of structural constraints between two proxies. It mainly includes three kinds of structural constraints, {Symmetry, On, Side}, which guarantee the pairwise structural relations between cuboids. Symmetry states that two cuboids are reflectively symmetric about a certain world-coordinate plane, while On and Side state, respectively, that one cuboid rests on top of another cuboid or leans against its side. The term is defined over the sets of cuboid pairs that satisfy the reflective-symmetry constraint and the On and Side constraints, respectively. The function Rf computes the mirror position of a point with respect to the plane p, and c_i is the center point of cuboid B_i. The first part of the formula enforces the reflective-symmetry constraint by requiring that the center points of the two proxies be reflectively symmetric about the plane, and the latter two parts enforce the On and Side constraints by penalizing the distance between the face center of one cuboid and the top face or side face of the other. Here bc_i is the bottom-face center of B_i and tp_j is the top face of B_j; similarly, sc_i is the side-face center of B_i and sp_j is the side face of B_j.
In this method the cuboids are axis-aligned, so each cuboid needs only 6 parameters to be optimized, namely its size and center {l_x, l_y, l_z, c_x, c_y, c_z}. The method uses the Levenberg-Marquardt nonlinear optimization method (LOURAKIS, M., 2004. levmar: Levenberg-Marquardt nonlinear least squares algorithms in C/C++. [web page] http://www.ics.forth.gr/~lourakis/levmar/.) to minimize the total energy.
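To make the fitting term concrete, the sketch below evaluates the residuals of E_{fitting} for an axis-aligned cuboid parameterized by its size and center; stacking these residuals (together with residuals for the structural terms, omitted here) gives the vector that a Levenberg-Marquardt solver minimizes. The helper names are assumptions for illustration.

```python
import numpy as np

def cuboid_corners(params):
    # params = (lx, ly, lz, cx, cy, cz): size and center of an axis-aligned cuboid
    lx, ly, lz, cx, cy, cz = params
    signs = np.array([[sx, sy, sz] for sx in (-0.5, 0.5)
                                    for sy in (-0.5, 0.5)
                                    for sz in (-0.5, 0.5)])
    return signs * np.array([lx, ly, lz]) + np.array([cx, cy, cz])

def project(M, pts3d):
    # M: 3x4 camera projection matrix; returns normalized 2D projections
    homo = np.hstack([pts3d, np.ones((len(pts3d), 1))])
    proj = homo @ M.T
    return proj[:, :2] / proj[:, 2:3]

def fitting_residuals(params, init_params, M):
    # E_fitting residuals: projected vertices of optimized vs. initial cuboid
    return (project(M, cuboid_corners(params)) -
            project(M, cuboid_corners(init_params))).ravel()
```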
2.3 Proxy-guided part synthesis: after the camera parameters are estimated and the three-dimensional proxies are constructed, each three-dimensional proxy is translated and scaled in three-dimensional space; two-dimensional image warping and stitching are then performed based on the transformed proxies, synthesizing a single image object from the selected parts.
2.3.1 Connection of parts
This method uses "connection slots" (KALOGERAKIS, E., CHAUDHURI, S., KOLLER, D., AND KOLTUN, V. 2012. A probabilistic model for component-based shape synthesis. ACM Trans. Graph. 31, 4 (July), 55:1–55:11.; SHEN, C.-H., FU, H., CHEN, K., AND HU, S.-M. 2012. Structure recovery by part assembly. ACM Transactions on Graphics (TOG) 31, 6, 180.) to attach object parts. After the three-dimensional proxies are built, each pair of interconnected parts generates a pair of connection slots, where each slot contains: a) the target part connected to the part owning the slot; b) the contact point connecting the two parts; c) the scale of the connected target part. The three-dimensional contact point is generated with the following strategy: if the two proxies intersect, take the midpoint of the intersection; otherwise, take the midpoint of the face of the smaller proxy that is closest to the larger proxy. The two-dimensional contact points are the points on the image boundaries of the connected parts that lie close to each other (within a given threshold; this method uses 5 pixels).
(1) Connecting the three-dimensional proxies
This method optimizes the position and size of each proxy B_i under the constraints of the proxy connection relations. Let c_i and l_i denote, respectively, the center and size of proxy B_i, together with the three-dimensional contact point of connection slot k. A transformation T_i is defined on B_i consisting of a scaling followed by a translation, with Λ_i = diag(s_i), where s_i and t_i are respectively the scaling and translation parts of the transformation. The transformation acting on B_i must be constrained by the size and position of the proxies connected to it; accordingly, the size of the target proxy connected to B_i through slot k in the original training image is recorded. The contact energy term E_c is defined over the set of proxy pairs matched through connection slots, where m_i and m_j are the indices of the matched connection slots in their respective proxies. This energy term pulls matched contact points together and makes interconnected proxies take compatible sizes. In addition, this method uses two shape-preserving energy terms E_s and E_t to avoid excessive deformation during the optimization:

\[ E_s = \sum_i \| s_i - [1, 1, 1]^\top \|^2, \qquad E_t = \sum_i \| t_i \|^2 \]

This method obtains the optimal transformation T_i^* of each proxy B_i by minimizing the following energy function:

\[ T_i^* = \arg\min_{T_i}\; \omega_c E_c + \omega_s E_s + \omega_t E_t \]

where this method uses the parameters ω_c = 1, ω_s = 0.5, and ω_t = 0.1.
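The combined objective above can be sketched as follows; the shape-keeping terms follow the formulas given, while the contact term E_c is passed in as a callable because its exact expression is not reproduced here, so this interface is an assumption for illustration.

```python
import numpy as np

def total_energy(scales, translations, contact_energy, wc=1.0, ws=0.5, wt=0.1):
    # scales, translations: lists of per-proxy s_i and t_i (length-3 arrays)
    e_s = sum(np.sum((np.asarray(s) - np.ones(3)) ** 2) for s in scales)   # E_s
    e_t = sum(np.sum(np.asarray(t) ** 2) for t in translations)            # E_t
    return wc * contact_energy(scales, translations) + ws * e_s + wt * e_t
```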
(2) Warping the two-dimensional image parts
After the three-dimensional proxies are connected, this method computes and warps the two-dimensional image parts corresponding to the three-dimensional proxies. When the three-dimensional proxies are built for an image in step 2.2, the method uniformly samples n_i two-dimensional reference points on the image boundary of each proxy B_i and projects them onto the visible faces of the proxy to obtain the corresponding three-dimensional reference points (points that fall outside the two-dimensional projection boundary of the proxy are removed). In all experiments of this method, n_i = 200.
The method then applies the optimal transformation computed in step (1) to these three-dimensional reference points to obtain the corresponding three-dimensional target points, and projects the three-dimensional target points onto the two-dimensional image plane of the synthesis scene to obtain the two-dimensional target points.
Because the viewpoint change of a three-dimensional proxy is limited, this method uses a two-dimensional affine transformation to perform the image warping. The optimal affine transformation matrix is obtained by minimizing the distance between the warped two-dimensional reference points and the two-dimensional target points:

\[ A_i^* = \arg\min_{A_i} \sum_r \| A_i \cdot \hat{a}_{i,r} - \hat{b}_{i,r} \|^2 \]

Each part is then warped with A_i^*.
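A minimal sketch of recovering the affine matrix and warping one part is given below; the use of OpenCV's estimateAffine2D and warpAffine is an assumption for illustration, since any least-squares affine fit would serve.

```python
import numpy as np
import cv2

def warp_part(part_img, ref_pts_2d, target_pts_2d, out_size):
    # Least-squares affine fit from warped reference points to target points (A_i*),
    # followed by warping the part image with that transform.
    A, _ = cv2.estimateAffine2D(np.float32(ref_pts_2d), np.float32(target_pts_2d))
    return cv2.warpAffine(part_img, A, out_size)   # out_size = (width, height)
```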
(3) Connecting the two-dimensional image parts
After step (2), the same transformation A_i^* is applied to the two-dimensional contact point of each connection slot k of each part i to obtain its current coordinates. A breadth-first search strategy is then used to move the image parts. The largest part is inserted into the queue as the reference. Whenever a part is popped from the queue, every part that is connected to it and has not yet been visited is moved according to the current two-dimensional contact-point information in the connection slots, and is then inserted into the queue in turn. The search terminates once all parts have been visited, yielding the composite image object with the two-dimensional image parts connected. When moving part i toward part j (with matched slots m_i and m_j), the strategy is to first align the center of the contact point of slot m_i with the center of the contact point of slot m_j, and then to repeatedly move any point lying outside the segmentation boundary of part j to the closest point on that boundary.
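The breadth-first placement can be sketched as below, assuming each part stores its slots as (neighbor id, own 2D contact point, neighbor 2D contact point); the final snapping of points to the neighbor's segmentation boundary is omitted.

```python
from collections import deque

def place_parts(slots, largest_id):
    # slots[i] = [(j, pi, pj), ...]: part j connects to part i with 2D contacts pi, pj
    offsets = {largest_id: (0.0, 0.0)}      # the largest part is the fixed reference
    queue = deque([largest_id])
    while queue:
        i = queue.popleft()
        for j, pi, pj in slots[i]:
            if j in offsets:
                continue
            # translate part j so its contact point meets part i's contact point
            offsets[j] = (offsets[i][0] + pi[0] - pj[0],
                          offsets[i][1] + pi[1] - pj[1])
            queue.append(j)
    return offsets                           # per-part 2D translations
```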
2.3.2 Color optimization
For the synthesis result, this method optimizes color based on a color compatibility model (O'DONOVAN, P., AGARWALA, A., AND HERTZMANN, A. 2011. Color compatibility from large datasets. ACM Transactions on Graphics (TOG) 30, 4, 63.) and a data-driven palette. After step 1, this method uses k-means to extract a 5-tone palette from each image of the image data set, and then, again with k-means, extracts a 40-color palette from all of these colors as the data palette. In the synthesized image, the tone of the dominant color of the largest part is combined with the data palette (with the compatibility variance parameter σ) to generate a new palette. This method then uses a color optimization strategy similar to that in the work of Yu et al. (YU, L.-F., YEUNG, S.K., TERZOPOULOS, D., AND CHAN, T.F. 2012. DressUp!: Outfit synthesis through automatic optimization. ACM Transactions on Graphics 31, 6, 134:1–134:14.) to select from the new palette the color set with the highest color compatibility, and assigns a color to each part by a color transfer method (REINHARD, E., ADHIKHMIN, M., GOOCH, B., AND SHIRLEY, P. 2001. Color transfer between images. Computer Graphics and Applications, IEEE 21, 5, 34–41.).
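The data-driven palette construction can be sketched as follows; scikit-learn's KMeans is used here only as a stand-in for whichever k-means implementation is employed.

```python
import numpy as np
from sklearn.cluster import KMeans

def image_palette(img_rgb, k=5):
    # 5 dominant tones per image
    pixels = img_rgb.reshape(-1, 3).astype(float)
    return KMeans(n_clusters=k, n_init=10).fit(pixels).cluster_centers_

def data_palette(images, k=40):
    # 40-color data palette clustered from the pooled per-image tones
    tones = np.vstack([image_palette(im) for im in images])
    return KMeans(n_clusters=k, n_init=10).fit(tones).cluster_centers_
```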
3. Training and synthesis with the Bayesian probabilistic graphical model
In this step, statistical learning is first carried out, based on the consistently segmented image parts, on the image data set of a particular object type to obtain a probabilistic graphical model that expresses the complex dependencies among shape style, object structure, part categories, and camera parameters. Probabilistic inference on the learned model then samples part types, styles, and so on to obtain high-probability composition schemes and viewpoint attributes of image objects, and finally the viewpoint-aware image object synthesis method generates all of the result images.
3.1 Training of the Bayesian probabilistic graphical model
This method models the image object set of a particular type with a probabilistic graphical model similar to the one used in the work of Kalogerakis et al. (KALOGERAKIS, E., CHAUDHURI, S., KOLLER, D., AND KOLTUN, V. 2012. A probabilistic model for component-based shape synthesis. ACM Trans. Graph. 31, 4 (July), 55:1–55:11.), in which the latent variables represent the overall structure and part styles of the object, and the observed variables represent part categories, geometric features, and the adjacency relations between them. A distinguishing feature of this method is the introduction of an additional observed variable that characterizes the viewpoint information of the image.
3.1.1 Representation of the Bayesian probabilistic graphical model
The table above lists the random variables in the probabilistic graphical model used by this method. V is the viewpoint parameter, obtained by Mean Shift clustering (with the radius set to 0.2) of the camera parameters from step 2.2.1 and represented as an integer category index. The geometric feature vector C_l comprises the size of the three-dimensional cuboid proxy of part category l and the point-distribution-model feature vector of the two-dimensional image contour (COOTES, T.F., TAYLOR, C.J., COOPER, D.H., GRAHAM, J., ET AL. 1995. Active shape models - their training and application. Computer Vision and Image Understanding 61, 1, 38–59.). The point distribution models are computed separately for each viewpoint category. The latent variables R and S are learned from the training data.
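The viewpoint discretization can be sketched as below; scikit-learn's MeanShift is assumed here as the clustering implementation, with the bandwidth playing the role of the radius of 0.2 mentioned above.

```python
from sklearn.cluster import MeanShift

def viewpoint_labels(camera_params):
    # camera_params: array of shape (num_images, num_camera_parameters)
    # returns the integer viewpoint category index V for each image
    return MeanShift(bandwidth=0.2).fit_predict(camera_params)
```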
The full joint probability distribution can be decomposed into a product of conditional probabilities.
3.1.2 Training of the Bayesian probabilistic graphical model
After steps 1 and 2, a set of feature vectors is extracted from the K training images, where O_k = {V_k, N_k, C_k, D_k}. This method learns the structure and parameters of the probabilistic graphical model by maximizing a likelihood function J, in which a Bayesian information criterion score (SCHWARZ, G. 1978. Estimating the dimension of a model. The Annals of Statistics 6, 2, 461–464.) is used to select the optimal graph structure G (the domain ranges); the parameters of G are maximum a posteriori (MAP) estimates, m_θ is the number of independent parameters, and K is the number of data items. This method uses the expectation-maximization (EM) algorithm to compute the parameters that maximize the likelihood,
where P(θ | G) is the prior probability distribution of the parameters θ. In the M step of the EM algorithm, the conditional probability table parameters of the discrete random variables R, V, S_l, D_l and the conditional linear Gaussian distribution parameters of the continuous random variables C_l are computed in the same form as in the method of Kalogerakis et al. (KALOGERAKIS, E., CHAUDHURI, S., KOLLER, D., AND KOLTUN, V. 2012. A probabilistic model for component-based shape synthesis. ACM Trans. Graph. 31, 4 (July), 55:1–55:11.). In the E step, the following equations are used to compute the conditional probability of the latent variables R and S given the observed variables O_k:

\[ P(O_k) = \sum_{R, S_l} P(R, S_l, O_k), \qquad P(R, S_l \mid O_k) = \frac{P(R, S_l, O_k)}{P(O_k)} \]

where l is the part category label, and the joint probability P(R, S_l, O_k) is computed as the product of the conditional probabilities of the learned graphical model.
This method searches for the graph structure G that maximizes J with a greedy strategy that gradually enlarges the domains (the admissible values of the discrete variables) of the latent variables R and S.
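As an illustration of the E step above, the posterior over the latent variables can be obtained by enumerating their finite domains and normalizing by P(O_k); the function joint below, which evaluates P(R, S_l, O_k) under the current parameters, is an assumed interface.

```python
import numpy as np

def latent_posterior(joint, r_values, s_values, o_k):
    # table[r, s] = P(R=r, S_l=s, O_k); normalizing by its sum gives P(R, S_l | O_k)
    table = np.array([[joint(r, s, o_k) for s in s_values] for r in r_values])
    return table / table.sum()
```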
3.2 Synthesis with the Bayesian probabilistic graphical model
Image object synthesis can be divided into three steps. First, the set of image parts used for synthesis is determined. Then, these parts are connected and composited into a single image object. Finally, the colors of the synthesized object are optimized. The latter two steps are realized by the viewpoint-aware image object synthesis technique of step 2.
3.2.1 Synthesis of the part set
Mathematically, different part sets can be regarded as different sample points of the probabilistic model. This method therefore uses a depth-first search strategy to explore the shape space of image objects. Starting from the root-node variable R, each random variable along the search path is assigned each of its possible values in turn. Consistent with the deterministic algorithm in the work of Kalogerakis et al. (KALOGERAKIS, E., CHAUDHURI, S., KOLLER, D., AND KOLTUN, V. 2012. A probabilistic model for component-based shape synthesis. ACM Trans. Graph. 31, 4 (July), 55:1–55:11.), assignments whose probability falls below a certain threshold (10^-10 in the implementation) are pruned. To keep the search feasible, the continuous variables C_l may only take values that actually occur for parts in the training data. The values of the variables C_l at the valid sample points found by the search procedure determine the part sets used for synthesis.
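The pruned depth-first enumeration can be sketched as follows, assuming domains lists the admissible values of each variable in search order (the root variable R first) and prob evaluates the model probability of a partial assignment; both are assumed interfaces.

```python
def enumerate_assignments(domains, prob, threshold=1e-10):
    # domains: list of (variable_name, admissible_values)
    # prob: partial assignment (dict) -> probability under the learned model
    results = []
    def dfs(i, assignment):
        if i == len(domains):
            results.append(dict(assignment))
            return
        name, values = domains[i]
        for v in values:
            assignment[name] = v
            if prob(assignment) >= threshold:   # prune low-probability branches
                dfs(i + 1, assignment)
            del assignment[name]
    dfs(0, {})
    return results
```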
4. Export of image object synthesis results: the image object synthesis results of the above steps are exported and stored in a general-purpose format so that they can be used in other digital media products and applications.
4.1 Export of results
Steps 2 and 3 of this method can synthesize image data for large batches of new objects. To remain compatible with common industry data formats, the synthesized images can be stored in standard image file formats as the final export format of this method. In addition, since step 2 constructs three-dimensional part proxies for the image objects, the training image objects can be exported, together with their part structure, segmentation regions, texture images, and the three-dimensional proxies obtained from the analysis, in an easily readable file format for use by other techniques or applications that need them.
4.2 Application of the results
As a general image representation, the exported results of this method can be used in all existing image editing and design systems.
Embodiment
The inventors implemented all embodiments of the present invention on a machine equipped with an Intel Core 2 Quad Q9400 CPU, 4 GB of memory, and the Windows 7 operating system. Using all the parameter values listed in the detailed description, the inventors obtained all the experimental results shown in the accompanying drawings.
Fig. 12 and Fig. 13 compare direct compositing with the image objects synthesized by this method. Direct compositing often produces obvious distortion and unnatural results, whereas the viewpoint-aware synthesis method of the present invention generates realistic novel image objects.
The inventors invited a number of users to evaluate the synthesis results generated by the Bayesian probabilistic graphical model synthesis algorithm of this method (Fig. 6 to Fig. 10). The evaluation shows that, compared with the objects in the original image sets, users consider the synthesized image objects visually reasonable and genuinely novel: synthesized objects were judged visually reasonable in 48% of all selections, with no significant difference from the 52% for the training data, and in the per-category novelty ratings (see Fig. 11), as many as 90% of users considered the results of the robot set to be new objects, while at least 79% considered the results of the desk lamp set to be new objects.
Regarding the training time of the Bayesian probabilistic graphical model: the chair set (42 images, 6 part categories, 243 parts in total) takes roughly 20 minutes; the cup set (22 images, 3 part categories, 44 parts in total) roughly 5 minutes; the desk lamp set (30 images, 3 part categories, 90 parts in total) roughly 12 minutes; the robot set (23 images, 5 part categories, 130 parts in total) roughly 15 minutes; and the toy airplane set (15 images, 4 part categories, 63 parts in total) roughly 3 minutes. Regarding image object synthesis time: enumerating the part sets needed for synthesis takes 20 seconds to 1 minute on average, connecting the parts takes about 4 seconds per image on average, and color optimization takes about 1 second on average.

Claims (2)

1. A structured synthesis method for image objects, characterized in that it comprises the following steps:
(1) Preprocessing of image object data: collect an image set of a particular object type with a digital device or from the network, requiring the object structure to be fully visible, and use image segmentation and labeling tools to obtain consistently segmented regions of the object's constituent parts;
(2) Viewpoint-aware image object synthesis: calibrate the camera parameters of a single image from simple user interaction, combine the camera parameters with the segmentation information of the image object to generate structured three-dimensional proxies, then use the three-dimensional proxies and contact-point information to connect image parts into a novel image object, and finally obtain the result image through intelligent color adjustment; this step specifically includes the following sub-steps:
(2.1) With user assistance, calibrate the coordinate origin of the world coordinate system and the directions of the coordinate axes in each input image;
(2.2) Compute the camera parameters of the input image by optimization based on the user interaction information;
(2.3) Based on the image object segmentation information obtained in step (1) and the camera parameters obtained in step (2.2), compute an initial three-dimensional proxy for each part of the image object;
(2.4) Based on the initial three-dimensional proxies of the object parts obtained in step (2.3), perform an optimization incorporating the global structural constraint information between the object parts to generate the final three-dimensional proxies;
(2.5) Based on the input assembly scheme and viewpoint information, connect the three-dimensional proxies of the constituent parts through the three-dimensional contact-point information defined in their connection slots;
(2.6) Based on the connected three-dimensional proxies, generate the image of each part under the current viewpoint from the original part image by a two-dimensional affine transformation, where the affine transformation is computed under the constraints of the three-dimensional proxies;
(2.7) Based on the two-dimensional contact-point information defined in the part connection slots, seamlessly connect all part images in image space;
(2.8) Based on the color combination information of the source images and an existing color harmony evaluation model, compute by optimization the final dominant color of each object part, transform the color of each part by a color transfer method, and obtain the result image of the new object;
(3) Training and synthesis with a Bayesian probabilistic graphical model: based on the consistently segmented image parts, perform statistical learning on the image data set of the particular object type to obtain a probabilistic graphical model that expresses the complex dependencies among shape style, object structure, part categories, and camera parameters; then, by probabilistic inference on the learned Bayesian graphical model, sample part types and styles to obtain high-probability composition schemes and viewpoint attributes of image objects, and finally synthesize the result images with the method of step (2);
(4) Export of image object synthesis results: export and store, in a general-purpose format, the result images obtained in steps (2) and (3), together with the camera parameters and three-dimensional proxy data obtained in step (2).
2. The structured synthesis method for image objects according to claim 1, characterized in that said step (3) includes the following sub-steps:
(3.1) Establish the structural information of the image objects based on the consistent segmentation results obtained in step (1), including the type, number, shape features, and interconnection attributes of the parts, where the shape features are described by point-distribution-model coordinates of the two-dimensional contours;
(3.2) Cluster the camera parameters obtained in step (2.2) to obtain a discrete representation of the image viewpoint, namely the index value of the viewpoint category;
(3.3) Based on the object structure information obtained in step (3.1) and the viewpoint information obtained in step (3.2), train a Bayesian probabilistic graphical model on the image data set of the whole particular object type using the EM algorithm and the Bayesian information criterion, and use this generative model to characterize the complex dependencies among object structure, shape style, part categories, and part connection relations over the whole image data set;
(3.4) Sample from the probabilistic graphical model to obtain composition schemes and viewpoint information for new objects, including the source of each constituent part;
(3.5) Based on the large number of assembly schemes and viewpoint information obtained in step (3.4), generate all of the result images with the method of step (2).
CN201410163775.0A 2014-04-22 2014-04-22 A structured synthesis method for image objects Active CN103942795B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410163775.0A CN103942795B (en) 2014-04-22 2014-04-22 A structured synthesis method for image objects

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410163775.0A CN103942795B (en) 2014-04-22 2014-04-22 A structured synthesis method for image objects

Publications (2)

Publication Number Publication Date
CN103942795A CN103942795A (en) 2014-07-23
CN103942795B true CN103942795B (en) 2016-08-24

Family

ID=51190446

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410163775.0A Active CN103942795B (en) 2014-04-22 2014-04-22 A structured synthesis method for image objects

Country Status (1)

Country Link
CN (1) CN103942795B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104376332A (en) * 2014-12-09 2015-02-25 深圳市捷顺科技实业股份有限公司 License plate recognition method and device
CN104850633B (en) * 2015-05-22 2018-10-12 中山大学 A kind of three-dimensional model searching system and method based on the segmentation of cartographical sketching component
US11244502B2 (en) * 2017-11-29 2022-02-08 Adobe Inc. Generating 3D structures using genetic programming to satisfy functional and geometric constraints
US10755112B2 (en) * 2018-03-13 2020-08-25 Toyota Research Institute, Inc. Systems and methods for reducing data storage in machine learning
CN111524589B (en) * 2020-04-14 2021-04-30 重庆大学 CDA (content-based discovery and analysis) shared document based health and medical big data quality control system and terminal

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7729531B2 (en) * 2006-09-19 2010-06-01 Microsoft Corporation Identifying repeated-structure elements in images
CN102800129A (en) * 2012-06-20 2012-11-28 浙江大学 Hair modeling and portrait editing method based on single image
CN103251431A (en) * 2012-01-16 2013-08-21 佳能株式会社 Information processing apparatus, information processing method, and storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7729531B2 (en) * 2006-09-19 2010-06-01 Microsoft Corporation Identifying repeated-structure elements in images
CN103251431A (en) * 2012-01-16 2013-08-21 佳能株式会社 Information processing apparatus, information processing method, and storage medium
CN102800129A (en) * 2012-06-20 2012-11-28 浙江大学 Hair modeling and portrait editing method based on single image

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"A Probabilistic Model for Component-Based Shape Synthesis";Evangelos Kalogerakis et al.;《ACM Transactions on Graphics(TOG)-Proceedings of ACM SIGGRAPH 2012》;20120630;第31卷(第4期);第2节最后1段-第3节第4段,第4节第1-7段,第5-5.2节 *
"语义驱动的三维形状分析及建模";徐凯;《中国博士学位论文全文数据库 信息科技辑》;20120715(第7期);第6.1-6.2节 *

Also Published As

Publication number Publication date
CN103942795A (en) 2014-07-23

Similar Documents

Publication Publication Date Title
Fu et al. 3d-future: 3d furniture shape with texture
EP3188033B1 (en) Reconstructing a 3d modeled object
Cheng et al. Intelligent visual media processing: When graphics meets vision
US9792725B2 (en) Method for image and video virtual hairstyle modeling
CN103942795B (en) A structured synthesis method for image objects
Shen et al. Clipgen: A deep generative model for clipart vectorization and synthesis
CN103093488B (en) A kind of virtual hair style interpolation and gradual-change animation generation method
Wu et al. Modeling and rendering of impossible figures
Li et al. Advances in 3d generation: A survey
CN115359163A (en) Three-dimensional model generation system, three-dimensional model generation method, and three-dimensional model generation device
Wang et al. A survey of deep learning-based mesh processing
Liao et al. Advances in 3D Generation: A Survey
Meyer et al. PEGASUS: Physically Enhanced Gaussian Splatting Simulation System for 6DOF Object Pose Dataset Generation
Kazmi et al. Efficient sketch‐based creation of detailed character models through data‐driven mesh deformations
Patterson et al. Landmark-based re-topology of stereo-pair acquired face meshes
Cífka et al. FocalPose++: Focal Length and Object Pose Estimation via Render and Compare
Dhondse et al. Generative adversarial networks as an advancement in 2D to 3D reconstruction techniques
Chen et al. View-aware image object compositing and synthesis from multiple sources
Wang et al. SketchBodyNet: A Sketch-Driven Multi-faceted Decoder Network for 3D Human Reconstruction
Yang et al. Application of Virtual Image Symbol Reconstruction Technology in Advertising Design
Li et al. Stereoscopic image recoloring
US20230215094A1 (en) Computer Graphics Interface Using Visual Indicator Representing Object Global Volume and/or Global Volume Changes and Method Therefore
Singla et al. Exploration and Analysis of various techniques used in 2D to 3D Image and Video Reconstruction
Feng et al. Status of research on parametric methods for the reconstruction of 3D models of the human body for virtual fitting
Li et al. Global deformation model for 3D facial combination

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant