CN100407798C - Three-dimensional geometric model building system and method - Google Patents

Three-dimensional geometric model building system and method

Info

Publication number
CN100407798C
CN100407798C CN2005100122739A CN200510012273A
Authority
CN
China
Prior art keywords
unit
model
video
geometric
dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2005100122739A
Other languages
Chinese (zh)
Other versions
CN1747559A (en)
Inventor
汪国平
王宇宙
张凯
葛文兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University filed Critical Peking University
Priority to CN2005100122739A priority Critical patent/CN100407798C/en
Publication of CN1747559A publication Critical patent/CN1747559A/en
Application granted granted Critical
Publication of CN100407798C publication Critical patent/CN100407798C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The present invention discloses a three-dimensional geometric model building system comprising a plurality of video input devices, a plurality of single-video-stream visual analysis units, a multiple-video-stream visual analysis unit, a real-time interactive semantic recognition unit, a three-dimensional geometric model building unit, a three-dimensional model rendering unit, and a video output device. The video input devices collect video streams of the designer's design actions. Each single-video-stream visual analysis unit detects the moving and non-moving regions in its video stream, estimates the direction and speed of object motion, predicts the next position, computes the edge contour of the moving object, and estimates contour features. The multiple-video-stream visual analysis unit performs binocular stereo matching, carries out three-dimensional reconstruction and fitting of the object's motion trajectory, and computes the cross section of the moving object. The real-time interactive semantic recognition unit processes the output of the multiple-video-stream visual analysis unit to obtain the human-computer interaction semantics. The three-dimensional geometric model building unit derives the three-dimensional geometric design model, and the three-dimensional model rendering unit renders the geometric model on the video output device, which displays the geometric shape of the object and the three-dimensional geometric model created by the designer.

Description

Three-dimensional geometric model building system and method
Technical field
The present invention relates to a three-dimensional geometric model building system and method, and in particular to a real-time interactive three-dimensional geometric model building system and method based on computer stereo vision, applied to computer-aided conceptual design.
Background art
At present, in the field of industrial product design, the computer-aided design (CAD) technology applied in the later detailed design phase is quite mature. In automobile manufacturing, for example, the entire vehicle styling pipeline is almost completely computer-aided. Yet the conceptual design of a product is still carried out through freehand sketching. The designer sketches according to his or her creative intent; the customer selects from dozens of sketches those that meet the requirements and feeds personalized demands back to the designer, who then refines the sketches accordingly. After repeated rounds of submission and feedback, the conceptual design of a new model is finally determined. Obviously, a planar freehand sketch lacks visual immediacy, is inconvenient for remote collaborative design, and cannot provide direct data for the accurate modeling of the subsequent detailed design phase. Judging from the development of the automobile industry at home and abroad, the automobile market is becoming saturated, competition is increasingly fierce, and the vehicle development cycle has shortened to about one year. These new challenges drive automobile manufacturers to seek more efficient and effective design methods to accelerate product development. With today's ever-rising level of informatization, designers hope to express design ideas freely through automated computer-aided conceptual design, and to connect automatically with the other stages of the process, thereby fully informatizing the automobile design workflow.
It is therefore worthwhile to study geometric modeling methods and devices for computer-aided conceptual design that input a person's geometric design concepts into a computer system through natural human-computer interaction and build intuitive three-dimensional geometric models. Such an apparatus and method would have significant application value for automobile manufacturing.
Such a natural real-time interactive three-dimensional geometric design system involves three technical fields: three-dimensional geometric modeling technology, natural real-time human-computer interaction technology, and the vision computing technology that supports natural real-time interactive geometric design.
In three-dimensional geometric modeling, curved shapes can be designed and represented either with the commonly used control-vertex methods, such as NURBS, or by generating the surface (or solid, or curve) through motion. The motion-generation approach has a wide range of applications, for example the design and representation of elongated strip-shaped object surfaces; an aircraft shape can likewise be decomposed into a union of strip surfaces (solids). Motion generation is intuitive and simple, simplifies much surface modeling work, and is therefore well liked by designers. In many situations this kind of surface, known as a swept surface, is more satisfactory in efficiency and quality than other modeling methods, rather than being confined to the static push-and-pull of control vertices.
Research on three-dimensional geometric modeling by real-time interactive motion-generated surfaces involves human-computer interaction theory and its implementation. Human-computer interaction models the person's control of the moving object to achieve the modeling goal, and is one of the key factors limiting applications; better interaction technology makes computers easier to use and improves productivity. The HMGR (Hand Motion Gesture Recognition System) project of the human-computer interaction laboratory at the University of Washington studies gesture recognition using hidden Markov models. With this system, an interactive user interface designer can build a multi-channel input system that converts hand motion in three-dimensional space into gesture symbols, which can then be combined with other inputs such as speech and static sign language. However, the device relies on a data glove sensor and is confined to simple applications.
The exploratory 3D drawing interface developed at Brown University aims to improve the usability of gesture interfaces and of command-based modeling. In this system, by highlighting the relevant geometric parts of the scene, the user conveys to the system a cue as to which kind of operation is required. The system infers the possible user operations from this cue and displays them as thumbnails; the user completes the editing operation by clicking a thumbnail. The operation-cue mechanism lets the user specify geometric relations between graphical components in the scene, and when the recognition rate of the operation model is low, multiple thumbnail suggestions can be used to resolve the ambiguity.
Japanese patent application No. 00118340 provides an apparatus, recognition method and program recording medium for hand shape and gesture recognition from images of complex hand shapes.
The Nara Institute of Science and Technology (Japan) has built a hybrid three-dimensional object modeling system, NIME (Immersive Modeling Environment). Inheriting the advantages of the conventional two-dimensional GUI and of three-dimensional immersive modeling environments, the system uses a tilted rear-projection display device to combine the two-dimensional and three-dimensional modeling environments into a whole. Two-dimensional GUI modeling interaction takes place on the display surface, while stereo imaging and a pen-style input device with six degrees of freedom realize a seamless transition between the two-dimensional and three-dimensional modeling environments.
The three-dimensional geometric modeling method of real-time interactive motion-generated surfaces also involves vision computing theory and technology. The research goal of vision computing is to give computers the ability to perceive three-dimensional environment information from two-dimensional images, that is, to reconstruct three-dimensional shape, object motion and geometric position in space; it is the most important theoretical tool for capturing and studying the sweeping motion of real-world objects and their motion envelopes, and for building geometric models of those envelopes. Over the past decade, real-time dense stereo disparity matching has become a reality. Until recently, however, systems capable of true real-time processing all required special-purpose hardware support such as digital signal processors (DSP) or field-programmable gate arrays (FPGA). For example, the stereo matching system of J. Woodfill and Von Herzen uses 16 Xilinx 4025 FPGAs to process images of 320*240 pixels at 42 frames per second, while P. Corke and Dunn, using a similar algorithm implemented on FPGA hardware, process 256*256-pixel images at 30 frames per second.
Chinese patent application No. 03153504 uses phase measurement and stereo vision, projecting a grating onto the object surface to measure the three-dimensional surface profile of the object.
As for conceptual design sketch recognition tools, the freehand sketching tool of the computer graphics group at Brown University combines characteristics of paper sketching with those of computer CAD systems and provides rough 3D polyhedral modeling based on gesture interaction. Adopting the traditional early 2D interface concept, it gives the user, through freehand sketching, the ability to draw various three-dimensional primitives according to simple placement rules.
The University of Tokyo has designed an interactive free-form surface design tool based on freehand sketching, whose goal is a simple, fast modeling system for unstructured models, such as rounded, bulging toys or similar objects. The user interactively draws two-dimensional strokes on the screen, and a three-dimensional polygonal surface is constructed from the two-dimensional silhouettes.
The ARCADE research project of the Fraunhofer Institute for Computer Graphics (Germany) studies free-form surface modeling using a virtual design desk. Standing in front of the virtual design desk, the user models free-form surfaces with data-glove gestures. The ARCADE system uses 3D input devices to achieve efficient and accurate modeling. Its interaction techniques include free-space object creation, creation based on other objects, implicit Boolean operations, 3D picking and fast moving, layout-based contextual modification, discrete operations, and two-handed input.
Patent application No. 00103458 discloses an apparatus with which, during word processing, the user performs character writing and manuscript editing with a pen and editing gestures.
From the systems described above it can be seen that the technologies involved in natural-interaction three-dimensional geometric modeling tools have been studied extensively, driven by the application demands of product conceptual design; yet these systems still have many problems to be solved. Summing up, the main shortcomings include the following aspects:
Human-computer interaction lacks sufficient naturalness. Whether virtual design desk, data glove, 3D mouse, touch sensor, or online sketch recognition device, all require the user to wear or directly touch a three-dimensional input tool. This mode of physical man-machine connection through dedicated tools inconveniences the user, and unnatural design tools hinder the immediate capture of the designer's creative inspiration. The most direct evidence of this problem is that hand-drawn sketching, despite requiring repeated attempts, is still the dominant tool in actual design practice.
Geometric modeling interaction is insufficiently direct. Freehand sketching requires converting the concept into a two-dimensional model, which a recognition tool then converts into a three-dimensional model. Online sketching systems, by binding these two processes together through the tool, actually increase the constraints on the designer. Data-glove-based systems operate on the model in the virtual scene through hand actions, so the designer builds and modifies the model indirectly: to change the shape of the model, the designer must manipulate control points in order to feed back the conceptual model being designed. Design thinking is thus frequently constrained and interrupted by tools such as the data glove, making the design process discontinuous and impairing the design result.
The range and mode of application lack flexibility. Sketching systems, pen input, data gloves, force-feedback devices, virtual design desks and the like realize only single-modality external input. For example, a designer may wish to obtain a silhouette of some concrete physical object during the design process and build a geometric model from that contour in a simple way. Existing systems use special equipment, and introducing the user directly into the conceptual design process remains rather difficult.
The implementation technology itself is not mature enough. Freehand sketching must convert a two-dimensional model into a three-dimensional model and therefore suffers from recognition error. Because sketches are arbitrary and have great freedom, sample collection is very difficult; the semantics of a sketch in particular are ambiguous and uncertain, so recognition targets can neither be fully enumerated by templates nor supported by a predefined dictionary for semantic interpretation. Online sketching improves recognition accuracy, but in actual use the geometric design process is frequently interrupted by system interaction, so efficiency is low. Data gloves and the like also have problems such as the limited spatial range and positioning resolution of their sensors.
The systems lack generality. They use special equipment and are therefore expensive, raising product design costs as well as training and usage costs. This undoubtedly raises the entry threshold for users and limits the range of application. The popularization of the computer, by contrast, is owed to falling cost and increasing versatility.
Summary of the invention
The present invention has been made to overcome the problems of existing systems. Its object is to use general-purpose, convenient and economical physical devices and a natural, real-time interaction method to create, modify and edit three-dimensional geometric bodies according to the surface shape and spatial motion trajectory of the hands, or of a hand-held object, in three-dimensional space, using motion-based geometric modeling, thereby realizing geometric modeling for the conceptual design of three-dimensional object shapes.
To achieve this object, the invention provides a three-dimensional geometric modeling method, comprising:
1. A video input step of collecting video streams from a plurality of video input devices (101) distributed around the designer.
2. A single-video-stream visual analysis step in which each of the video streams collected in the video input step is processed by its own single-video-stream visual analysis, to detect the moving and non-moving regions in the video stream, estimate the direction and speed of the moving object, predict its next position, compute the edge contour of the moving object, and estimate contour features.
3. A multiple-video-stream visual analysis step of receiving the results of the single-video-stream visual analyses, performing binocular stereo matching, carrying out three-dimensional reconstruction and fitting of the object motion trajectory based on the obtained contours and features, computing the cross section of the moving object, and providing the obtained object model, object motion trajectory and object cross-section contour to the real-time interactive semantic recognition step.
4. A real-time interactive semantic recognition step of processing the output of the multiple-video-stream visual analysis step to obtain the human-computer interaction semantics, and interpreting the obtained semantics using semantic definitions stored in advance in a semantic model storage unit.
5. A three-dimensional geometric modeling step of processing the object model, object motion trajectory and object cross-section contour output by the multiple-video-stream visual analysis step together with the output of the real-time interactive semantic recognition step, thereby obtaining the three-dimensional geometric design model, and storing the result in a three-dimensional geometric model storage unit.
6. A three-dimensional model rendering step of rendering the three-dimensional geometric model stored in real time in the three-dimensional geometric model storage unit on the video output device.
7. A video output step of displaying the designer's three-dimensional geometric model on the video output device.
In addition, the present invention also provides a three-dimensional geometric model building system, comprising:
1. A plurality of video input devices distributed around the designer for collecting video streams of the designer's design actions.
2. A single-video-stream visual analysis unit corresponding to each video input device, so that each of the video streams collected by the video input devices is processed by its own single-video-stream visual analysis unit, to detect the moving and non-moving regions in the video stream, estimate the direction and speed of the moving object, predict its next position, compute the edge contour of the moving object, and estimate contour features.
3. A multiple-video-stream visual analysis unit for receiving the results of the single-video-stream visual analysis units, performing binocular stereo matching, carrying out three-dimensional reconstruction and motion-trajectory fitting based on the obtained contours and features, computing the cross section of the moving object, and providing the obtained object model, object motion trajectory and object cross-section contour to the real-time interactive semantic recognition unit.
4. A real-time interactive semantic recognition unit for processing the output of the multiple-video-stream visual analysis unit to obtain the human-computer interaction semantics, and for interpreting the obtained semantics using semantic definitions stored in advance in a semantic model storage unit.
5. A three-dimensional geometric modeling unit for comprehensively processing the output of the multiple-video-stream visual analysis unit and of the real-time interactive semantic recognition unit, thereby obtaining the three-dimensional geometric design model, and storing the result in a three-dimensional geometric model storage unit.
6. A three-dimensional model rendering unit that takes the three-dimensional geometric model stored in real time in the three-dimensional geometric model storage unit as input and renders the geometric model on the video output device.
7. A video output device for displaying the designer's three-dimensional geometric model.
According to a further aspect of the invention, a single-video-stream visual analysis processing method is provided, comprising: an image analysis method that processes the video signal collected from the video input device to obtain feature video streams of different resolution scales and different feature elements, providing input for motion detection and stereo matching; a real-time motion detection method that detects the moving regions and the non-moving background regions in the video stream; a motion estimation and prediction method that estimates the direction and speed of motion and predicts the next position; and a contour computation method for computing the edge contour of the moving object and estimating contour features.
According to another aspect of the invention, a multiple-video-stream visual analysis processing method is provided, comprising: a stereo matching algorithm that subjects the data output by two video motion detections to stereo matching and disparity computation, obtaining the region segmentation and depth information of the moving object; a three-dimensional model building method that builds a stereo model from the depth data output by stereo matching; a shape-from-contour method that, from the contour descriptions obtained by contour computation, builds the three-dimensional model of the moving body and its principal projection features and recovers the three-dimensional contour of the moving object; a cross-section computation method that builds the cross-section contour relative to the motion trajectory from the contour data and the trajectory; and a trajectory fitting method that smooths the data obtained from motion estimation and prediction to obtain a smooth motion trajectory.
According to a fifth aspect of the invention, a real-time interactive semantic recognition method is provided, comprising: a collision detection method that determines from the motion trajectory and outline of the moving object that the semantic type is an operational semantic, then performs collision detection against the already built three-dimensional geometric model and determines the position and mode of the collision; an operational-semantics analysis method that determines, from the collision detection result, operational semantics such as the operand and operation type; an interaction-semantics analysis method that processes the input determined by the motion trajectory and the obtained cross-section contour, producing the interaction-semantics analysis result; and a speech semantic analysis that obtains the semantics of voice interaction from speech analysis.
According to a sixth aspect of the invention, a three-dimensional geometric modeling method is provided, comprising: a repetition processing method for eliminating repeated, overlapping motion of the moving object in the video images during its movement; a motion processing method for eliminating the trembling and jitter produced by the moving object during its movement; an envelope computation method for computing, from the motion trajectory and the cross-section contour after jitter and repetition are removed, the envelope surface generated by the object's motion; and a shape editing method for modifying the three-dimensional geometric model that has been built.
According to a seventh aspect of the invention, a picture output device is provided, comprising: a display unit for showing the geometric model, spatial position and attitude of the moving object, and for showing the three-dimensional geometric models already built.
According to an eighth aspect of the invention, a rendering method is provided that renders the three-dimensional geometric design model on the display unit, plotting the geometric model, relative position and attitude of the moving object on the display unit.
Using this system or method for product conceptual design produces beneficial effects in the following respects.
1. By using inexpensive, non-special-purpose physical equipment, it reduces product design costs and widens the range of application.
2. The natural real-time interaction mode makes it easy to bring ordinary users into an open, iterative conceptual design process, helps eliminate passivity and one-sidedness in product shape design, and directly incorporates many factors into the product development, design and manufacturing process.
3. A design environment based on vision and common equipment makes it easy to introduce environmental features. Environment is a key factor affecting product design, including the physical features of the environment at the micro level and socio-cultural features at the macro level. The device of this invention allows the designer to walk out of the design office and carry out conceptual design in the product's environment of use, offering a practical way to introduce environmental features well.
4. It supports three-dimensional visual design. The main task of conceptual design is giving the designed product its shape. Conceptual design modeling methods range from rule definitions to high-level visual expression. The model representations currently in common use include language models, geometric models, graphical models, object models, knowledge models and image models, of which the image model is the one closest to human thinking and reasoning. This invention is an important means of applying visual thinking models to design practice.
5. It yields directly digital three-dimensional products. This invention provides direct three-dimensional digital products as the system output, enabling efficient design and evaluation feedback.
Description of drawings
Fig. 1 is a block diagram of the structure of the three-dimensional geometric model building system 100 according to the first embodiment of the present invention;
Fig. 2 is a schematic diagram of the concrete camera layout of the first embodiment;
Fig. 3 is a block diagram of the structure of the single/multiple-video-stream visual analysis units of the first embodiment;
Fig. 4 is an operational flowchart of the multiple-video-stream visual analysis unit of the first embodiment;
Fig. 5 shows the coordinate system of the stereo reconstruction operation of the present invention;
Fig. 6 is a block diagram of the semantic recognition process of the first embodiment;
Fig. 7 is the three-dimensional geometric modeling flowchart of the first embodiment;
Fig. 8 is a schematic diagram of the computer system of the first embodiment;
Fig. 9 is a block diagram of the three-dimensional geometric model building system 200 according to the second embodiment;
Fig. 10 is a block diagram of the semantic recognition process of the second embodiment;
Fig. 11 is a block diagram of the three-dimensional geometric model building system 300 according to the third embodiment;
Fig. 12 shows the concrete camera layout of the third embodiment;
Fig. 13 is an operational flowchart of the multiple-video-stream visual analysis unit of the third embodiment;
Fig. 14 is a schematic diagram of the computer system of the third embodiment;
Fig. 15 shows examples of the interaction gesture modes of the first embodiment.
Embodiment
First specific embodiment
Fig. 1 is a block diagram of the structure of the three-dimensional geometric model building system 100 according to the first embodiment of the present invention. As shown in Fig. 1, the video input device 101 may be a digital camera and is used to capture images of the geometric design motions of the three-dimensional modeling designer. In the present embodiment the video input device 101 consists of four digital cameras C01, C02, C03 and C04, whose concrete layout in one embodiment of the invention is shown in Fig. 2: they are placed respectively to the right, right front, left front and left of the designer, with their height above the ground and their attitude set to suit the designer's gestures, that is, the cameras should record the designer's design actions completely without hindering the designer's operation and other activities. A single-video-stream visual analysis unit 104 is provided for each digital camera. Each camera is directly connected, through a general-purpose interface and in a known manner, to its corresponding single-video-stream visual analysis unit 104. Each unit 104 processes the continuous video stream collected from its corresponding video input device 101: it detects the moving and non-moving regions in the video stream, estimates the direction and speed of object motion and predicts the next position, computes the edge contour of the object and estimates contour features, and provides the results to the multiple-video-stream visual analysis unit 105. The multiple-video-stream visual analysis unit 105 receives the processed output of the four single-video-stream visual analysis units 104, including the motion detection results, the direction and speed of the moving object, and the object contours and contour features. The unit 105 then processes the input data, performing binocular stereo matching, carrying out three-dimensional reconstruction and motion-trajectory fitting based on the contours and features, computing the cross section of the moving object, and so on. The output of this processing consists of the object model, the object motion trajectory and the object cross-section contour, which are provided as input to the real-time interactive semantic recognition unit 106. The unit 106 processes the output of the multiple-video-stream visual analysis unit 105 to obtain the human-computer interaction semantics, and interprets them using the semantic definitions stored in advance in the semantic model storage unit 110. The three-dimensional geometric modeling unit 107 comprehensively processes the output of the multiple-video-stream visual analysis unit 105 and of the real-time interactive semantic recognition unit 106 to obtain the three-dimensional geometric design model, and stores the result in the three-dimensional geometric model storage unit 111. The three-dimensional model rendering unit 108 renders the geometric model on the video output device 109 based on the three-dimensional geometric model stored in real time in the storage unit 111. The video output device 109 displays the geometric shape of the object and the three-dimensional geometric model created by the designer.
The operation and system configuration of the three-dimensional geometric model building system of the present embodiment are described in detail below with reference to the accompanying drawings. The system runs through several main working states: system initialization, camera calibration, object model building, and motion geometric modeling design.
<System initialization>
The initial working state of the system is described first. After the three-dimensional modeling designer starts the three-dimensional geometric model building system 100, the system initialization process begins. System initialization includes loading and establishing the initial system parameters and building the statistical model of the background video.
The system initialization process first loads the initial working environment parameters according to the predetermined settings and the current system configuration; the initialization environment parameters in this example are shown in Table 1.
Table 1: Initialization parameter table of the embodiment

System operational parameters: user ID, user list, initial model ID, model list, camera parameter table, camera layout parameter table, image resolution, scale factor, background reference image table, working file directory, data file content.

User table: user ID, initial geometric model ID, most recent working model ID.

Model table: model ID, model type, model data structure pointer, model file pointer.

Camera parameter table: number of cameras, camera list pointer, camera layout parameter pointer.

Camera table: camera 1 ID, camera 1 parameter table, camera 2 ID, camera 2 parameter table, ..., camera i parameter table.

Camera layout parameter table: camera 1 layout parameter head pointer, camera 2 layout parameter head pointer, ..., camera i layout parameter table (baseline length to camera 1, baseline direction to camera 1, baseline length to camera 2, baseline direction to camera 2, ...).
After the initial working parameters are loaded, the video input devices 101 (C01, C02, C03 and C04) continuously acquire multiple image frames of the working background and provide these images to the single-video-stream visual analysis unit 104 corresponding to each device, which processes them to build the statistical model of the initial background. In the present embodiment, as shown in Fig. 3, the single-video-stream visual analysis unit 104 includes an image analysis unit 1041 that processes the video signal collected by the video input device 101 to obtain the statistical model of the background, namely feature video streams of different resolution scales and different feature elements.
In a specific embodiment of the present invention, the image analysis unit 1041 builds the background statistical model by the following method. For the background B_i corresponding to each camera C_i in the system, an initial background model M_i is built. For each pixel p in B_i, define \mu_p as the expectation of its color value and \sigma_p^2 as the variance of the color value distribution; then:

$$\mu_p = \frac{1}{n}\sum_{t=1}^{n} h_p^t \qquad (1)$$

$$\sigma_p^2 = \frac{1}{n}\sum_{t=1}^{n} \left(h_p^t - \mu_p\right)^2 \qquad (2)$$

where h_p^t is the color value of point p in the t-th image frame. The pair (\mu_p, \sigma_p^2) of each point p then constitutes the background model of B_i:

$$M_i = \left\{(\mu_p, \sigma_p^2) \mid p \in B_i\right\} \qquad (3)$$
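As an illustration, the per-pixel statistics of equations (1) to (3) can be accumulated directly over a stack of background frames. The following is a minimal sketch, assuming grayscale frames stacked in a NumPy array (the function name and data layout are ours, not the patent's):

```python
import numpy as np

def build_background_model(frames: np.ndarray):
    """Background model of eqs. (1)-(3).

    frames: array of shape (n, H, W) holding n background images
            from one camera (grayscale for simplicity).
    Returns the per-pixel mean mu and variance sigma2, i.e. M_i.
    """
    frames = frames.astype(float)
    mu = frames.mean(axis=0)                    # eq. (1)
    sigma2 = ((frames - mu) ** 2).mean(axis=0)  # eq. (2)
    return mu, sigma2                           # eq. (3)
```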
In addition, the system generates, according to predetermined settings, initial object models with simple, regular geometric surfaces, for example cuboids and spheres. By changing the system settings, one can select which predefined object to generate, or generate no initial object at all.
<Camera calibration>
When the system is used for the first time, or when the layout, position or attitude of a camera has changed, or a camera has been replaced, camera calibration must be performed, that is, a calibration working process that establishes the camera parameters. In this working state, each camera in the system acquires images, and the intrinsic and extrinsic parameters of each camera are computed according to camera calibration methods well known to those skilled in the art. If the system is not being used for the first time and the layout, position and attitude of the cameras have not changed, nor have the cameras been replaced, then the calibration process of establishing camera parameters is not needed.
<Object model building>
After the camera calibration process is finished, the designer can use the system of the present embodiment to carry out three-dimensional geometric modeling with a hand or a hand-held object at a suitable position in front of the camera group. The geometric model built from the shape of the contour of the hand or hand-held object is called the object three-dimensional model, or simply the object model. The object model is a dynamic model that changes instantly with the spatial position, attitude and shape of the hand or hand-held object. The present invention uses the object model as the design tool for the three-dimensional geometric design; at the same time, a simple hand-held object model itself can serve as the initial model of the three-dimensional geometric design.
<Motion geometric modeling design>
After the system has gone through the three stages of initialization, camera calibration and object model building, it enters the motion geometric modeling design state. The designer, at a suitable position in front of the camera group, inputs his or her three-dimensional geometric design into the three-dimensional geometric model building system 100 through the contour and motion of a hand or hand-held object, and obtains the three-dimensional geometric modeling result. The three-dimensional geometric model that is built is stored in real time in the three-dimensional geometric model storage unit 111 in a prescribed file format and output in real time to the video output device 109, such as a CRT or LCD display. It can be stored, for example, on a storage medium such as a magnetic disk, or transmitted and stored through a computer network or a mobile storage device. In a specific embodiment, a three-dimensional geometric model can be stored in the format shown in Table 2.
Table 2: Three-dimensional geometric model data structure

Model table: model code, model ID, model type code, model type ID, model attribute table, model parameter table, version number, object count, object list pointer.

Object list: object type, object ID, parent object pointer, child object pointer, point count, edge count, face count, point data structure pointer, point data length, edge data structure pointer, edge data length, face data structure pointer, face data length.

Point data structure: point number (8 bytes), X (4 bytes), Y (4 bytes), Z (4 bytes).

Edge data structure: edge number (8 bytes), point count (8 bytes), head pointer of the point-number data chain.

Face data structure: face number (8 bytes), edge count (8 bytes), head pointer of the edge-number data chain.
As shown in Table 2, the three-dimensional geometric model data structure comprises the model code, model ID, model type code, model type ID, model attribute table, model parameter table, version number, object count, object list pointer and other data items. The model ID uniquely identifies the model. The model type code indicates the type of the model: in the present embodiment, the three-dimensional geometric models generated as design results and the object models generated as design tools are stored with the same data structure, so the model type is used to distinguish three-dimensional geometric models from object models. For design convenience, the system also classifies the predefined object models and assigns each a unique model type ID. The model attribute table defines the attributes the geometric model possesses, for example scale and position attributes. The version number indicates the version of the geometric model data structure. A model may consist of several objects; the object list describes the attributes of each object, including object type, object ID, parent object pointer, child object pointer, geometric data storage structure, and so on.
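For illustration only, the boundary-representation layout of Table 2 could be declared as follows; the class and field names paraphrase the table and are not taken from the patent's actual file format:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Point:
    number: int               # point number (8 bytes in the stored format)
    x: float
    y: float
    z: float

@dataclass
class Edge:
    number: int               # edge number
    point_numbers: List[int]  # chain of point numbers

@dataclass
class Face:
    number: int               # face number
    edge_numbers: List[int]   # chain of edge numbers

@dataclass
class ModelObject:
    object_type: int
    object_id: int
    points: List[Point] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)
    faces: List[Face] = field(default_factory=list)

@dataclass
class Model:
    model_id: int
    model_type: int           # distinguishes design results from object models
    version: int
    objects: List[ModelObject] = field(default_factory=list)
```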
The designer can express his or her conceptual design to the computer through geometric design in several ways.

The first way is three-dimensional geometric design directly with the designer's hands: the designer expresses the three-dimensional geometric design through the motion envelope and gestures of the hands.

The second way is three-dimensional body design with a hand-held object. Specifically: (1) the designer expresses the three-dimensional geometric design to be built purely through the contour of the hand-held object; for example, to design a ball, the designer places a ball in front of the cameras, and the system automatically builds the three-dimensional contour of the ball as the object model; the designer then issues a command to copy the object model out as the three-dimensional geometric model; (2) the designer expresses a swept three-dimensional geometric model jointly through the contour of the hand-held object and its motion; for example, the designer holds a ball in front of the cameras, the system automatically builds the three-dimensional contour of the ball as the object model, the designer moves the ball along a circular arc in space, and the system, from the object model of the ball's three-dimensional contour and the ball's motion trajectory, builds and outputs a three-dimensional geometric design model formed by the motion envelope of the ball, namely a circular-arc pipe in space.

The third way is editing and modifying an already built three-dimensional geometric model. The designer uses a hand and/or a hand-held object, according to the predetermined interaction semantic model, to perform three-dimensional geometric editing on the existing three-dimensional geometric model, for example stretching or deforming it, generating a new three-dimensional geometric model.

Of course, as those skilled in the art will readily understand, the designer can combine the above three ways to express a design concept.

Whichever of the above ways or combinations of ways is used, the three-dimensional geometric model building system of the present embodiment subjects the video data obtained by the multiple cameras to the following processing:
● Single-video-stream visual analysis
In this stage, each single-video-stream visual analysis unit 104 processes the continuous video stream collected from its corresponding video input device 101: it detects the moving and non-moving regions in the video stream, estimates the direction and speed of object motion, predicts the next position, computes the edge contour of the object, estimates contour features, and provides the results to the multiple-video-stream visual analysis unit 105. In particular, this comprises the following operations:
I. Image analysis
As shown in Fig. 3, each single-video-stream visual analysis unit 104 includes an image analysis unit 1041. For the input of each video input device 101, the image analysis unit 1041 performs the following steps: first it obtains each image frame from the video input device 101, then it builds a hierarchy according to image resolution and outputs the layered image sequence.
In one embodiment of the invention, the image analysis unit 1041 builds the layered image data as follows. A three-level pyramid structure is adopted: for the raw image M_L, an image sequence {M_L, M_{L-1}, M_{L-2}} is built, where M_{i-1} is the image obtained by halving the resolution of M_i. M_L is called the pyramid bottom, or full-resolution layer; M_{L-1} the pyramid middle, or intermediate-resolution layer; and M_{L-2} the pyramid top, or low-resolution layer. The image pyramid data structure is shown in Table 3.
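A minimal sketch of this three-level pyramid, assuming OpenCV is available and that halving the resolution is done by Gaussian downsampling:

```python
import cv2

def build_pyramid(raw):
    """Three-level pyramid {M_L, M_L-1, M_L-2}: each higher level
    halves the resolution of the level below it."""
    mid = cv2.pyrDown(raw)   # intermediate-resolution layer
    low = cv2.pyrDown(mid)   # low-resolution layer
    return [raw, mid, low]
```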
Table 3: Image data buffer queue description

Queue header: maximum table length, queue head pointer, queue tail pointer, frame data pointer 1, frame data pointer 2, frame data pointer 3, frame data pointer 4, frame data pointer 5, frame data pointer 6, frame data pointer 7.

Frame data structure: frame buffer sequence mark, frame type mark, raw image data pointer, intermediate-resolution image data pointer, low-resolution image data pointer, successor pointer, predecessor pointer.
During processing, the time-series image data are stored in turn in the image processing buffer queue, whose length is 7 image frames. The queue is implemented as a static circular list.
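As an illustrative sketch, such a fixed-length circular frame queue might look as follows; the 7-frame capacity follows the embodiment, while the class and method names are ours:

```python
class FrameQueue:
    """Static circular list holding the most recent `capacity` frames."""

    def __init__(self, capacity: int = 7):
        self.slots = [None] * capacity
        self.head = 0        # index of the oldest frame
        self.count = 0

    def push(self, frame):
        tail = (self.head + self.count) % len(self.slots)
        if self.count == len(self.slots):
            # queue full: overwrite the oldest frame
            self.head = (self.head + 1) % len(self.slots)
        else:
            self.count += 1
        self.slots[tail] = frame
```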
II. Real-time motion detection
As shown in Fig. 3, each single-video-stream visual analysis unit 104 also includes a motion detection unit 1042 for performing real-time motion detection on the results of each image analysis unit 1041. The target of motion detection is to detect the moving regions in the image and the direction of motion, so as to obtain an accurate segmentation of the moving regions and of their edges. The real-time motion detection unit detects the moving regions with an image difference algorithm on the middle layer of the multi-resolution image pyramid and obtains the direction of motion by an optical flow method. In the present embodiment, the motion detection unit 1042 performs the following operations on the layered data from the image analysis unit 1041:
a) Background elimination
For the image I_i^t acquired by camera C_i at time t, the foreground region is extracted as follows: if the color value of point p in image I_i^t is h_p, the image is binarized by

$$d_p = \begin{cases} 1, & |h_p - \mu_p| \le 3\sigma_p \\ 0, & |h_p - \mu_p| > 3\sigma_p \end{cases} \qquad (4)$$

All points p in I_i^t with d_p equal to zero constitute the foreground region F_i.
b) Image difference computation
The difference image I_d(i, j) is the binary image d(i, j):

$$d(i,j) = \begin{cases} 0, & |f_1(i,j) - f_2(i,j)| \le \varepsilon \\ 1, & \text{otherwise} \end{cases} \qquad (5)$$

where f_1(i, j) and f_2(i, j) are two adjacent frames in the time-series images and \varepsilon is a very small positive number. In the difference image, pixel positions with value 1 indicate where motion occurs; the moving region is thus obtained from this formula.
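A sketch of the two binarizations, assuming grayscale frames and the background model (mu, sigma) of equations (1) and (2); the difference threshold is an illustrative value:

```python
import numpy as np

def foreground_region(frame, mu, sigma):
    """Eq. (4): d_p = 1 when within 3*sigma of the background mean;
    the foreground F_i is the set of points where d_p = 0."""
    d = np.abs(frame.astype(float) - mu) <= 3.0 * sigma
    return ~d

def difference_image(f1, f2, eps=5.0):
    """Eq. (5): binary difference image; value 1 marks motion."""
    diff = np.abs(f1.astype(float) - f2.astype(float))
    return (diff > eps).astype(np.uint8)
```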
c) Planar motion parameter computation
The computation of the motion velocity c(u, v) on the projection plane by the optical flow method proceeds as follows:
1. For all pixels (i, j) of the image, set the optical flow initial value c(i, j) = 0;
2. Let k denote the iteration number; for all pixels (i, j), evaluate formulas (6) and (7):

$$u^{k}(i,j) = \bar{u}^{k-1}(i,j) - f_x(i,j)\,\frac{P(i,j)}{D(i,j)} \qquad (6)$$

$$v^{k}(i,j) = \bar{v}^{k-1}(i,j) - f_y(i,j)\,\frac{P(i,j)}{D(i,j)} \qquad (7)$$

where

$$P(i,j) = f_x(i,j)\,\bar{u} + f_y(i,j)\,\bar{v} + f_t(i,j) \qquad (8)$$

$$D(i,j) = \lambda^2 + f_x^2(i,j) + f_y^2(i,j) \qquad (9)$$

Here \bar{u} and \bar{v} denote the means over the neighborhoods of u and v, which can be computed with a local image smoothing operator. The value of \lambda is chosen according to the amount of noise in the image: when the noise is large, a smaller value is taken; when the noise is small, a larger value is taken.
3. When

$$\sum_i \sum_j E^2(i,j) < \varepsilon \qquad (10)$$

the iteration stops, where

$$E^2(x,y) = \left(f_x u + f_y v + f_t\right)^2 + \lambda\left(u_x^2 + u_y^2 + v_x^2 + v_y^2\right) \qquad (11)$$
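The iteration of formulas (6) to (11) is essentially Horn-Schunck optical flow; the following NumPy sketch follows that reading, with derivative and averaging kernels that are common choices rather than taken from the patent:

```python
import numpy as np
from scipy.ndimage import convolve

# neighborhood-averaging kernel for u-bar and v-bar
AVG = np.array([[1, 2, 1], [2, 0, 2], [1, 2, 1]], dtype=float) / 12.0
DX = np.array([[-1, 1], [-1, 1]], dtype=float) * 0.25
DY = np.array([[-1, -1], [1, 1]], dtype=float) * 0.25

def optical_flow(f1, f2, lam=10.0, max_iter=100, eps=1e-3):
    """Iterate eqs. (6)-(9) until the residual criterion (10) is met."""
    f1 = f1.astype(float)
    f2 = f2.astype(float)
    fx = convolve(f1, DX) + convolve(f2, DX)
    fy = convolve(f1, DY) + convolve(f2, DY)
    ft = f2 - f1
    u = np.zeros_like(f1)
    v = np.zeros_like(f1)
    D = lam ** 2 + fx ** 2 + fy ** 2              # eq. (9)
    for _ in range(max_iter):
        u_bar = convolve(u, AVG)
        v_bar = convolve(v, AVG)
        P = fx * u_bar + fy * v_bar + ft          # eq. (8)
        u_new = u_bar - fx * P / D                # eq. (6)
        v_new = v_bar - fy * P / D                # eq. (7)
        converged = np.sum((u_new - u) ** 2 + (v_new - v) ** 2) < eps  # cf. (10)
        u, v = u_new, v_new
        if converged:
            break
    return u, v
```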
III. Contour computation
As shown in Fig. 3, the single-video-stream visual analysis unit 104 also includes a contour computation unit 1043, which obtains the hand region within the moving object region through a skin-color-based detection algorithm. Removing the hand region yields the region of the hand-held object, and the edge contour of the region is computed on this basis. In the present embodiment, the contour computation unit 1043 performs hand edge extraction and fine contour detection on the results from the motion detection unit 1042:
a) Hand edge extraction
In the present embodiment, the segmentation and contour detection of the hand region uses a strategy that fuses motion information with skin color information.
Region extraction based on motion information is described first:
In this process the camera remains static, and the color image sequence it captures consists of R, G and B components. In the contour computation unit 1043, define s = (x, y) as the image plane coordinates, t as the time coordinate, and i as any one component of the RGB space; I_t^i then denotes the luminance image of component i at time t. Using the three consecutive frames at times t-\Delta t, t and t+\Delta t, the motion image d_t^i of component i at time t is computed as

$$d_t^i(s) = \min\left(\left|I_t^i(s) - I_{t-\Delta t}^i(s)\right|,\ \left|I_t^i(s) - I_{t+\Delta t}^i(s)\right|\right) \qquad (12)$$

$$i = r, g, b \qquad (13)$$

Combining the r, g and b components gives the color motion image d_t at time t:

$$d_t(s) = \max\left(d_t^r(s),\ d_t^g(s),\ d_t^b(s)\right) \qquad (14)$$

Finally, the motion image is smoothed and binarized to obtain the moving-region image.
Hand region recognition based on skin color detection is described next:
It is known that the brightness of a given color differs under differently distributed illumination, while the perception of the color remains basically constant. The contour computation unit 1043 exploits precisely this property of the distribution and brightness of human skin color in the Luv space, and performs skin color detection in the Luv color space. Its operating steps are as follows:
1. Color space conversion: the RGB color space is converted to the Luv color space;
2. With the skin detection result of the previous frame (or the initial skin color features) as the initial value, a mean shift algorithm segments the moving region. The density estimate of each color bin of the current frame is treated as a probability density function; the difference between the mean of the probability function over the region and its central value is defined as the mean shift vector. Since the mean shift vector always points in the direction of maximum probability density, the actual location of maximum density can be found by search. The concrete computation is as follows:
Let the color feature vector x_i of pixel p_i in the image be defined as

$$x_i = (L, u, v) \qquad (15)$$

where L is the relative brightness of the image and u, v are the u*, v* chromaticity coordinates. Let x_0 denote the color feature vector of point p_0 and x_i the feature vector of point p_i within the window; the window size in the present embodiment is 7. The point of zero density gradient is obtained by iterating the following two steps.

Compute the mean shift vector m_{h,G}(x):

$$m_{h,G}(x) = \frac{\displaystyle\sum_{i=1}^{n} x_i\, g\!\left(\left\|\frac{x - x_i}{h}\right\|^2\right)}{\displaystyle\sum_{i=1}^{n} g\!\left(\left\|\frac{x - x_i}{h}\right\|^2\right)} - x \qquad (16)$$

where h is the color resolution and g(x) is the multivariate normal function

$$g(x) = (2\pi)^{-d/2} \exp\left(-\tfrac{1}{2}\|x\|^2\right) \qquad (17)$$

Then translate the kernel function G(x) by m_{h,G}(x), where x is the feature vector at the current window center and m_{h,G}(x) is the difference between the G-weighted mean over the window and the window center. This iterative process necessarily converges, along a smooth trajectory, to a point where the density gradient is zero.
3. After the local maximum points are determined, the feature classes determined by the local structure of the feature space are associated with the maximum points, giving the actual positions of skin color in the space. Combining the skin-color detection result with the motion-based detection result yields the edge contour of the hand.
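A sketch of the mean shift iteration of equations (16) and (17) over Luv feature vectors; the data layout, bandwidth and stopping threshold are illustrative assumptions:

```python
import numpy as np

def mean_shift_mode(x0, X, h=8.0, tol=1e-3, max_iter=50):
    """Climb to a density mode of the Luv samples X (shape n x 3),
    starting from feature vector x0, per eqs. (16)-(17)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        w = np.exp(-0.5 * np.sum(((x - X) / h) ** 2, axis=1))  # kernel g, eq. (17)
        m = (w[:, None] * X).sum(axis=0) / w.sum() - x         # mean shift vector, eq. (16)
        x = x + m                                              # translate the kernel
        if np.linalg.norm(m) < tol:  # density gradient approximately zero
            break
    return x
```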
b) Fine contour detection
As described above, motion detection on the layered images yields a low-resolution region segmentation of the multi-resolution pyramid. The contour computation unit 1043 of the present embodiment then, following the region segmentation result and the contour detection steps below, performs contour computation on the raw image to obtain a fairly accurate object contour. In the present embodiment the edge detection method of S.M. Smith and J.M. Brady is applied, which uses a 5*5 circular window template. The concrete steps are:
1. An edge detection zone is established from the region segmentation result; this zone lies within a few pixels' width of the region edge;
2. The window center is placed at each image point position in the edge detection zone, and the number n(r_0) of points r in the window whose brightness is close to that of the window center r_0 is computed to determine whether the pixel is an image edge point. n(r_0) is computed with formula (18):

$$n(r_0) = \sum_{r} c(r, r_0) \qquad (18)$$

where c(r, r_0) expresses the degree of similarity between the brightness I(r) of a point r in the window and the brightness I(r_0) of the window center r_0:

$$c(r, r_0) = e^{-\left(\frac{I(r) - I(r_0)}{t}\right)^{6}} \qquad (19)$$

where t denotes the brightness threshold. Clearly, when the brightness difference between the two points is small relative to t, c(r, r_0) is close to 1. From n(r_0) the center and direction of the edge can be computed, and the edge is thinned by non-maximum suppression.
The contour computation unit 1043 thus obtains the edge contours of the hand and of the hand-held object.
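The cited method of S.M. Smith and J.M. Brady is the SUSAN detector; the sketch below computes the response of equations (18) and (19) over a 5*5 circular mask, with the brightness threshold as an illustrative value:

```python
import numpy as np

def susan_response(img, t=20.0, radius=2):
    """n(r0) of eqs. (18)-(19) over a circular window template;
    edge points are found where n(r0) is small, followed by
    non-maximum suppression to thin the edge."""
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    mask = (ys ** 2 + xs ** 2) <= radius ** 2    # 5x5 circular template
    img = img.astype(float)
    n = np.zeros_like(img)
    for dy, dx in zip(ys[mask], xs[mask]):
        if dy == 0 and dx == 0:
            continue                              # skip the window center itself
        shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
        n += np.exp(-(((shifted - img) / t) ** 6))  # eq. (19), summed per eq. (18)
    return n
```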
IV. Motion estimation and prediction
As shown in Fig. 3, the single-video-stream visual analysis unit 104 also includes a motion estimation and prediction unit 1044 for tracking the motion trajectory of the contour computed by the contour computation unit 1043. One embodiment of the invention uses the center of the target contour as the focus point, tracks the motion, and obtains a time-discrete motion trajectory. The result of motion tracking is a series of time-stamped plane coordinates of the contour center, that is, four-tuples (x, y, t, i), where i identifies the camera. To improve the time efficiency of the system, the present embodiment uses Kalman filtering for motion prediction; the prediction result serves as input to the motion detection device, providing a prior estimate for the motion detection of the next frame.
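As a sketch, a constant-velocity Kalman predictor for the contour center could look as follows; the state layout and noise levels are illustrative assumptions, not the patent's parameters:

```python
import numpy as np

class CenterPredictor:
    """Kalman filter over the state (x, y, vx, vy) of the contour center."""

    def __init__(self, dt=1.0, q=1e-2, r=1.0):
        self.F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                           [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
        self.Q = q * np.eye(4)
        self.R = r * np.eye(2)
        self.x = np.zeros(4)
        self.P = np.eye(4)

    def predict(self):
        """Prior estimate fed to the next frame's motion detection."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        """Correct with the measured contour center (x, y)."""
        y = np.asarray(z, dtype=float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
```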
Each component and structure of the single-video-stream visual analysis unit 104, and their concrete operation, have thus been described. The output of the single-video-stream visual analysis unit comprises: the resolution-layered image sequence, the contour feature sequence of the image sequence, and the spatial point sequence. The output data structure of the single-video-stream visual analysis unit 104 is shown in Table 4.
Table 4 Description of the single-video-stream analysis output data structure
Camera identifier
Frame sequence identifier
Time stamp
Frame type identifier
Image data attribute list
Point feature attribute list
Region feature data attribute list
Edge feature data attribute list
Raw image data pointer
Intermediate-resolution image data pointer
Low-resolution image data pointer
Feature point data list pointer
Feature structure data table pointer
Moving region data table pointer
Background region data table pointer
Moving region edge data table
The single-video analysis output sequence comprises the camera identifier, frame sequence identifier, time stamp, frame type identifier, image data attribute list, point feature attribute list, region feature data attribute list, edge feature data attribute list, raw image data pointer, intermediate-resolution image data pointer, low-resolution image data pointer, feature point data list pointer, feature structure data table pointer, moving region data table pointer, background region data table pointer, moving region edge data table, etc. The camera identifier distinguishes the different cameras of the system. The frame sequence identifier is the frame number within the camera's stream. The time stamp records the acquisition time of the image frame. The attribute lists of the various features describe information such as the length of the corresponding feature data. The data pointers give the addresses of the image data and the feature storage structures.
● Multi-video-stream visual analysis
The multi-video-stream visual analysis unit 105 of the present embodiment is described below with reference to Fig. 3 and Fig. 4. This unit accepts the output of all four single-video-stream visual analysis units 104 and carries out the following processing:
I. Stereo matching
The four video input devices can be combined pairwise into three input pairs. The multi-video-stream visual analysis unit 105 of the present embodiment comprises a stereo matching unit 1051, which uses two of the three dual-view input pairs to construct dual-view stereo matching and thereby computes the 3D space coordinates of the object: from the plane coordinates (x, y) on the photographic plane, stereo matching yields the depth coordinate (the z coordinate relative to the camera), and the object's 3D space coordinates are then determined from the camera parameters. Depth reconstruction is thus realized by stereo matching of the dual-view images. First, the unit 1051 determines a target point in the non-background region of the image obtained above and defines a template window of size m × n centered on that point. To find the matching point of this target point on the other image, a gray-level matrix of size (m+c) × (n+d) covering the possible matching area of the other image is defined as the search window, and image matching is realized with a block matching algorithm: the template window is moved within the search window, and a similarity matrix of size (c+1) × (d+1) is computed with the match measure. The image block of the search window corresponding to the maximum (or minimum) value in the similarity matrix is the best match of the template window.
The stereo matching unit 1051 of the present embodiment computes the similarity measure with the sum-of-absolute-differences formula (20):

$$\rho(i, j) = \sum_{u=1}^{m} \sum_{v=1}^{n} \left| I(u, v) - I'(u+i, v+j) \right|, \qquad 0 \le i \le c, \quad 0 \le j \le d \qquad (20)$$

where $I(u, v)$ and $I'(u, v)$ are the two view images, m and n are the width and height of the match window, and c+m and d+n are the width and height of the search region. In a concrete implementation, an even faster method can be realized by selecting different thresholds one by one, exploiting the simplicity of the sum-of-absolute-differences measure.
In possible object boundary regions the stereo matching unit 1051 of the present embodiment uses a shifted window: the window position is moved so as to obtain more coverage of the input image, and the best matching position is selected from among the shifted windows. In other regions a normal (centered) window is used.
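A compact sketch of the block matching loop with the SAD measure of formula (20) follows; anchoring the search window at the template position and searching only non-negative offsets (i, j) are simplifying assumptions.

```python
import numpy as np

def sad_block_match(img_a, img_b, top, left, m, n, c, d):
    """Slide the m-wide, n-high template of img_a at (top, left) over the
    (m+c) x (n+d) search window of img_b; return the offset minimizing
    rho(i, j) of formula (20) and the (c+1) x (d+1) similarity matrix."""
    tpl = img_a[top:top + n, left:left + m].astype(np.int32)
    rho = np.empty((c + 1, d + 1), dtype=np.int64)
    for i in range(c + 1):
        for j in range(d + 1):
            patch = img_b[top + j:top + j + n,
                          left + i:left + i + m].astype(np.int32)
            rho[i, j] = np.abs(tpl - patch).sum()
    i, j = np.unravel_index(np.argmin(rho), rho.shape)
    return (i, j), rho
```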
II. SFS & FBR (shape recovery from silhouettes and feature-based 3D reconstruction)
As shown in Fig. 3 and Fig. 4, the multi-video-stream visual analysis unit 105 also comprises an SFS & FBR unit 1052 and an object model memory cell 1056. The SFS & FBR unit 1052 performs shape-from-silhouette recovery and feature-based 3D reconstruction; the object model memory cell 1056 stores the reconstructed object models. The SFS & FBR unit 1052 recovers the object surface shape from the contour calculation results of the single-video-stream visual analysis units 104 and the object models already stored in the object model memory cell 1056, using the shape-from-silhouette method and a feature-based target recognition algorithm. The unit carries out the shape-from-silhouette operation over the spatial volume; the algorithm is as follows:
1. For a spatial point P, compute the corresponding image point $P' = (P'_x, P'_y)$ according to formulas (24), (25):

$$P'_x = \frac{(P - C) \cdot \vec{h}}{(P - C) \cdot \vec{a}} \qquad (24)$$

$$P'_y = \frac{(P - C) \cdot \vec{v}}{(P - C) \cdot \vec{a}} \qquad (25)$$

where

$$\vec{h} = f \cdot \vec{h}' + \vec{a} \cdot x_h \qquad (26)$$

$$\vec{v} = f \cdot \vec{v}' + \vec{a} \cdot y_h \qquad (27)$$

Here P′ is the projection onto the photo coordinate system of the point P in the world coordinate system (X, Y, Z); C is the vector from the world coordinate origin to the projection center; $\vec{a}$ is the unit vector along the camera optical axis; $\vec{h}'$ is the unit vector of the horizontal direction of the photo coordinate system; $\vec{v}'$ is the unit vector of the vertical direction of the photo coordinate system; f is the focal length; and $(x_h, y_h)$ are the image coordinates of the principal point. The coordinate systems, namely the W-XYZ world coordinate system and the c-xy image plane coordinate system, are detailed in Fig. 5.
2. If P′ falls in the background region of the image, the spatial point P is removed; otherwise P is retained;
3. A simple spatial octree algorithm is used to simplify the computation;
4. Carving with the silhouettes of multiple views yields reconstruction results from several different angles.
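Under stated assumptions (pixel-unit image coordinates, a plain voxel list instead of the octree of step 3, `cameras` as (C, a, h, v) vector tuples), the following sketch shows the carving test of steps 1, 2 and 4:

```python
import numpy as np

def carve(voxels, cameras, masks):
    """Keep a spatial point only if its projection per formulas (24)-(25)
    falls inside the foreground silhouette of every view."""
    kept = []
    for P in voxels:
        ok = True
        for (C, a, h, v), mask in zip(cameras, masks):
            denom = np.dot(P - C, a)
            x = np.dot(P - C, h) / denom          # P'_x, formula (24)
            y = np.dot(P - C, v) / denom          # P'_y, formula (25)
            col, row = int(round(x)), int(round(y))
            inside = (0 <= row < mask.shape[0] and
                      0 <= col < mask.shape[1] and mask[row, col])
            if not inside:                        # P' in background: drop P
                ok = False
                break
        if ok:
            kept.append(P)
    return kept
```

An octree implementation would apply the same test to whole cubes, splitting only those whose projections straddle a silhouette boundary, which is what makes step 3 a worthwhile simplification.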
III. Object model building
As shown in Fig. 3 and Fig. 4, the multi-video-stream visual analysis unit 105 also comprises a modelling unit 1053, which, once stereo matching has been obtained, computes the 3D coordinates of targets in space by bundle adjustment. The concrete operation of this unit is as follows:
Suppose a group of points $X_j$ in 3D space is photographed by a group of cameras with matrices $P^i$, and let $x_j^i$ denote the image-plane coordinates of the j-th spatial point in the i-th camera. Given the set of image coordinates $x_j^i$, find camera matrices $P^i$ and spatial points $X_j$ such that

$$P^i X_j = x_j^i \qquad (21)$$

If no further constraints are imposed on $X_j$ or $P^i$, the above reconstruction is a projective reconstruction; that is, $X_j$ differs from the true reconstruction by an arbitrary 3D projective transformation.
Owing to noise, matching errors and other factors, the equations $x_j^i = P^i X_j$ cannot be satisfied exactly. Such errors are usually assumed to follow a Gaussian distribution, and a maximum likelihood solution is then sought. Here one estimates projection matrices $\hat{P}^i$ and spatial points $\hat{X}_j$ that project exactly to image points $\hat{x}_j^i$, i.e.

$$\hat{x}_j^i = \hat{P}^i \hat{X}_j \qquad (22)$$

and minimizes, in every image frame, the image distance between the reprojected points and the measured image points:

$$\min_{\hat{P}^i, \hat{X}_j} \sum_{i,j} d\!\left(\hat{P}^i \hat{X}_j, \; x_j^i\right) \qquad (23)$$

where d(x, y) is the geometric image distance between homogeneous points x and y. The maximum likelihood estimates of $X_j$ and $P^i$ are obtained by adjusting the bundles of rays between each camera center and the spatial points.
The initial values used by the above method are the projection matrix parameters obtained in the camera calibration process, the estimates from the previous frame, and the initial 3D reconstruction estimate. A Euclidean 3D reconstruction is then obtained through the camera parameter constraints.
The 3D model obtained by the above method is represented as a point cloud. The point cloud model is then converted into a geometric model represented by subdivision surfaces. Since this conversion process is well known to those of ordinary skill in the art, it is not repeated here.
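The reprojection-error objective of formula (23) can be sketched as follows; the flat parameterization of the camera matrices and the use of a generic least-squares solver are simplifying assumptions (a production bundle adjuster would parameterize rotations and exploit the sparsity of the Jacobian).

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(params, n_cams, n_pts, obs):
    """Stack the distances d(P_i X_j, x_ij) of formula (23); obs is a
    list of (camera index i, point index j, x, y) observations."""
    P = params[:12 * n_cams].reshape(n_cams, 3, 4)
    X = params[12 * n_cams:].reshape(n_pts, 3)
    out = []
    for i, j, x, y in obs:
        u = P[i] @ np.append(X[j], 1.0)       # project homogeneous X_j
        out += [u[0] / u[2] - x, u[1] / u[2] - y]
    return np.asarray(out)

# x0 packs the calibration-derived camera matrices and the initial
# reconstruction, as the text describes:
# sol = least_squares(residuals, x0, args=(n_cams, n_pts, obs))
```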
The modelling unit 1053 builds the 3D models of objects, including the 3D model of the hand and the 3D models of hand-held objects with simple geometric profiles. A hand-held object with a simple geometric profile can be an elastic steel strip of definite shape; it can also be a sphere, a cuboid, etc. The cross-sectional shapes of these objects serve as the section curves from which motion generates surfaces.
According to operation control commands, the system can also reconstruct the 3D surface shapes of stationary objects in the environment; the reconstruction results are object models that can be edited interactively, and they are stored in the object model memory cell 1056. Table 5 describes the data file format of the 3-D geometric model.
Table 5 Data file format of the 3-D geometric model
Point data structure:
Point number (8 bytes) | X (4 bytes) | Y (4 bytes) | Z (4 bytes)
Edge data structure:
Edge number (8 bytes) | Point count (8 bytes) | Point number (4 bytes) | …
Face data structure:
Face number (8 bytes) | Edge count (8 bytes) | Edge number (8 bytes) | …
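A hypothetical serialization of these records is sketched below; little-endian byte order and the exact field order are assumptions, with the field widths taken from Table 5.

```python
import struct

def pack_point(num, x, y, z):
    # point number (8 bytes) + X, Y, Z (4-byte floats each)
    return struct.pack("<qfff", num, x, y, z)

def pack_edge(num, point_ids):
    # edge number (8) + point count (8) + 4-byte point numbers
    return struct.pack("<qq", num, len(point_ids)) + \
           b"".join(struct.pack("<i", p) for p in point_ids)

def pack_face(num, edge_ids):
    # face number (8) + edge count (8) + 8-byte edge numbers
    return struct.pack("<qq", num, len(edge_ids)) + \
           b"".join(struct.pack("<q", e) for e in edge_ids)
```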
IV. Trajectory fitting
The object velocity and contour obtained by the single-video-stream visual analysis unit 104 are projections of the object's spatial motion onto the camera photographic planes, i.e. trajectory coordinate sequences on the camera image planes, so these plane coordinates are discrete in space and time. As shown in Fig. 3 and Fig. 4, the multi-video-stream visual analysis unit 105 also comprises a trajectory fitting unit 1055 and a trajectory memory cell 1058. The trajectory fitting unit 1055 uses the spatially distributed camera photographic planes and the cameras' relative orientation parameters to estimate a continuous description of the spatial trajectory by space coordinate intersection and curve fitting. The resulting sequence of spatial point coordinates is output and stored in the trajectory memory cell 1058. This unit operates as follows:
A) Spatial point computation based on the projection matrix
Consider a group of cameras $C_k$, k = 1, …, n, where n is the total number of cameras. The absolute position parameters of each camera are $P_i(x, y, z)$ and the absolute orientation parameters are $R_i(\alpha, \beta, \gamma)$. The acquired image-plane coordinates at time t in each camera's image-plane trajectory coordinate sequence are s(x, y, t, i), where x, y are image-plane coordinates, t is the time, and i is the camera number. The camera's external and internal parameters uniquely determine the projection matrix $M_i$:
$$M_i = \begin{bmatrix} m_{11}^i & m_{12}^i & m_{13}^i & m_{14}^i \\ m_{21}^i & m_{22}^i & m_{23}^i & m_{24}^i \\ m_{31}^i & m_{32}^i & m_{33}^i & m_{34}^i \end{bmatrix} \qquad (28)$$
At the same instant, the projection coordinates s(x, y, t, i) of a spatial point P(X, Y, Z) on each image plane satisfy equations (29), (30):

$$(x_i m_{31}^i - m_{11}^i) X + (x_i m_{32}^i - m_{12}^i) Y + (x_i m_{33}^i - m_{13}^i) Z = m_{14}^i - x_i m_{34}^i \qquad (29)$$

$$(y_i m_{31}^i - m_{21}^i) X + (y_i m_{32}^i - m_{22}^i) Y + (y_i m_{33}^i - m_{23}^i) Z = m_{24}^i - y_i m_{34}^i \qquad (30)$$
From the projection matrices and respective image-plane coordinates of the n cameras, 2n such equations can be constructed, and the spatial point coordinates (X, Y, Z) are solved by the least squares method.
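For illustration, the 2n equations (29)-(30) assemble into a linear system solved directly by least squares; the sketch assumes one (x, y) observation per camera at the same instant.

```python
import numpy as np

def triangulate(Ms, pts):
    """Ms: list of 3x4 projection matrices M_i (formula (28));
    pts: matching (x, y) image coordinates, one pair per camera.
    Returns the least-squares spatial point (X, Y, Z)."""
    A, b = [], []
    for M, (x, y) in zip(Ms, pts):
        A.append([x * M[2, 0] - M[0, 0],
                  x * M[2, 1] - M[0, 1],
                  x * M[2, 2] - M[0, 2]])       # row from formula (29)
        b.append(M[0, 3] - x * M[2, 3])
        A.append([y * M[2, 0] - M[1, 0],
                  y * M[2, 1] - M[1, 1],
                  y * M[2, 2] - M[1, 2]])       # row from formula (30)
        b.append(M[1, 3] - y * M[2, 3])
    X, *_ = np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)
    return X
```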
B) Spatial point computation based on triangulation
Given the positioning and orientation parameters of the multiple cameras, the spatial position of the hand can be determined by the triangulation principle and the least squares method. A concrete mode of operation is: project the positions and attitudes of the cameras in 3D space onto three orthogonal coordinate planes, and compute each coordinate component of the spatial point separately on each plane. This method does not require calibration of the cameras' internal parameters.
C) Trajectory fitting
Cubic spline fitting is adopted for the trajectory, and fairing conditions are used to determine the boundary conditions of the spline fit.
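A cubic-spline fit in this spirit can be sketched with SciPy; the smoothing factor standing in for the fairing condition is an illustrative assumption.

```python
import numpy as np
from scipy.interpolate import splprep, splev

def fit_trajectory(points, smooth=1.0, samples_per_point=10):
    """Fit a smoothing cubic spline (k=3) through a time-discrete 3D
    track and resample it densely as the continuous trajectory."""
    pts = np.asarray(points, dtype=float)            # shape (n, 3)
    tck, _ = splprep([pts[:, 0], pts[:, 1], pts[:, 2]], k=3, s=smooth)
    u = np.linspace(0.0, 1.0, samples_per_point * len(pts))
    return np.stack(splev(u, tck), axis=1)
```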
V. Cross-section computation
As shown in Fig. 3 and Fig. 4, the multi-video-stream visual analysis unit 105 also comprises a cross section computing unit 1054 and a cross section profile memory cell 1057. The cross section computing unit 1054 determines, from the continuous trajectory obtained by the trajectory fitting unit 1055, the normal plane of the trajectory at each image frame's position. The projection of the object onto this normal plane is the moving object's cross-section contour at that frame position.
The functions of the key data processing units 1051, 1052, 1053, 1054 and 1055 of the multi-video-stream visual analysis unit 105 have thus been described. The unit 105 also comprises three data memory cells: the object model memory cell 1056, the cross section profile memory cell 1057, and the trajectory memory cell 1058.
The object model memory cell 1056 is a permanent storage unit of the system, storing the 3-D geometric models of moving objects. Here, a permanent storage unit means that its stored content is preserved after a system restart. In this example, the data structure of an object model is as shown in Table 2. All object models built in the system are stored in the object model memory cell 1056, each with a model table data structure. Each model table stores the model number, model identifier, model type number, model type identifier, model attribute table, model parameter table, the number of objects in the model, and the storage table structure of each object. For a rigid object, each object corresponds to one model table. For the hand and for constrained deformable bodies, the unit 1056 stores several model tables covering each such object and its basic deformations. For a simple object, the model table contains only one geometric object; for a complex object, the model table stores each constituent of the complex object as a separate geometric object. The model attribute table describes the basic and extended attributes of the model. The model parameter table stores the model's basic parameters.
The cross section profile memory cell 1057 is a working storage unit of the system, storing the time-discrete cross-section profiles of moving objects along their direction of motion. In the present embodiment, a circular queue of limited size stores the long-time sequence of object contour data tables; for example, the object contour data tables of the several hundred frames preceding the current working time can be kept.
The trajectory memory cell 1058 is a working storage unit of the system, storing the time-discrete spatial trajectories of moving objects. In the present embodiment, a circular queue of limited size stores the long-time sequence of object trajectories; the trajectory data comprise the space coordinates and attitude of the object's geometric center. For example, object trajectory data tables corresponding to the stored object cross-section profiles can be kept.
● Real-time interaction semantic recognition
Real-time interaction semantic recognition is performed by the real-time interaction semantic recognition unit 106, as shown in Fig. 1 and Fig. 6. The real-time interaction semantic recognition unit 106 of the three-dimensional geometric modeling system 100 accepts the output of the multi-video-stream visual analysis unit 105, reads semantic models from the predefined semantic model memory cell 110, analyzes and interprets the motion semantics, and outputs three-dimensional modeling commands. The unit 106 comprises a collision detection unit 1061, which reads the trajectories in the trajectory memory cell 1058 and detects, by a collision detection method, collisions between the three-dimensional geometric design model in the 3-D geometric model memory cell 111 and the object models in the object model memory cell 1056. The collision detection result serves as input to the operational semantics analysis unit 1062 and the interaction semantics analysis unit 1063. The unit 106 also comprises the interaction semantics analysis unit 1063 and the operational semantics analysis unit 1062, which obtain the operational semantics of the moving object with respect to the 3-D geometric model from the collision detection result of unit 1061 and the data stored in the object model memory cell 1056, the trajectory memory cell 1058, and the semantic model memory cell 110. The semantic analysis results are output as three-dimensional modeling commands and stored in the three-dimensional modeling command storage unit 1065.
The real-time interaction semantic recognition unit 106 carries out the following basic operations:
I. Collision detection
The collision detection unit 1061 detects collisions between the three-dimensional geometric design model and the object models by a collision detection method, based on the object models, cross-section profiles and trajectories built by the multi-video-stream visual analysis unit 105. As mentioned above, these are stored respectively in the object model memory cell 1056, the cross section profile memory cell 1057, and the trajectory memory cell 1058. The collision detection unit 1061 provides the context environment for interaction semantics analysis, and at the same time provides visual feedback for the designer's operations on and interactions with the geometric design model.
The collision detection result is provided directly to the semantic analysis units; the concrete analysis process is described in detail below. Real-time collision detection between two objects is realized with the AABB tree algorithm known to those skilled in the art, and is not repeated here.
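The leaf test at the heart of an AABB-tree query is simply per-axis interval overlap, as in the sketch below; the tree construction and recursive descent are omitted.

```python
import numpy as np

def aabb_overlap(min_a, max_a, min_b, max_b):
    """Two axis-aligned boxes intersect iff their extents overlap on
    every axis; min_*/max_* are length-3 corner coordinates."""
    return bool(np.all(np.asarray(min_a) <= np.asarray(max_b)) and
                np.all(np.asarray(min_b) <= np.asarray(max_a)))
```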
II. Operational semantics analysis unit
The operational semantics analysis unit 1062 interprets the semantics of operations, for example the "select" operation semantic and the "deselect" operation semantic. In one embodiment of the invention, when the hand model collides with the three-dimensional design model and the contact lasts a certain time, a "select" operation semantic is recognized; when the hand model is already in contact with the three-dimensional design model and the contact continues while remaining static for a period of time, a "deselect" operation semantic is recognized. The semantic analysis result depends on the current attitude of the object and on the predetermined semantic model. To guarantee operational flexibility, the operation modes comprise several types: operations on the generated model, and operations on the interface menus and toolbars.
A) Operations on interface menus and toolbars
This class of interface operation has two modes: the mouse-keyboard mode and the virtual hand mode. The mouse-keyboard mode is the traditional planar graphical interface mode. The virtual hand mode is similar to prior-art touch-screen operation: the interface menus and toolbars are operated by the motion and clicking of a virtual hand. When the virtual hand moves into the system's graphical interface operating area, the system automatically switches to the interface command operating mode, and the mouse-keyboard and virtual hand modes switch automatically in response to the current input.
B) Operations on the generated model
Commonly used interaction semantics are configured as floating balls in the virtual 3D space. These floating balls are called operation balls, and each ball is defined as one operation. When the virtual hand grasps an operation ball, the virtual hand will begin that operation. According to the operation context, operation balls are automatically hidden or surfaced and their depth in the virtual 3D space is changed, so that the operation ball most likely to be selected sits at the position most easily grasped by the virtual hand.
The virtual space operations, GUI interface operations and voice command operations are switched automatically by the context environment. When the virtual hand moves into the menu command area of the system interface, the system switches to the virtual mouse mode; when the virtual hand moves into the drawing area, it switches to the virtual 3D modeling operation mode.
III. Interaction semantics analysis unit
The interaction semantics analysis unit 1063 determines interaction semantics from the collision detection result, the object model memory cell 1056, the trajectory memory cell 1058 and the semantic model memory cell 110. In the present embodiment, interaction semantics are expressed by static interactive gestures. The unit 1063 analyzes static gestures with an appearance-based recognition method, i.e. gesture semantic analysis by template matching against predefined gesture templates. In the present embodiment, the template-matching semantic analysis is carried out as follows:
First a distance transform is applied to the edge image: the binary image is transformed into a distance map of the same size as the original edge image, in which the new value of each "pixel" is a distance value. The distance transform is defined as:

$$D(p) = \min_{q \in O} d_e(p, q) \qquad (31)$$

where $d_e(p, q)$ denotes the Euclidean distance between pixels p and q, and O is the set of pixels of the target object. The Euclidean distance $d_e(p, q)$ is defined as:
$$d_e(p, q) = \sqrt{(p_x - q_x)^2 + (p_y - q_y)^2} \qquad (32)$$
To reduce the amount of computation, the square-root operation is omitted, i.e. formula (33) is used in place of formula (32):

$$d_e(p, q) = (p_x - q_x)^2 + (p_y - q_y)^2 \qquad (33)$$

After the distance transform described above, the new value of each point in the resulting distance map is the distance from that point to the nearest object pixel in the original image.
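The distance map of formulas (31)-(32) can be illustrated with SciPy's exact Euclidean distance transform (note this keeps the square root that the speed-up of formula (33) drops):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def distance_map(edge_image):
    """Each output pixel holds the distance to the nearest edge pixel.
    distance_transform_edt measures distance to the nearest zero, so
    the binary edge image is inverted first."""
    edges = np.asarray(edge_image, dtype=bool)
    return distance_transform_edt(~edges)
```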
The present embodiment uses the one-directional Hausdorff distance h(M, I) for model matching, where M is the set of edge pixels of the chosen gesture template and I is the set of image edge pixels after edge extraction. During match recognition, the Euclidean distance transform is first applied to the edge-extracted image to be recognized to obtain the distance map; the template is then translation-matched over the distance map in the metric space. Correspondingly, $h_j(M, I)$ (the subscript j is the translation index) is taken as the maximum of the distance-map values at the positions corresponding to the template's edge pixels; it measures the maximum degree of mismatch between the template at the current translation position and the corresponding pixels of the edge image. The decision rule of basic Hausdorff template matching is: take the minimum of the $h_j(M, I)$ values over all translation matches as the measure of similarity between the template and the corresponding object possibly present in the image. If several templates yield very similar similarities for the current image during translation matching, edge direction information is added to the matching decision. Suppose the template is at some translation point q (with the lower-left corner pixel of the template as the reference point); the conditions for deciding whether the template matches the corresponding position of the edge image are:
1. At the j-th translation match, let $R_j$ be the ratio of the number of template points meeting the matching requirement to the total number of template pixels:

$$R_j = \max_{q \in Q} \left\{ \, n\big(\{\, m \in M \mid \exists\, i \in I,\ \|(m+q) - i\| < \tau,\ |Ang(m) - Ang(i)| < \theta \,\}\big) \,/\, n(M) \right\} \qquad (34)$$

In the above formula, Q is the set of points traced by the template's lower-left corner pixel during the translations over the edge image; n(·) is the operation of taking the number of elements of a set; Ang(x) is the direction angle of edge pixel x; τ is a given distance difference threshold; and θ is a given direction (radian) difference threshold.
2. Take P(k) = max_j(R_j), where k = 1, 2, … indexes the templates and j = 1, 2, …, J indexes the translations applied to each template.
3. The gesture indicated by the template with the largest P(k) is the final recognition result.
The present embodiment adopts a modified Hausdorff distance for template matching: instead of the maximum, the similarity of the template with respect to the image to be recognized is obtained by averaging over the template's edge pixels,

$$h(M, I) = \frac{1}{N} \sum_{m \in M} d(m, I) \qquad (35)$$

where d(m, I) is the distance-map value at the position corresponding to template pixel m, and N is the number of edge pixels in the template.
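A sketch of the translation matching with this averaged measure follows, under the assumptions that every shifted template stays inside the distance map and that direction checking is left out:

```python
import numpy as np

def match_template(dist_map, template_pts, positions):
    """For each candidate translation q, average the distance-map values
    under the template's N edge pixels (formula (35)) and return the
    best position. template_pts: (N, 2) row/col offsets; positions:
    candidate reference points q."""
    pts = np.asarray(template_pts)
    best_q, best_h = None, np.inf
    for q in positions:
        rc = pts + np.asarray(q)                  # shift template by q
        h = dist_map[rc[:, 0], rc[:, 1]].mean()   # modified Hausdorff
        if h < best_h:
            best_q, best_h = q, h
    return best_q, best_h
```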
By the method described above, the basic semantics of several interactive gestures can be determined. Figure 15 shows several examples of interaction semantic gestures.
● 3D geometric modeling
As shown in Fig. 7, the three-dimensional geometric modeling system 100 also comprises a 3D geometric modeling unit 107. Based on the motion and gesture recognition obtained from the multi-video-stream visual analysis unit 105, and on the semantic analysis results, i.e. the three-dimensional modeling commands, obtained from the real-time interaction semantic recognition unit 106, this unit builds new three-dimensional geometric design bodies and edits and replaces existing ones by means of a repetition processing unit 1071, a jitter processing unit 1072, an envelope computing unit 1073 and a shape editing unit 1074, reading from and storing to the 3-D geometric model memory cell 111 in real time. The 3D geometric modeling unit 107 comprises the following processing steps:
I. Repetition processing
The repetition processing unit 1071 handles the repetitive operations produced during object motion, eliminating repeated and overlapping motions of the object.
II. Jitter processing
The jitter processing unit 1072 smooths and fairs the motion of the object, eliminating small jitters in the object's trajectory and attitude.
III. Envelope computation
The basic function of the envelope computing unit 1073 is to solve the envelope differential equation from the trajectory and cross-section profile already freed of jitter and repetition, and then to compute, with the Runge-Kutta algorithm, the envelope surface generated by the object's motion. The unit outputs this envelope surface.
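The patent does not give the envelope equation itself, so the sketch below shows only the named integrator: one classical fourth-order Runge-Kutta step for a generic system y′ = f(t, y), with f assumed to encode the envelope differential equation.

```python
def rk4_step(f, t, y, dt):
    """Advance y' = f(t, y) by one RK4 step of size dt; works with
    scalars or numpy arrays returned by f."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, y + dt / 2 * k1)
    k3 = f(t + dt / 2, y + dt / 2 * k2)
    k4 = f(t + dt, y + dt * k3)
    return y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
```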
IV. Shape editing
The shape editing unit 1074 replaces the existing 3D geometric design model with the motion envelope surface produced by the envelope computing unit 1073 and smoothly joins the new and old surface patches. The replacement process handles the joining points and joining surfaces, and their smooth connection, according to constraint and modification rules, thereby modifying the established 3-D geometric model.
● Model drawing and display
A modification of the 3-D geometric model activates the model drawing process, and the drawing result is output to the display device 109.
The technical scheme of the first specific embodiment of the present invention has thus been described. In addition, as shown in Fig. 8, the three-dimensional geometric modeling system 100 of the present embodiment can be built from three general-purpose digital computers: two front-end computers 1201, 1202 and one back-end computer 1203. Every two digital cameras are connected to one front-end computer (1201, 1202); the two front-end computers 1201 and 1202 are connected to the back-end computer 1203; and the back-end computer 1203 is connected to the video output device 109. Each front-end computer 1201, 1202 provides the single-video-stream visual analysis units 104 corresponding to the video input devices 101 connected to it, together with the stereo matching unit 1051 of the multi-video-stream visual analysis unit 105. The back-end computer 1203 provides the remaining components of the multi-video-stream visual analysis unit 105 (other than the stereo matching unit 1051), the real-time interaction semantic recognition unit 106, the 3D geometric modeling unit 107, and the 3D model drawing unit 108. The storage system of computer 1203 provides the semantic model memory cell 110 and the 3-D geometric model memory cell 111. The above components of the present invention can be realized by software, firmware, integrated circuits, and the like.
Second specific embodiment
As shown in Fig. 9, the three-dimensional geometric modeling system 200 according to the second embodiment of the invention also comprises a voice input device 202 and a voice recognition unit 203. The voice input device 202 can be a general microphone and sound card that convert natural speech into digital audio signals. In the present embodiment, the voice input device serves as an auxiliary input device, providing an auxiliary interaction input for the multichannel user interaction mode. After the audio input is obtained, the voice recognition unit 203 performs speech recognition, i.e. converts the audio input into a restricted language. The voice recognition unit 203 recognizes only predefined speech patterns; undefined speech input is discarded. For example, the Microsoft Speech 5.X speech recognition engine can be used to recognize the basic speech. By configuring a restricted grammar in the speech recognition engine from an XML file, speech input that might lead to abnormal system behavior can be restricted, so that only the voice commands provided in the grammar are recognized, effectively improving the recognition rate.
The speech recognized by the voice recognition unit 203 is provided to the real-time interaction semantic recognition unit 206. Besides performing the collision detection, operational semantics analysis and interaction semantics analysis of the first embodiment, the unit 206 also comprises a voice semantic analysis unit 2064 that performs semantic recognition of the speech identified by the voice recognition unit 203. When a recognized speech input is obtained from the voice recognition unit 203, the voice semantic analysis unit 2064 interprets it semantically according to the system's current context environment. An example of a semantic resolution file is shown below, in which the analysis grammar marked with <O> … </O> is an optional grammar. The user can modify the content of this part according to concrete needs, embodying the idea of personalized human-computer interaction.
<GRAMMAR LANGID="804">
  <RULE NAME="WithPara" TOPLEVEL="ACTIVE">
    <P>
      <O>
        <L>
          <P>please</P>
          <P>I want to</P>
        </L>
      </O>
      <L>
        <P PROPNAME="TYPE_RULEREF" VALSTR="CrePoint">create point</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="CreLine">create line</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="CreSurface">create surface</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="Delete">delete</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="Cancel">cancel</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="Done">done</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="Edit">edit</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="SelePoint">select point</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="SeleLine">select line</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="SeleSurface">select surface</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="zin">zoom out</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="zout">zoom in</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="Translation">translate</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="X">X coordinate value</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="Y">Y coordinate value</P>
        <P PROPNAME="TYPE_RULEREF" VALSTR="Z">Z coordinate value</P>
      </L>
      <L>
        <P VALSTR="value">Value</P>
      </L>
    </P>
  </RULE>
</GRAMMAR>
Meanwhile, the operations carried out in the real-time interaction semantic recognition unit 206 are shown in Fig. 10. The unit 206 of the present embodiment comprises a collision detection unit 2061, an operational semantics analysis unit 2062, an interaction semantics analysis unit 2063, a voice semantic analysis unit 2064, and a three-dimensional modeling command storage unit 2065. Its main working steps are as follows:
The collision detection unit 2061 reads the trajectories in the trajectory memory cell 2058 and detects, by a collision detection method, collisions between the three-dimensional geometric design model of the three-dimensional geometric design memory cell 211 and the object models of the object model memory cell 2056. The collision detection result serves as input to the operational semantics analysis unit 2062 and the interaction semantics analysis unit 2063. These two units obtain the operational semantics of the moving object with respect to the 3-D geometric model from the collision detection result of unit 2061 and the data stored in the object model memory cell 2056, the trajectory memory cell 2058 and the semantic model memory cell 210, generate three-dimensional modeling commands, and store them in the three-dimensional modeling command storage unit 2065. The voice semantic analysis unit 2064 obtains the restricted-language recognition result from the voice recognition unit 203, resolves it against the semantic definitions of the semantic model memory cell 210 to obtain the interpretation of the speech semantics, generates three-dimensional modeling commands, and stores them in the three-dimensional modeling command storage unit 2065.
Here, voice operation serves as an auxiliary interaction channel in the system's human-computer interaction process and can be used for command operations and for control operations during drawing.
Except as described above, the parts of the second embodiment of the present invention that are similar to the first embodiment are implemented identically.
Third specific embodiment
Figure 11 shows the structure of a third specific embodiment of the three-dimensional geometric modeling system 300 of the present invention. The video input apparatus 301 shown in Fig. 11 consists of digital cameras; in the present embodiment, six digital cameras. Corresponding to each digital camera there is one single-video-stream visual analysis unit 304, and each camera is directly connected by a cable to a general-purpose computer interface in the common manner and processed by its single-video-stream visual analysis unit 304. The voice input device 302 is a general microphone and sound card that convert natural speech into digital audio signals. Different from the first specific embodiment, this embodiment not only adopts more cameras but also connects them differently. Specifically, cameras C01, C02 and C03 are positioned level with the designer's hand motion, while cameras C04, C05 and C06 are positioned above the designer. Because this configuration increases the number of cameras, it better avoids the occlusion problems caused by the motion process. Another difference is the change in camera connection: as shown in Fig. 13, the six cameras are divided into two groups, and the three cameras of each group are connected into two image input pairs usable for dual-view matching, which are provided to the multi-video-stream visual analysis unit 305 for depth acquisition and stereo matching.
The six cameras are placed respectively at six different positions: in front of the concept designer, to the left front, right front, above, upper left and upper right. Their heights above the ground and their attitudes are arranged to suit the expression of the designer's gestures; that is, the designer's design actions can be fully captured by the cameras without affecting the designer's operation and other activities. A concrete digital camera layout is illustrated by Fig. 12.
In the present embodiment the computer system consists of three general-purpose digital computers, as shown in Fig. 14. Every two cameras are connected to a front-end computer 1901, 1902; the two front-end computers are connected to the back-end computer 1903; and the back-end computer is connected to the audio input device and the display device.
The operation of the present embodiment is essentially the same as that of the first specific embodiment. After the three-dimensional modeling designer switches on the computer system of this device, the system initialization process begins first, and the 3-D geometric models of simple objects are then generated. The difference is that, relative to the multi-video-stream visual analysis of the first specific embodiment, the multi-video-stream visual analysis of the present embodiment accepts the output of all six single-video-stream visual analyses, and its processing handles the video inputs according to the combination shown in Fig. 14.
The implementation of the three-dimensional geometric modeling system and method of the present invention has been described above. Although the description refers to specific embodiments of the present invention, it should be appreciated that various modifications can be made without departing from its spirit. The disclosed embodiments are therefore exemplary rather than restrictive in all respects, and the scope of the present invention is indicated by the appended claims rather than by the foregoing description; all changes falling within the meaning and range of equivalency of the claims are intended to be embraced therein.

Claims (14)

1. A three-dimensional geometric modeling method, comprising:
a video input step: collecting human-hand video streams from a plurality of video input devices (101) distributed around a designer;
a single-video-stream visual analysis step: processing each of the video streams collected in the above video collecting step through its own single-video-stream visual analysis, to detect the moving and non-moving regions in the video stream, estimate the direction and speed of the moving object, predict its next position, calculate the edge contour of the moving object, and estimate contour features;
a multi-video-stream visual analysis step: receiving the results of the above single-video-stream visual analyses, performing binocular stereo matching, performing 3D reconstruction and object trajectory fitting based on the obtained contours and features, calculating the moving object's cross section, and providing the obtained object models, object trajectories and object cross-section profiles to the real-time interaction semantic recognition step;
a real-time interaction semantic recognition step: detecting, from the object models, cross-section profiles and trajectories built by the multi-video-stream visual analysis step, collisions between the 3-D geometric model and the object models, and determining the position and manner of each collision; determining operational semantics, such as the semantic operand and operation type, from the collision detection result according to a predetermined semantic model; and determining interaction semantics from the collision detection result, the stored object models and trajectories, and the semantic models stored in advance in the semantic model memory cell;
a 3D geometric modeling step: processing the object models, object trajectories and object cross-section profiles output by the multi-video-stream visual analysis step together with the output of the real-time interaction semantic recognition step, thereby obtaining the three-dimensional geometric design shape, and storing the result in the 3-D geometric model memory cell;
a 3D model drawing step: drawing the 3-D geometric model stored in real time in the 3-D geometric model memory cell onto the video output device;
a video output step: displaying on the video output device the 3D geometric shape designed by the designer.
2. The three-dimensional geometric modeling method according to claim 1, characterized in that said single-video-stream visual analysis step further comprises:
an image analysis step: processing the video signal collected by the corresponding video input device to obtain characteristic video streams with different resolution scales and different feature elements;
a real-time motion detection step: detecting, from the characteristic video streams obtained by said image analysis step, the moving regions in the video stream and the direction of motion, to obtain an accurate segmentation of the moving regions and the moving region edges;
a contour calculation step: obtaining the hand region by skin-color detection on the moving object region, removing the hand region to obtain the region of the hand-held object, and thereby calculating the edge contours of the regions;
a motion estimation and prediction step: estimating the direction and speed of the moving object and predicting its next position.
3. The three-dimensional geometric modeling method according to claim 1, characterized in that said multi-video-stream visual analysis step comprises:
a stereo matching step: receiving the motion detection results output by the motion detection steps of two of said single-video-stream visual analysis steps, and obtaining the region segmentation and depth information of the moving object through stereo matching and parallax calculation;
an SFS & FBR step: recovering the object surface shape from the contours output by the contour calculation step of said single-video-stream visual analysis step;
a trajectory fitting step: using the spatially distributed camera photographic planes and the cameras' relative orientation parameters to estimate the continuous spatial trajectory by space coordinate intersection and curve fitting;
a cross-section calculation step: calculating the cross-section profile of the moving object from the contour data output by the contour calculation step of said single-video-stream visual analysis step and the continuous trajectory obtained by the above trajectory fitting step;
an object model building step: building the object model from the depth information data indicated by the result of the above stereo matching step.
4. The three-dimensional geometric modeling method according to claim 1, characterized in that said 3D geometric modeling step comprises:
a repetition processing step: handling the repetitive operations produced during object motion, eliminating repeated and overlapping motions of the object;
a jitter processing step: smoothing and fairing the motion of the object, eliminating small jitters in the object's trajectory and attitude;
an envelope calculation step: calculating the envelope surface generated by the object's motion from the trajectory and cross-section profile freed of jitter and repetition;
a shape editing step: modifying the established 3-D geometric model.
5. The three-dimensional geometric modeling method according to claim 1, characterized in that said method further comprises:
a step of inputting voice commands from a speech input device;
a speech recognition step: recognizing the voice commands input from the speech input device according to a predetermined speech model; wherein
said real-time interaction semantic recognition step further comprises a voice semantic analysis step: processing the output of the multi-video-stream visual analysis step and the output of the speech recognition step to obtain human-computer interaction semantics, and providing the obtained human-computer interaction semantic results to the 3D geometric modeling step.
6. The three-dimensional geometric modeling method according to claim 1, characterized in that the number of said video input devices is 4, distributed respectively to the designer's right, right front, left front and left, and placed at positions suited to the expression of the designer's gestures.
7. The three-dimensional geometric modeling method according to claim 1, characterized in that the number of said video input devices is 6, distributed respectively to the designer's front, left front, right front, above, upper left and upper right, and placed at positions suited to the expression of the designer's gestures.
8. A three-dimensional geometric modeling system, comprising:
a plurality of video input devices: distributed around a designer, for collecting the human-hand video streams of the designer's design actions;
a plurality of single-video-stream visual analysis units: one corresponding to each video input device, so that each of the video streams collected by the above video input devices is processed by its own single-video-stream visual analysis unit, to detect the moving and non-moving regions in the video stream, estimate the direction and speed of the moving object, predict its next position, calculate the edge contour of the moving object, and estimate contour features;
a multi-video-stream visual analysis unit: for receiving the results of the above plurality of single-video-stream visual analysis units, performing binocular stereo matching, performing 3D reconstruction and object trajectory fitting based on the obtained contours and features, calculating the moving object's cross section, and providing the obtained object models, object trajectories and object cross-section profiles to the real-time interaction semantic recognition unit;
a real-time interaction semantic recognition unit: for detecting, from the object models, cross-section profiles and trajectories built by the multi-video-stream visual analysis unit, collisions between the 3-D geometric model and the object models, determining the position and manner of each collision; determining operational semantics, such as the semantic operand and operation type, from the collision detection result according to a predetermined semantic model; and determining interaction semantics from the collision detection result, the stored object models and trajectories, and the semantic models stored in advance in the semantic model memory cell;
a 3D geometric modeling unit: for comprehensively processing the output results of the multi-video-stream visual analysis unit and the output of the real-time interaction semantic recognition unit, thereby obtaining the three-dimensional geometric design shape, and storing the result in the 3-D geometric model memory cell;
a 3D model drawing unit: for drawing the 3-D geometric model stored in real time in the 3-D geometric model memory cell onto the video output device;
a video output device: for displaying the 3D geometric shape designed by the designer.
9. The three-dimensional geometric modeling system according to claim 8, characterized in that said single-video-stream visual analysis unit further comprises:
an image analysis unit: for processing the video signal collected by its corresponding video input device to obtain characteristic video streams with different resolution scales and different feature elements;
a real-time motion detection unit: for detecting, from the characteristic video streams obtained by said image analysis unit, the moving regions in the video stream and the direction of motion, to obtain an accurate segmentation of the moving regions and the moving region edges;
a contour computing unit: for obtaining the hand region by skin-color detection on the moving object region, removing the hand region to obtain the region of the hand-held object, and thereby calculating the edge contours of the regions;
a motion estimation and prediction unit: for estimating the direction and speed of the moving object and predicting its next position.
10. The three-dimensional geometric modeling system according to claim 8, characterized in that said multi-video-stream visual analysis unit comprises:
a stereo matching unit: for receiving the motion detection results output by the motion detection units of two of said single-video-stream visual analysis units, and obtaining the region segmentation and depth information of the moving object through stereo matching and parallax calculation;
an SFS & FBR unit: for recovering the object surface shape from the contour calculation results output by the contour computing units of said single-video-stream visual analysis units;
a trajectory fitting unit: for using the spatially distributed camera photographic planes and the cameras' relative orientation parameters to estimate the continuous spatial trajectory by space coordinate intersection and curve fitting;
a cross section computing unit: for calculating the cross-section profile of the moving object from the contour data output by the contour computing units of said single-video-stream visual analysis units and the continuous trajectory obtained by the above trajectory fitting unit;
an object model building unit: for building the object model from the depth information data indicated by the result of the above stereo matching unit.
11. The three-dimensional geometric modeling system according to claim 8, characterized in that said 3D geometric modeling unit comprises:
a repetition processing unit: for handling the repetitive operations produced during object motion, eliminating repeated and overlapping motions of the object;
a jitter processing unit: for smoothing and fairing the motion of the object, eliminating small jitters in the object's trajectory and attitude;
an envelope computing unit: for calculating the envelope surface generated by the object's motion from the trajectory and cross-section profile freed of jitter and repetition;
a shape editing unit: for modifying the established 3-D geometric model.
12. The three-dimensional geometric modeling system according to claim 8, characterized in that said system further comprises:
a speech input device: for inputting voice commands;
a voice recognition unit: for recognizing the voice commands input from the speech input device according to a predetermined speech model; wherein
said real-time interaction semantic recognition unit is configured to process the output of the multi-video-stream visual analysis unit and the output of the voice recognition unit, obtain the human-computer interaction semantics, and provide the obtained human-computer interaction semantic results to the 3D geometric modeling unit.
13. The three-dimensional geometric modeling system according to claim 8, characterized in that the number of said video input devices is 4, distributed respectively to the designer's right, right front, left front and left, and placed at positions suited to the expression of the designer's gestures.
14. The three-dimensional geometric modeling system according to claim 8, characterized in that the number of said video input devices is 6, distributed respectively to the designer's front, left front, right front, above, upper left and upper right, and placed at positions suited to the expression of the designer's gestures.
CN2005100122739A 2005-07-29 2005-07-29 Three-dimensional geometric mode building system and method Expired - Fee Related CN100407798C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2005100122739A CN100407798C (en) 2005-07-29 2005-07-29 Three-dimensional geometric mode building system and method


Publications (2)

Publication Number Publication Date
CN1747559A CN1747559A (en) 2006-03-15
CN100407798C true CN100407798C (en) 2008-07-30

Family

ID=36166855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2005100122739A Expired - Fee Related CN100407798C (en) 2005-07-29 2005-07-29 Three-dimensional geometric mode building system and method

Country Status (1)

Country Link
CN (1) CN100407798C (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI720513B (en) * 2019-06-14 2021-03-01 元智大學 Image enlargement method

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4499693B2 (en) * 2006-05-08 2010-07-07 ソニー株式会社 Image processing apparatus, image processing method, and program
CN101276370B (en) * 2008-01-14 2010-10-13 浙江大学 Three-dimensional human body movement data retrieval method based on key frame
CN101965589A (en) * 2008-03-03 2011-02-02 霍尼韦尔国际公司 Model driven 3d geometric modeling system
CN101795348A (en) * 2010-03-11 2010-08-04 合肥金诺数码科技股份有限公司 Object motion detection method based on image motion
CN101908230B (en) * 2010-07-23 2011-11-23 东南大学 Regional depth edge detection and binocular stereo matching-based three-dimensional reconstruction method
CN101923729B (en) * 2010-08-25 2012-01-25 中国人民解放军信息工程大学 Reconstruction method of three-dimensional shape of lunar surface based on single gray level image
CN102354345A (en) * 2011-10-21 2012-02-15 北京理工大学 Medical image browse device with somatosensory interaction mode
US9396292B2 (en) * 2013-04-30 2016-07-19 Siemens Product Lifecycle Management Software Inc. Curves in a variational system
CN103927784B (en) * 2014-04-17 2017-07-18 中国科学院深圳先进技术研究院 A kind of active 3-D scanning method
CN105323572A (en) * 2014-07-10 2016-02-10 坦亿有限公司 Stereoscopic image processing system, device and method
CN104571510B (en) 2014-12-30 2018-05-04 青岛歌尔声学科技有限公司 A kind of system and method that gesture is inputted in 3D scenes
US10482670B2 (en) 2014-12-30 2019-11-19 Qingdao Goertek Technology Co., Ltd. Method for reproducing object in 3D scene and virtual reality head-mounted device
CN104571511B (en) 2014-12-30 2018-04-27 青岛歌尔声学科技有限公司 The system and method for object are reappeared in a kind of 3D scenes
US9792692B2 (en) * 2015-05-29 2017-10-17 Ncr Corporation Depth-based image element removal
CN105160673A (en) * 2015-08-28 2015-12-16 山东中金融仕文化科技股份有限公司 Object positioning method
CN105404511B (en) * 2015-11-19 2019-03-12 福建天晴数码有限公司 Physical impacts prediction technique and device based on ideal geometry
CN105844692B (en) * 2016-04-27 2019-03-01 北京博瑞空间科技发展有限公司 Three-dimensional reconstruction apparatus, method, system and unmanned plane based on binocular stereo vision
CN107452037B (en) * 2017-08-02 2021-05-14 北京航空航天大学青岛研究院 GPS auxiliary information acceleration-based structure recovery method from movement
CN111448568B (en) * 2017-09-29 2023-11-14 苹果公司 Environment-based application presentation
CN109754457A (en) * 2017-11-02 2019-05-14 韩锋 Reconstruct system, method and the electronic equipment of object threedimensional model
CN107907110B (en) * 2017-11-09 2020-09-01 长江三峡勘测研究院有限公司(武汉) Multi-angle identification method for structural plane occurrence and properties based on unmanned aerial vehicle
CN109993976A (en) * 2017-12-29 2019-07-09 技嘉科技股份有限公司 Traffic accident monitors system and method
CN109783922A (en) * 2018-01-08 2019-05-21 北京航空航天大学 A kind of local product design method, system and its application based on function and environmental factor
CN108777770A (en) * 2018-06-08 2018-11-09 南京思百易信息科技有限公司 A kind of three-dimensional modeling shared system and harvester
CN111010590B (en) * 2018-10-08 2022-05-17 阿里巴巴(中国)有限公司 Video clipping method and device
CN110246212B (en) * 2019-05-05 2023-02-07 上海工程技术大学 Target three-dimensional reconstruction method based on self-supervision learning
CN110660132A (en) * 2019-10-11 2020-01-07 杨再毅 Three-dimensional model construction method and device
CN112215933B (en) * 2020-10-19 2024-04-30 南京大学 Three-dimensional solid geometry drawing system based on pen type interaction and voice interaction
CN114137880B (en) * 2021-11-30 2024-02-02 深蓝汽车科技有限公司 Moving part attitude test system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09180003A (en) * 1995-12-26 1997-07-11 Nec Corp Method and device for modeling three-dimensional shape
CN1263302A (en) * 2000-03-13 2000-08-16 中国科学院软件研究所 Pen and signal based manuscript editing technique
US20030025788A1 (en) * 2001-08-06 2003-02-06 Mitsubishi Electric Research Laboratories, Inc. Hand-held 3D vision system
CN1404016A (en) * 2002-10-18 2003-03-19 清华大学 Establishing method of human face 3D model by fusing multiple-visual angle and multiple-thread 2D information

Also Published As

Publication number Publication date
CN1747559A (en) 2006-03-15

Similar Documents

Publication Publication Date Title
CN100407798C (en) Three-dimensional geometric mode building system and method
US11703951B1 (en) Gesture recognition systems
US11107272B2 (en) Scalable volumetric 3D reconstruction
CN110458939B (en) Indoor scene modeling method based on visual angle generation
CN103778635B (en) For the method and apparatus processing data
CN107292234B (en) Indoor scene layout estimation method based on information edge and multi-modal features
Häne et al. Dense semantic 3d reconstruction
Kim et al. Pedx: Benchmark dataset for metric 3-d pose estimation of pedestrians in complex urban intersections
CN109003325A (en) A kind of method of three-dimensional reconstruction, medium, device and calculate equipment
CN107357427A (en) A kind of gesture identification control method for virtual reality device
CN104637090B (en) A kind of indoor scene modeling method based on single picture
CN107688391A (en) A kind of gesture identification method and device based on monocular vision
CN1509456A (en) Method and system using data-driven model for monocular face tracking
CN114424250A (en) Structural modeling
Bhattacharjee et al. A survey on sketch based content creation: from the desktop to virtual and augmented reality
Jordt et al. Direct model-based tracking of 3d object deformations in depth and color video
CN110751097A (en) Semi-supervised three-dimensional point cloud gesture key point detection method
CN116097316A (en) Object recognition neural network for modeless central prediction
Tao et al. Indoor 3D semantic robot VSLAM based on mask regional convolutional neural network
Xu et al. Robust hand gesture recognition based on RGB-D Data for natural human–computer interaction
Huang et al. Network algorithm real-time depth image 3D human recognition for augmented reality
CN115578460A (en) Robot grabbing method and system based on multi-modal feature extraction and dense prediction
Fadzli et al. VoxAR: 3D modelling editor using real hands gesture for augmented reality
Bhakar et al. A review on classifications of tracking systems in augmented reality
Xu et al. A novel multimedia human-computer interaction (HCI) system based on kinect and depth image understanding

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20080730

Termination date: 20160729