CN109947886A - Image processing method, device, electronic equipment and storage medium - Google Patents

Image processing method, device, electronic equipment and storage medium

Info

Publication number
CN109947886A
CN109947886A
Authority
CN
China
Prior art keywords
frame
characteristic point
key
image
image frame
Prior art date
Legal status
Granted
Application number
CN201910209109.9A
Other languages
Chinese (zh)
Other versions
CN109947886B (en)
Inventor
张润泽
贾佳亚
戴宇荣
沈小勇
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910209109.9A priority Critical patent/CN109947886B/en
Publication of CN109947886A publication Critical patent/CN109947886A/en
Application granted granted Critical
Publication of CN109947886B publication Critical patent/CN109947886B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an image processing method, apparatus, electronic device, and storage medium, belonging to the field of computer technology. The method includes: acquiring multiple image frames; for any image frame, obtaining multiple items of position-difference information corresponding to that frame, each item indicating a position difference between matched feature points of the frame and of its previous key frame; when any one of those items meets the accuracy requirement of scene creation, taking the frame as a key frame; and creating a target virtual scene based on the key frames obtained. Because multiple items of position-difference information are considered when selecting key frames, and a frame is accepted as a key frame as soon as any single item meets the accuracy requirement, enough key frames are collected; this avoids localization failures that would make the created virtual scene inaccurate, so the virtual scene produced by the method has good accuracy.

Description

Image processing method, device, electronic equipment and storage medium
Technical field
The present invention relates to the field of computer technology, and in particular to an image processing method, apparatus, electronic device, and storage medium.
Background art
With the development of computer technology, simultaneous localization and mapping (SLAM) technology is widely used in fields such as autonomous driving, robotics, virtual reality, and augmented reality. For example, in the autonomous-driving field, SLAM can process multiple image frames captured by a vehicle in order to create a scene.
At present, image processing methods generally use an indirect SLAM technique, namely ORB-SLAM (Oriented FAST and Rotated BRIEF - Simultaneous Localization and Mapping), which performs real-time localization and mapping based on ORB features. In this technique, given multiple image frames, an algorithm selects several key frames from them according to certain information about each frame's feature points matched against the previous key frame before that frame, and creates the virtual scene corresponding to the frames from those key frames.
In the above image processing method, the algorithm used to select key frames is rather coarse: the chosen key frames are often insufficient, localization of the image-capturing device or of some landmark points fails, and the accuracy of the created virtual scene is poor.
Summary of the invention
Embodiments of the present invention provide an image processing method, apparatus, electronic device, and storage medium that can solve the problem of poor virtual-scene accuracy in the related art. The technical solution is as follows:
In one aspect, an image processing method is provided, the method comprising:
acquiring multiple image frames;
for any image frame, obtaining multiple items of position-difference information corresponding to the frame, the items indicating position differences between matched feature points of the frame and of its previous key frame;
when any one of the items of position-difference information meets the accuracy requirement of scene creation, taking the frame as a key frame;
creating a target virtual scene based on the multiple key frames obtained, the target virtual scene representing the scene corresponding to the multiple image frames.
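The key-frame rule above — a frame is accepted as soon as any one position-difference item meets the scene-creation accuracy requirement — can be sketched as follows. The criterion names, threshold values, and the reading of "meets the requirement" as "reaches its threshold" are illustrative assumptions, not taken from the patent:

```python
def is_key_frame(position_diffs, thresholds):
    """Accept a frame as a key frame as soon as ANY single
    position-difference criterion reaches its threshold (an OR over
    criteria, which yields more key frames than requiring all of them)."""
    return any(position_diffs[name] >= thresholds[name]
               for name in thresholds)

# The parallax criterion is met even though the mean pixel shift is not.
diffs = {"mean_pixel_shift": 3.0, "parallax_angle": 2.5}
reqs = {"mean_pixel_shift": 5.0, "parallax_angle": 2.0}
print(is_key_frame(diffs, reqs))  # True
```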
In a possible implementation, the method further comprises:
for any feature point of the image frame, obtaining, according to the depth of the matching feature point in a second history image frame, the spatial position of that matching feature point in the camera coordinate system of the second history image frame, the second history image frame being an earlier image frame containing a feature point matched with the feature point of the first image frame;
obtaining a predicted relative direction of the feature point with respect to the camera position of the image frame, according to the spatial position of the matching feature point in the camera coordinate system of the second history image frame, the camera pose of the second history image frame, and the relative camera pose of the image frame with respect to the second history image frame;
obtaining the observed relative direction of the feature point with respect to the camera position of the image frame;
taking the gap between the projections of the predicted relative direction and of the observed relative direction onto the plane perpendicular to the observed relative direction as the error of the feature point.
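The error defined above — the gap between the projections of the predicted and observed bearings onto the plane perpendicular to the observed direction — can be sketched with plain unit vectors. This is a minimal illustration of the geometry, not the patent's exact formulation:

```python
import math


def direction_error(predicted, observed):
    """Gap between predicted and observed bearing of a feature point,
    measured in the plane perpendicular to the observed direction.
    The observed direction projects onto that plane as the origin, so
    the gap is the component of the predicted direction orthogonal to
    the observed one."""
    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    def unit(v):
        n = norm(v)
        return [x / n for x in v]

    p, o = unit(predicted), unit(observed)
    dot = sum(a * b for a, b in zip(p, o))
    perp = [a - dot * b for a, b in zip(p, o)]  # orthogonal component
    return norm(perp)


print(direction_error([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 0.0 (perfect prediction)
```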
In one aspect, an image processing apparatus is provided, the apparatus comprising:
an image acquisition module, configured to acquire multiple image frames;
an information acquisition module, configured to obtain, for any image frame, the multiple items of position-difference information corresponding to the frame, the items indicating position differences between matched feature points of the frame and of its previous key frame;
the image acquisition module being further configured to take the frame as a key frame when any one of the items of position-difference information meets the accuracy requirement of scene creation;
a scene creation module, configured to create a target virtual scene based on the multiple key frames obtained, the target virtual scene representing the scene corresponding to the multiple image frames.
In one aspect, an electronic device is provided, comprising one or more processors and one or more memories, the one or more memories storing at least one instruction that is loaded and executed by the one or more processors to implement the operations performed by the image processing method.
In one aspect, a computer-readable storage medium is provided, storing at least one instruction that is loaded and executed by a processor to implement the operations performed by the image processing method.
In the embodiments of the present invention, when selecting key frames from multiple image frames, the decision of whether to take any image frame as a key frame considers multiple items of position-difference information between that frame's feature points and those matched in its previous key frame, and the frame is taken as a key frame when any single item meets the accuracy requirement of scene creation. Enough key frames are therefore obtained, avoiding the problem that failed localization of the capturing device or of landmark points makes the created virtual scene inaccurate. Thus, the target virtual scene obtained by the image processing method provided by the embodiments of the present invention has good accuracy.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a diagram of an implementation environment of an image processing method provided by an embodiment of the present invention;
Fig. 2 is a flowchart of an image processing method provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of an image frame provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of feature extraction on an image frame provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of a feature matching process provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of a vehicle's motion trajectory and a sparse point-cloud map provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of an image processing flow provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of an application scenario of an image processing method provided by an embodiment of the present invention;
Fig. 9 is a structural diagram of an image processing apparatus provided by an embodiment of the present invention;
Fig. 10 is a structural diagram of a terminal provided by an embodiment of the present invention;
Fig. 11 is a structural diagram of a server provided by an embodiment of the present invention.
Detailed description of embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Fig. 1 shows an implementation environment of an image processing method provided by an embodiment of the present invention. Two kinds of implementation environment are possible. In one possible implementation, referring to Fig. 1, the environment includes an electronic device 101 and an electronic device 102, which can be connected by a data cable or a wireless network for data exchange. Electronic device 101 has an image acquisition function and is used to capture image frames; electronic device 102 has an image processing function and processes the image frames captured by electronic device 101 to create the target virtual scene corresponding to the frames.
The target virtual scene may include one or more kinds of data, which can be configured by those skilled in the art according to demand; the embodiment of the present invention does not limit which kind or kinds of data the target virtual scene includes. In one possible embodiment, electronic device 102 creates, based on the captured image frames, the target virtual map corresponding to the frames. In another possible embodiment, electronic device 102 localizes electronic device 101 based on the captured image frames and obtains the target motion trajectory of electronic device 101. In yet another possible embodiment, electronic device 102 both creates the target virtual map and obtains the target motion trajectory.
In a specific embodiment, electronic device 101 may be installed on the target to be localized. For example, when a vehicle is to be localized, electronic device 101 can be installed on the vehicle, so that the position of electronic device 101 is the position of the target; electronic device 102 can then localize the target by localizing electronic device 101. For example, in the autonomous-driving field, the electronic device 101 installed on the vehicle captures image frames and sends them to electronic device 102, which processes them to create a virtual map of the places the vehicle has driven through, to localize the vehicle, or to do both.
In another possible implementation, referring to Fig. 1, the environment includes only electronic device 102; that is, electronic device 101 and electronic device 102 are the same electronic device, which has both the image acquisition function and the image processing function. Electronic device 102 can capture image frames and process them to localize itself, to create a virtual map of its surroundings, or both. For example, in a specific example, the device may be a robot that captures image frames while moving and processes them to create a map of its surroundings, to localize itself, or both.
It should be noted that the electronic device may be a terminal or a server; the embodiment of the present invention does not limit this.
Fig. 2 is a flowchart of an image processing method provided by an embodiment of the present invention. The method is applied to an electronic device, which may be the electronic device 102 shown in Fig. 1. Referring to Fig. 2, the method may include the following steps:
201. The electronic device acquires multiple image frames.
In the embodiment of the present invention, the electronic device has an image processing function: it processes the captured image frames and creates the target virtual scene corresponding to them. Specifically, it can localize the device that captured the frames and obtain that device's target motion trajectory, create the target virtual map corresponding to the frames, or do both. The target virtual scene can be configured by those skilled in the art according to demand; the embodiment of the present invention does not limit this.
In one possible implementation, the device that captures the image frames sends them to the electronic device. That is, an image capture device collects the frames and transmits them to the electronic device. The capture device may instead shoot a video and send it, in which case the electronic device processes the received video to obtain the image frames, for example by cutting the video into frames or by directly extracting the frames it contains; the embodiment of the present invention does not limit this.
In another possible implementation, the electronic device itself has an image acquisition function and can capture the image frames, or shoot a video and process it to obtain the frames.
In either implementation, the image frames may be captured in real time or captured and saved in advance, and may be obtained from a video shot in real time or recorded in advance. They may also be obtained in other ways, for example from an image database; the embodiment of the present invention does not limit how the image frames are obtained.
In a specific example, the image processing method is applied to the autonomous-driving field: the image frames are captured in real time by an image capture device and sent to the electronic device. The capture device and the electronic device may both be installed on the vehicle, or only the capture device may be on the vehicle while the electronic device is any external device that communicates with the onboard equipment to control driving. For example, each time the capture device collects an image frame it sends the frame to the electronic device, which processes it to localize the vehicle, to create a virtual map around the driving path, or both, so that path planning can subsequently be performed based on the virtual map and the localization.
202. The electronic device performs feature extraction on the image frames to obtain the feature points of each frame.
After obtaining the image frames, the electronic device processes them. For each frame it performs feature extraction to obtain the frame's feature points, so that it can subsequently analyze the spatial position of each feature point, the distribution of landmark points in the surrounding scene, or the position of the device that captured the frame.
In one possible implementation, the electronic device extracts the ORB (Oriented FAST and Rotated BRIEF) features of each frame. An ORB feature consists of a key point and a descriptor: the key point is an "Oriented FAST", an improved FAST corner, and the descriptor is a "Rotated BRIEF", an improved BRIEF descriptor. ORB is an efficient visual feature descriptor that encodes direction and rotation information. It is scale-invariant and rotation-invariant, performs well under translation, rotation, and scaling, and is fast to extract and match; it therefore effectively improves the speed, accuracy, and efficiency of image-frame processing, so the resulting target virtual scene is accurate and efficiently obtained.
Specifically, step 202 can be realized by the following step one and step two:
Step one: the electronic device extracts the corners of each image frame.
A corner is a place where the local pixel values change noticeably. For any pixel of an image frame, when the difference between the pixel's value and the values of the pixels in a target region around it is greater than a threshold, the electronic device takes the pixel as a candidate corner. The electronic device then computes the Harris response of each candidate corner and keeps the candidates with the largest responses, up to a target number, as the corners of the frame. After this preliminary corner extraction, the electronic device builds an image pyramid according to a scale factor and a number of pyramid levels and runs corner detection on the scaled image of each level (the scales of the levels differ), obtaining the corners of each frame across scales. It then obtains the direction information of each corner, which can be done by the moments method: for any corner, taking the corner as the starting point, the vector from the corner to the intensity centroid of the image block whose geometric center is the corner gives the corner's direction.
The target region can be preset by those skilled in the art, for example as a circle centered on the pixel with a preset radius; the embodiment of the present invention does not limit this. The above is only an exemplary description — the corner extraction can be realized by any corner detection algorithm, and the embodiment of the present invention does not limit this.
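The moments-based direction computation described above — the angle of the vector from the image block's geometric center to its intensity centroid — can be sketched as follows (a minimal illustration on a square patch; real Oriented FAST uses a circular patch):

```python
import math


def intensity_centroid_angle(patch):
    """Orientation of a corner via the intensity-centroid (moments)
    method: the angle of the vector from the patch's geometric center
    to its intensity centroid, computed from the first-order moments."""
    h, w = len(patch), len(patch[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    m10 = m01 = 0.0
    for y, row in enumerate(patch):
        for x, v in enumerate(row):
            m10 += (x - cx) * v   # moment along x
            m01 += (y - cy) * v   # moment along y
    return math.atan2(m01, m10)


# Bright pixels concentrated on the right edge -> orientation 0 rad.
patch = [[0, 0, 255],
         [0, 0, 255],
         [0, 0, 255]]
print(round(intensity_centroid_angle(patch), 3))  # 0.0
```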
Step two: the electronic device describes the target image region of each extracted corner.
For any corner, the electronic device obtains the corner's target image region on the image at the corner's scale, selects pixel pairs from that region, rotates the selected pairs according to the corner's direction information to obtain rotated pixel pairs, and derives the corner's description information from the relative magnitudes of the pixel values of the two pixels of each pair.
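The rotate-then-compare step can be sketched as follows. The sampling pattern is a toy one (real Rotated BRIEF uses 256 learned pairs), and the clamping at the patch border is an assumption made for brevity:

```python
import math


def rbrief_descriptor(patch, pairs, angle):
    """Rotated-BRIEF sketch: rotate each sampling offset pair by the
    corner's orientation, then emit one bit per pair by comparing the
    two sampled pixel intensities."""
    c, s = math.cos(angle), math.sin(angle)
    h, w = len(patch), len(patch[0])
    cy, cx = h // 2, w // 2

    def sample(dx, dy):
        # Rotate the offset around the patch center, clamp to the patch.
        rx = int(round(cx + c * dx - s * dy))
        ry = int(round(cy + s * dx + c * dy))
        return patch[min(max(ry, 0), h - 1)][min(max(rx, 0), w - 1)]

    bits = 0
    for (dx1, dy1), (dx2, dy2) in pairs:
        bits = (bits << 1) | (1 if sample(dx1, dy1) < sample(dx2, dy2) else 0)
    return bits


patch = [[0, 0, 0],
         [0, 0, 0],
         [9, 9, 9]]
pairs = [((0, -1), (0, 1))]          # compare the pixel above vs. below
print(rbrief_descriptor(patch, pairs, 0.0))  # 1 (top is darker than bottom)
```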
For example, as shown in Fig. 3, after the electronic device obtains the image frame of Fig. 3 it can perform feature extraction on the frame and obtain the feature points shown in Fig. 4. In Fig. 4, each circle represents a feature point: the center of the circle is the feature point's position in the frame, the size of the circle represents the feature's scale, and the line segment in the circle indicates the feature's direction.
Step one and step two above describe the ORB feature extraction process and are only an exemplary description of step 202. In a possible implementation, the electronic device can also obtain other features of the image frames; the embodiment of the present invention does not limit this.
In one possible implementation, the processing applied to the image frames can differ with the type of camera that captured them. Camera types include, for example, monocular cameras, binocular (stereo) cameras, and RGBD (Red Green Blue Depth) cameras; other types are possible and are not enumerated here. After step 202, when the camera is a binocular camera, the electronic device can additionally use a separate thread to match the feature points of the two image frames captured by the first camera and the second camera of the binocular pair. From the frames of the two cameras, the depth of each matched feature point can be obtained, and the spatial position of the feature point can then be determined from that depth.
The feature extraction in step 202 can be executed by a first thread and this binocular ranging by a second, different thread. By assigning different steps to different threads, the electronic device can execute them in parallel, accelerating image processing and improving image processing efficiency.
For example, the binocular camera can be a camera pair with calibrated intrinsic parameters, the first camera being the left camera and the second camera the right camera. The binocular ranging can then proceed as follows: obtain, in the second image frame captured by the second camera, the baseline for the feature points of the first image frame captured by the first camera; match the feature points along the baseline in order; and obtain the depth of each matched feature point.
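Once a feature has been matched along the baseline, its depth follows from the calibrated stereo geometry via z = f · B / disparity. A minimal sketch (the focal length and baseline values are made-up examples):

```python
def stereo_depth(x_left, x_right, focal_px, baseline_m):
    """Depth of a feature matched along the epipolar baseline of a
    calibrated, rectified stereo pair: z = f * B / disparity, where the
    disparity is the horizontal shift between the two image columns."""
    disparity = x_left - x_right
    if disparity <= 0:
        return None  # no valid match, or point effectively at infinity
    return focal_px * baseline_m / disparity


# 20 px disparity with a 700 px focal length and 12 cm baseline:
print(stereo_depth(320.0, 300.0, 700.0, 0.12))  # 4.2 (metres)
```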
203. The electronic device matches the feature points of each image frame against those of the frame's previous image frame.
In the embodiment of the present invention, the image frames can be consecutive frames: as the camera pose changes, the content of the captured frames changes, so the positions of feature points can differ between frames. After obtaining the feature points of each frame, the electronic device matches the feature points of adjacent frames; from the position changes of matched feature points it can determine how the camera pose has changed, thereby realizing localization of the device that captured the frames. In addition, from a feature point's positions in different frames, the point's spatial position can be determined.
The camera pose indicates the position and attitude of the camera that captured the frames. In one possible implementation, the camera pose is represented by a six-dimensional vector, or equivalently by a translation matrix and a rotation matrix: the translation matrix indicates the camera position in the original coordinate system, and the rotation matrix indicates the rotation angle of the camera's current attitude in the camera coordinate system.
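The translation-plus-rotation representation described above can be sketched as a homogeneous transform. For brevity this restricts the rotation to a single yaw angle, which is an illustrative simplification of the full six-degree-of-freedom pose:

```python
import math


def pose_matrix(yaw, t):
    """A minimal camera pose: rotation about the vertical axis plus a
    translation t = [tx, ty, tz], packed as a 4x4 homogeneous transform."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, t[0]],
            [s, c, 0.0, t[1]],
            [0.0, 0.0, 1.0, t[2]],
            [0.0, 0.0, 0.0, 1.0]]


def transform(T, p):
    """Apply the pose T to a 3D point p (rotate, then translate)."""
    x, y, z = p
    return [T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3)]


# With zero rotation, the pose just translates the point.
print(transform(pose_matrix(0.0, [1.0, 2.0, 3.0]), [0.0, 0.0, 0.0]))
# [1.0, 2.0, 3.0]
```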
In one possible implementation, when matching the feature points of adjacent frames, the electronic device first predicts the camera pose of the current frame from history image frames, and uses the predicted pose to guide the matching between the current frame's feature points and those of the previous frame, which speeds up feature matching. Specifically, the matching can be realized by the following step one and step two:
Step one: for each image frame, the electronic device predicts the frame's camera pose from the camera poses of the first history image frames before it.
Step two: according to the predicted camera pose, the electronic device matches the frame's feature points against those of the frame's previous image frame, obtaining the spatial positions of the frame's feature points.
In step one and step two, the first history image frames can be one or more frames preceding each frame. In a possible embodiment, after step two the electronic device can repeat the two steps based on the matching result: it optimizes the camera pose using the spatial positions of the feature points obtained in step two, then uses the optimized pose to guide matching again, obtaining accurate camera poses and feature-point spatial positions, until a stopping condition is met. In the related art, the camera pose of each frame is usually guessed directly from the frame's previous and next image frames, which has poor accuracy. In the embodiment of the present invention, predicting the camera pose and using it to guide the feature matching improves the matching speed and thus the efficiency of the whole image processing flow.
In one possible implementation, the prediction is realized by a Kalman filter. Specifically, the electronic device processes the camera poses of the first history image frames before each frame with a Kalman filter to obtain the camera pose of that frame. Based on the frame's predicted camera pose, the previous frame's camera pose, and the spatial positions of the previous frame's feature points, the electronic device predicts the spatial positions of the frame's feature points and matches against them; the matching can determine the similarity between feature points to decide whether they match, which is not elaborated here.
For example, the feature matching can proceed as shown in Fig. 5. For any image frame (the current frame), the electronic device processes the first history image frames before it with a Kalman filter to obtain a predicted value and a prediction covariance. It uses the predicted value to guide feature matching, then optimizes the current frame's camera pose based on the matching result, updates the Kalman filter's parameter values and prediction covariance from the optimized pose, and repeats these steps until the prediction covariance is minimal or converges. This is only an exemplary description; the embodiment of the present invention does not limit the feature matching process.
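The predict-then-correct cycle of the Kalman filter described above can be sketched in one dimension: predict the next position from a known velocity, then correct the prediction with a noisy measurement. The scalar model and the noise values q and r are illustrative assumptions, standing in for the full camera-pose state:

```python
def kalman_step(x, p, z, v=1.0, q=1e-3, r=1e-2, dt=1.0):
    """One predict/correct cycle of a scalar Kalman filter with state x
    (position) and variance p, known velocity v, process noise q, and
    measurement noise r; z is the noisy position measurement."""
    # Predict: advance the state; uncertainty grows by the process noise.
    x_pred = x + v * dt
    p_pred = p + q
    # Correct: blend prediction and measurement by the Kalman gain.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new


x, p = 0.0, 1.0
for z in [1.05, 2.02, 2.97]:  # noisy positions from motion at speed ~1
    x, p = kalman_step(x, p, z)
print(round(x, 2), p < 0.1)  # ~3.01 True: the estimate tracks the measurements
```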
In a specific embodiment, in the above steps 202 and 203, the electronic device may use different threads to perform feature extraction on the image frames and to match the extracted feature points, respectively. That is, the electronic device may execute steps 202 and 203 on separate threads, so that while handling multiple image frames, steps 202 and 203 can execute in parallel, saving image processing time and improving image processing speed and efficiency.
For example, the electronic device may execute the above step 202 on a first thread and step 203 on a third thread, the first thread and the third thread being different. In one possible implementation of step 203, the binocular ranging step may be implemented on a separate thread (for example, a second thread). It should be noted that the first thread, the second thread, and the third thread are all different; that is, step 202, step 203, and the binocular ranging step may each be implemented on a different thread, so that the three steps can execute in parallel.
In the related technology, these three steps are usually implemented on the same thread, and the time they take is very long, directly slowing the entire image processing flow. Testing shows that, by separating these three steps, the embodiment of the present invention can roughly double the overall speed of the image processing flow.
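The three-thread arrangement can be sketched as a queue-based pipeline. The stage functions here are placeholders standing in for feature extraction, matching, and binocular ranging; only the threading structure reflects the description above.

```python
import queue
import threading

def pipeline(frames, extract, match, stereo):
    """Run three stages on three threads, each consuming the previous
    stage's output through a queue, so all three work in parallel."""
    q1, q2, out = queue.Queue(), queue.Queue(), []

    def stage_extract():
        for f in frames:
            q1.put(extract(f))
        q1.put(None)                      # sentinel: no more frames

    def stage_match():
        while (item := q1.get()) is not None:
            q2.put(match(item))
        q2.put(None)

    def stage_stereo():
        while (item := q2.get()) is not None:
            out.append(stereo(item))

    threads = [threading.Thread(target=t)
               for t in (stage_extract, stage_match, stage_stereo)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Because each stage only blocks on its input queue, frame N can be matched while frame N+1 is still being extracted, which is where the speedup over a single-threaded loop comes from.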
Separating the three steps also makes it easy to extend the method with other steps. When the image processing method is applied to different scenes, it may further include other steps, and those steps can likewise be processed in parallel with the three steps. For example, the image processing method can be applied in the virtual reality (Virtual Reality, VR) field and the augmented reality (Augmented Reality, AR) field, and can implement different functions together with an inertial measurement unit (Inertial Measurement Unit, IMU). For instance, the camera pose can be located in real time by the image processing method, and three-dimensional reconstruction of images can be performed according to the camera pose located in real time, providing rapid modeling for the game and film industries, and providing low-cost three-dimensional models for consumers and for three-dimensional (3 Dimensions, 3D) printing, etc. The above is only one possible example of an application scenario and does not limit the application scenarios of the image processing method. In such application scenarios, the IMU integral obtaining step can execute in parallel with the above three steps, improving the speed of the entire processing flow.
It should be noted that steps 202 to 203 above are the process in which the electronic device processes the corresponding multiple image frames to obtain the camera pose of each image frame and the spatial positions of the feature points. The camera pose of each image frame is predicted from the camera poses of history image frames and may be inaccurate, so after step 203 the electronic device may further optimize the predicted camera pose of each image frame to obtain a more accurate camera pose.
Specifically, the optimization process may be as follows: the electronic device obtains the sum of the errors of all feature points of each image frame, where the error of each feature point indicates the gap between the spatial position of that feature point and the spatial position predicted based on the second history image frame, the spatial position of each feature point being determined based on the depth of that feature point. Based on the sum of the errors of all feature points of each image frame, the electronic device adjusts the camera pose of that image frame until the sum of the errors meets a goal condition, at which point the adjustment stops and the optimized camera pose of each image frame is obtained.
The second history image frame may be an image frame before each image frame. In one possible implementation, the second history image frame may be the first image frame, among the multiple image frames matched with any feature point, in which that feature point appears. That is, after the electronic device has processed multiple image frames, when it processes some image frame and obtains the error of some feature point in that image frame, the electronic device may take the image frame from which the feature point was first extracted as the second history image frame.
Specifically, the electronic device may first obtain the error of each feature point of each image frame, and then sum them to obtain the sum of the errors of all feature points of that image frame. Taking the sum of the errors as the optimization objective, the camera pose of the image frame is optimized. The goal condition for the sum of the errors may be configured as required by relevant technical personnel; for example, it may be that the sum of the errors reaches a minimum value or that the sum of the errors converges, etc. The embodiment of the present invention does not limit this.
For the error of each feature point of each image frame, the electronic device can obtain the error through the following steps 1 to 4:
Step 1: For any feature point of any image frame, the electronic device obtains the spatial position of the matching feature point in the camera coordinate system of the second history image frame, according to the depth of the matching feature point of that feature point in the second history image frame.
In step 1, the spatial position of the feature point can be represented by the depth of the feature point. For any feature point, the electronic device can obtain the spatial position of the matching feature point in the camera coordinate system of the second history image frame according to the camera parameters and the depth of the matching feature point in the second history image frame, so that the spatial position of the feature point in the current image frame (that is, the image frame in question) can subsequently be predicted according to the relative relationship between the two image frames.
For example, suppose the image frame in question is image frame j, feature point r is a feature point in image frame j, the second history image frame corresponding to feature point r is image frame i, and the matching feature point of feature point r in the second history image frame is feature point q. The matching feature point q can be normalized according to the camera intrinsic matrix of the camera model, denoted π⁻¹(q); for a pinhole camera model, π⁻¹(q) = K⁻¹q̃, where K is the camera intrinsic matrix and q̃ is the homogeneous pixel coordinate of q. Assuming the depth of the matching feature point q is d, where d is greater than or equal to 0, the spatial position of the matching feature point q can then be d·π⁻¹(q).
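The depth-based back-projection above can be written out directly for a pinhole model. The intrinsic matrix values below are assumed example numbers, not anything specified by the method.

```python
import numpy as np

def back_project(pixel, depth, K):
    """Back-project a pixel with known depth into the camera coordinate
    system with a pinhole model: X = depth * K^{-1} * [u, v, 1]^T,
    i.e. depth times the normalized coordinate pi^{-1}(q)."""
    u, v = pixel
    q_h = np.array([u, v, 1.0])           # homogeneous pixel coordinate
    return depth * (np.linalg.inv(K) @ q_h)

# Example with an assumed intrinsic matrix (focal 500, principal point 320x240)
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
X = back_project((320.0, 240.0), 2.0, K)  # principal point -> on the optical axis
```

A pixel at the principal point maps onto the optical axis at the given depth, which is a quick sanity check for the intrinsics.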
In the related technology, the optimization process generally uses the re-projection error as the optimization objective function. The re-projection error is mainly suitable for the pinhole camera model, and the spatial position of a feature point usually needs to be represented by three parameters, that is, three-dimensional coordinates (X, Y and Z coordinates). In the embodiment of the present invention, depth is used to represent the spatial position, which can reduce memory consumption, improve optimization speed, and is also applicable to a variety of cameras, improving the adaptability of the optimization process.
Step 2: The electronic device obtains the predicted relative direction of the feature point relative to the camera position of the image frame, according to the spatial position of the matching feature point in the camera coordinate system of the second history image frame, the camera pose of the second history image frame, and the relative camera pose of the image frame relative to the second history image frame.
When comparing the predicted quantity with the actual quantity, the relative direction of the feature point with respect to the camera position can be used to measure the error of the spatial position of the feature point. In step 2, the electronic device can transform the above matching feature point from the second history image frame into the image frame according to the relative relationship between the two image frames, obtaining the predicted quantity of the feature point in the image frame.
Specifically, the electronic device can obtain the predicted spatial position of the feature point in the camera coordinate system of the image frame according to the spatial position of the matching feature point in the camera coordinate system of the second history image frame, the camera pose of the second history image frame, and the relative camera pose of the image frame relative to the second history image frame. Based on this predicted spatial position, the electronic device can then obtain the predicted relative direction of the feature point relative to the camera position of the image frame.
For example, suppose again that the image frame is image frame j, feature point r is a feature point in image frame j, the second history image frame corresponding to feature point r is image frame i, and the matching feature point of feature point r in the second history image frame is feature point q. The camera pose of image frame j is Pj, and the relative camera pose of image frame i relative to image frame j is Pi. In step 1 above, the spatial position of the matching feature point q was obtained as d·π⁻¹(q); transforming this position with Pj and Pi gives the coordinate of the matching feature point q in the camera coordinate system of image frame j, which is the predicted spatial position of feature point r in the camera coordinate system of image frame j. The electronic device can determine the predicted relative direction of feature point r relative to the camera position according to this coordinate; specifically, the coordinate can be normalized to obtain the direction p. The camera position here is the origin of the camera coordinate system of image frame j. The direction p is the ray direction from the camera center of image frame j to feature point r, that is, the predicted relative direction of feature point r relative to the camera position of image frame j.
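The transform-and-normalize operation in step 2 can be sketched as follows. The rotation/translation pose convention (`X_j = R @ X_i + t`) is an assumption made for illustration; the method itself only requires some relative pose between the two frames.

```python
import numpy as np

def predicted_relative_direction(X_i, R_ji, t_ji):
    """Transform a point from frame i's camera coordinates into frame j's
    (assumed convention X_j = R_ji @ X_i + t_ji), then normalize to a unit
    ray from frame j's camera center: the predicted relative direction p."""
    X_j = R_ji @ X_i + t_ji
    return X_j / np.linalg.norm(X_j)
```

Normalizing discards the (possibly uncertain) predicted depth and keeps only the viewing direction, which is what the error in step 4 compares.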
Step 3: The electronic device obtains the relative direction of the feature point relative to the camera position of the image frame.
Through steps 1 and 2 above, the electronic device obtains the predicted relative direction of the feature point in the image frame from the spatial position of the matching feature point and the relative relationship between the two image frames. The electronic device can then also obtain the true relative direction of the feature point in the image frame, so as to compare whether the currently determined spatial position of the feature point is accurate and whether it is consistent with the camera pose change between the two image frames.
Through the above feature matching process, after the electronic device matches the image frame, the spatial position of the feature point can be obtained, and the electronic device can obtain the relative direction of the feature point relative to the camera position of the image frame according to that spatial position.
For example, with image frame j, feature point r, second history image frame i, and matching feature point q as above, normalizing the coordinate of feature point r gives the relative direction p̄ of feature point r relative to the camera center of image frame j.
It should be noted that steps 1 and 2 above are the process of obtaining the predicted relative direction, while step 3 is the process of obtaining the relative direction. The two processes may be performed simultaneously, or the predicted relative direction may be obtained first and then the relative direction, or the relative direction first and then the predicted relative direction. That is, taking steps 1 and 2 together as a combined step, the electronic device may perform the combined step and step 3 simultaneously, perform the combined step first and then step 3, or perform step 3 first and then the combined step. The embodiment of the present invention does not limit the execution order of the combined step and step 3.
Step 4: The electronic device projects the predicted relative direction and the relative direction respectively onto a plane perpendicular to the relative direction, and takes the gap between the projected positions as the error of the feature point.
After obtaining the predicted relative direction and the relative direction, the electronic device can compare the predicted quantity with the actual quantity to obtain the error of the feature point. Specifically, when obtaining the error, the two directions can be projected onto the same plane, and the gap between the two projected positions is taken as the error. In one possible implementation, the plane may be the plane perpendicular to the relative direction; since this plane can be regarded as a tangent plane of a sphere, the error may be referred to as a spherical error.
For example, the electronic device can project the predicted relative direction p obtained in step 2 and the relative direction p̄ obtained in step 3 onto the plane perpendicular to the relative direction p̄, and take the difference of the projected positions as the error of feature point r.
Steps 1 to 4 above provide one way of obtaining the error of a feature point. Using depth to represent the spatial position can reduce memory consumption, improve optimization speed, and is applicable to a variety of cameras, improving the adaptability of the optimization process. Of course, the error of the feature point could also be the re-projection error or another error; the embodiment of the present invention does not limit this.
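The tangent-plane projection in step 4 reduces to removing the component of the predicted direction along the observed direction. This is a sketch of the spherical error under the assumption that both inputs are direction vectors.

```python
import numpy as np

def spherical_error(p_pred, p_obs):
    """Project the predicted direction onto the plane perpendicular to the
    observed direction (the tangent plane of the unit sphere at p_obs) and
    measure the residual there; p_obs itself projects to zero."""
    p_pred = p_pred / np.linalg.norm(p_pred)
    p_obs = p_obs / np.linalg.norm(p_obs)
    residual = p_pred - (p_pred @ p_obs) * p_obs   # remove the p_obs component
    return np.linalg.norm(residual)
```

When prediction and observation agree the residual is zero; for orthogonal directions the full unit component remains, so the error is bounded and well behaved for any camera model that yields viewing rays.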
204: For any image frame, the electronic device obtains multiple items of position difference information corresponding to the image frame, the multiple items of position difference information indicating the position differences of the matching feature points of the image frame and the previous key frame of the image frame.
Through the above steps, the electronic device obtains multiple image frames and processes them, obtaining the image data of each image frame; for example, the image data may include the spatial positions of the feature points and the camera pose of the image frame, so that these image data can be referred to when the destination virtual scene, that is, the virtual scene corresponding to the multiple image frames, subsequently needs to be created.
The electronic device can obtain multiple key frames from the multiple image frames. Key frames are relatively representative and key among the multiple image frames; by extracting key frames from the multiple image frames, the electronic device obtains enough image data for further processing to create the destination virtual scene.
For the multiple image frames, the electronic device may take the first image frame as a key frame. For each subsequent image frame (the current image frame), the electronic device can obtain multiple items of position difference information of the matching feature points of the current image frame and the previous key frame, to determine whether to take the image frame as a key frame.
The position difference information of the matched feature points of the image frame and its previous key frame may include multiple kinds; that is, various factors can be considered when obtaining key image frames. Three kinds of position difference information are provided below, and the multiple items of position difference information in step 204 may include at least two of the three kinds. The three kinds of position difference information are described in turn.
Position difference information one: the first quantity, that is, the number of matching feature points of the image frame and the previous key frame.
For two adjacent key frames, the number of matching feature points needs to be large; in that case the camera positions of the two key frames have not changed too much, so when the two key frames are processed, the position differences of the matching feature points are small, and the change of the camera position and the change of the same landmark points across the two key frames can be known accurately. Conversely, if the number of matching feature points of the two adjacent key frames is small and the position differences of the matching feature points are large, localization through the two key frames may fail, the change of the camera position cannot be known, and a highly accurate destination virtual scene cannot be obtained. Thus, when determining whether to take an image frame as a key frame, position difference information one can be considered, and it can subsequently be judged whether position difference information one meets the accuracy requirement of scene creation.
Then in step 204, for any image frame, the electronic device can obtain the first quantity of matching feature points of the image frame and the previous key frame, and take the first quantity as an indicator for measuring the accuracy of scene creation.
Position difference information two: the ratio of the first quantity of matching feature points of the image frame and the previous key frame to a second quantity, the second quantity being the number of feature points of the previous key frame.
This position difference information can also be regarded as the relative attenuation rate of the matching feature points, that is, the ratio of the first quantity of matching feature points of the image frame and the previous key frame to the second quantity, the number of feature points of the previous key frame. When the ratio is smaller, the position differences of the matching feature points are small, indicating that the camera poses of the two image frames have not changed much and the spatial positions of the matching feature points have not changed greatly; conversely, when the ratio is larger, the position differences of the matching feature points are large, indicating that the camera poses of the two image frames have changed greatly and the spatial positions of the matching feature points have varied widely.
Similar to position difference information one, if the ratio of two adjacent key frames is too large, the position differences of the matching feature points of the two adjacent key frames may be large, localization through the two key frames may fail, the change of the camera position cannot be known, and a highly accurate destination virtual scene cannot be obtained. Thus, when determining whether to take an image frame as a key frame, position difference information two can be considered, and it can subsequently be judged whether position difference information two meets the accuracy requirement of scene creation. Then in step 204, for any image frame, the electronic device can obtain the ratio of the first quantity of matching feature points of the image frame and the previous key frame to the second quantity, the second quantity being the number of feature points of the previous key frame.
Position difference information three: the score of the image frame, the score being determined based on the baseline length of the image frame and the previous key frame, and on the resolutions of the matching feature points of the image frame and the previous key frame in each of the two image frames.
For any image frame, the electronic device can obtain the score of the image frame to characterize whether the image frame is suitable as a key frame; the larger the score, the smaller the position differences of the matching feature points may be, and the more suitable the image frame is as a key frame. Then in step 204, for any image frame, the electronic device can obtain the score of the image frame.
Considering the score of the image frame means considering the baseline length between the image frame and the previous key frame and the resolutions of the matching feature points. It can be understood that the larger the baseline length, the more the camera pose may have changed between the two image frames, and the larger the position differences of the matching feature points may be. The smaller the resolution of a matching feature point, the less representative that matching feature point is and the less meaningful it is to determine the position difference using it.
In one possible implementation, the determination process of the score of each image frame may be: for each matching feature point among at least one matching feature point of the image frame and the previous key frame, the electronic device obtains the normal distribution value of the baseline length corresponding to the matching feature point; the electronic device obtains the product of that normal distribution value and the minimum of the resolutions of the matching feature point in the two image frames; and the electronic device performs a weighted sum of the products over the at least one matching feature point to obtain the score of the image frame.
Specifically, the baseline length and the resolution may be calculated numerical values, or may be characterized by other parameters. In one possible embodiment, the baseline length corresponding to a matching feature point is characterized by the angle formed by a first line direction and a second line direction, the first line direction being from the projected position of the matching feature point in the image frame to the matching feature point, and the second line direction being from the projected position of the matching feature point in the previous key frame to the matching feature point; the resolution of a matching feature point is characterized by the distance between the matching feature point and the camera position of the image frame, or the distance between the matching feature point and the camera position of the previous key frame.
For example, the determination process can be realized using a formula of the following form:
f(Cᵢ) = Σₚ g(∠Cᵢ₋₁pCᵢ) · min(d(p, Cᵢ₋₁), d(p, Cᵢ)), with g(x) = (1/(σ√(2π))) · e^(−(x−μ)²/(2σ²))
Here f(Cᵢ) is the score of the current image frame; p is the spatial position of a matching feature point p of the current image frame and the previous key frame; ∠Cᵢ₋₁pCᵢ is the angle at p between the camera positions of the previous key frame and the current image frame, which is greater than or equal to 0 and characterizes the baseline length; d(p, Cᵢ₋₁) is the distance between matching feature point p and the camera position Cᵢ₋₁ of the previous key frame; d(p, Cᵢ) is the distance between matching feature point p and the camera position Cᵢ of the current image frame; these distances are greater than or equal to 0 and characterize the resolution. g(x) is the normal distribution function, e is the natural constant, and x is the independent variable; in the calculation formula for f(Cᵢ), the independent variable x is ∠Cᵢ₋₁pCᵢ. μ is the mean of the independent variable, σ is the standard deviation, and Σ denotes summation.
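Under the distance/angle characterizations just described, the score can be sketched directly. The values of `mu` and `sigma` are assumed tuning parameters, not values given by the method.

```python
import numpy as np

def keyframe_score(points, C_prev, C_cur, mu=0.2, sigma=0.1):
    """Score of the current frame: for each matching feature point p, weight
    the smaller point-to-camera distance (the resolution) by a normal
    distribution value of the parallax angle at p between the two camera
    centers (the baseline characterization), and sum over all points."""
    score = 0.0
    for p in points:
        a, b = C_prev - p, C_cur - p
        cos_angle = np.clip(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)),
                            -1.0, 1.0)
        angle = np.arccos(cos_angle)                     # angle C_prev-p-C_cur
        g = (np.exp(-(angle - mu) ** 2 / (2 * sigma ** 2))
             / (sigma * np.sqrt(2 * np.pi)))             # normal pdf at angle
        score += g * min(np.linalg.norm(a), np.linalg.norm(b))
    return score
```

The Gaussian rewards parallax angles near the preferred value `mu`: too small an angle means an uninformative baseline, too large means poor matching, and either case drives the score down.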
Three kinds of position difference information have been described above, and the electronic device can take at least two of them as the multiple items of position difference information, to judge whether the accuracy requirement of scene creation is met. It should be noted that only three kinds of position difference information are provided above; the multiple items of position difference information may also include other position difference information, and the above score of a key frame may also be determined in other ways. For example, the multiple items of position difference information may also include the depths obtained in each of the two image frames for the feature points matched between the image frame and the previous key frame. The embodiment of the present invention does not limit this.
205: When any one of the multiple items of position difference information meets the accuracy requirement of scene creation, the electronic device takes the image frame as a key frame.
After obtaining the multiple items of position difference information, the electronic device can judge whether each item of the multiple items of position difference information meets the accuracy requirement of scene creation; when any one item meets it, the electronic device can take the image frame as a key frame.
In one possible implementation, all three kinds of position difference information above may be considered in step 204; that is, the multiple items of position difference information may include all three. Correspondingly, in step 205, for any image frame, the electronic device can judge all three, and when any one meets the accuracy requirement of scene creation, the electronic device can take the image frame as a key frame. Of course, step 204 may also consider any two of the three kinds of position difference information; correspondingly, in step 205, when judging the image frame, it may be judged whether those two meet the accuracy requirement of scene creation, and when any one meets it, the electronic device can take the image frame as a key frame. The embodiment of the present invention does not limit which kinds of position difference information are specifically used, nor does it limit the accuracy requirement of scene creation.
In a specific embodiment, each of the three kinds of position difference information above may correspond to its own accuracy requirement of scene creation. The accuracy requirement of scene creation corresponding to each kind of position difference information is described below.
For position difference information one, a quantity threshold can be set for the first quantity of matching feature points of the image frame and the previous key frame, and the accuracy requirement of scene creation met by position difference information one can be: the first quantity of matching feature points of the image frame and the previous key frame is less than the quantity threshold. That is, step 205 can be: when the first quantity of matching feature points of the image frame and the previous key frame is less than the quantity threshold, the electronic device takes the image frame as a key frame. The quantity threshold can be configured as required by relevant technical personnel; the embodiment of the present invention does not limit this. When the first quantity is already less than the quantity threshold, a key frame needs to be added; otherwise, the next image frame continues to be judged. The first quantity of matching feature points of the next image frame and the previous key frame may be even smaller, the position differences of the matching feature points become larger, and localization failure and loss of tracking are likely to result.
For position difference information two, a fractional threshold can be set for the ratio, and the accuracy requirement of scene creation met by position difference information two can be: the ratio of the first quantity of matching feature points of the image frame and the previous key frame to the second quantity is greater than the fractional threshold. That is, step 205 can be: when the ratio of the first quantity of matching feature points of the image frame and the previous key frame to the second quantity is greater than the fractional threshold, the electronic device takes the image frame as a key frame. The fractional threshold can be configured as required by relevant technical personnel; the embodiment of the present invention does not limit this. When the ratio obtained from the current image frame and the previous key frame is greater than the fractional threshold, a key frame needs to be added; otherwise, the next image frame continues to be judged. The relative attenuation rate (the ratio) corresponding to the next image frame may be even larger, the position differences of the matching feature points may be even larger, and localization failure and loss of tracking are likely to result.
For position difference information three, a score threshold can be set for the score of the image frame, and the accuracy requirement of scene creation met by position difference information three can be: the score of the image frame is greater than the score threshold. That is, step 205 can be: when the score of the image frame is greater than the score threshold, the electronic device takes the image frame as a key frame. The score threshold can be configured as required by relevant technical personnel; the embodiment of the present invention does not limit this.
It should be noted that the above describes the accuracy requirement of scene creation met by each of the three kinds of position difference information. If the electronic device obtains other position difference information in step 204, step 205 may include the setting of the accuracy requirement of scene creation corresponding to that other position difference information; the embodiment of the present invention does not limit this.
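Taken together, the three criteria of step 205 form an any-of decision. The threshold values and comparison directions below follow the descriptions above; the concrete numbers are assumptions for illustration only.

```python
def is_key_frame(first_quantity, ratio, score,
                 quantity_threshold=80, ratio_threshold=0.3,
                 score_threshold=5.0):
    """Step 205 sketch: the frame becomes a key frame as soon as ANY item
    of position difference information meets its accuracy requirement."""
    return (first_quantity < quantity_threshold   # too few matches left
            or ratio > ratio_threshold            # relative attenuation too high
            or score > score_threshold)           # frame scores as suitable
```

Because the conditions are OR-ed, the selection is conservative: any single indicator of drift from the previous key frame is enough to add a new one, which is what prevents the tracking loss discussed above.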
206: For any key frame of the multiple key frames obtained, the electronic device creates the first local virtual scene corresponding to the key frame, based on the feature points of the key frame and on other key frames that include the feature points of the key frame.
After the electronic device obtains the multiple key frames, the image data of the multiple key frames can serve as the data basis for creating the destination virtual scene. The electronic device can first create local virtual scenes and then integrate the multiple local virtual scenes to obtain the whole destination virtual scene. Step 206 may be executed each time a key frame is obtained, to create the first local virtual scene corresponding to that key frame, or it may be executed after all key frames have been obtained; the embodiment of the present invention does not limit this.
When the electronic device creates the first local virtual scene corresponding to each key frame, for any key frame, the feature points of the key frame serve as the basis for creating the first local virtual scene, and the other key frames that include these feature points can be regarded as key frames related to the key frame. Since the other key frames include these feature points, the camera poses of the key frame and the other key frames do not differ much, and the landmark points in the other key frames can be considered to lie near the landmark points in the key frame. Therefore, the key frame and the other key frames can be used to create the first local virtual scene corresponding to the key frame.
Specifically, the first local virtual scene may include at least one of a first local virtual map and a first local motion trajectory. The electronic device can obtain the spatial position of each landmark point in the first local virtual map based on the spatial positions of the feature points of the key frame and the other key frames, obtaining the first local virtual map corresponding to the key frame; the spatial position of each landmark point in the first local virtual scene corresponds to the spatial position of a feature point. The electronic device can also obtain, based on the camera poses of the key frame and the other key frames, the first local motion trajectory of the device that acquired the multiple image frames. Which first local virtual scene is specifically obtained in step 206 can be set as required; the embodiment of the present invention does not limit this.
207: The electronic equipment optimizes the first local virtual scenes of the key frame and the other key frames to obtain the second local virtual scene corresponding to the key frame.
After obtaining the first local virtual scene, the electronic equipment can optimize it to obtain a more accurate second local virtual scene. Similarly, the second local virtual scene may include at least one of a second local virtual map and a second local motion trajectory. The electronic equipment can optimize at least one of the first local virtual map and the first local motion trajectory to obtain at least one of the second local virtual map and the second local motion trajectory corresponding to the key frame.
When the first local virtual scene is optimized, an optimization method similar to the single-image-frame optimization in step 203 can be used. The difference is that step 203 optimizes a single image frame, and therefore only the camera pose of the current image frame, whereas in step 207 the optimization targets include multiple key frames, namely the key frame and the other key frames, so both the spatial positions of the feature points and the camera poses of the key frame and the other key frames can be optimized, improving the accuracy of the first local virtual scene.
Specifically, the optimization process can be as follows: the electronic equipment obtains the sum of the errors of all feature points of the key frame and the other key frames, where the error of each feature point indicates the gap between the spatial position of the feature point and the spatial position predicted based on the second history image frame, the predicted spatial position being determined based on the depth of the matching feature point of the feature point. Based on this sum of errors, the electronic equipment adjusts the camera poses of the key frame and the other key frames and the depths of all feature points until the sum of errors satisfies a goal condition, at which point the adjustment stops and the second local virtual scene corresponding to the key frame is obtained.
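By way of a non-limiting illustration of the joint adjustment described above, the following Python sketch refines camera positions and landmark positions by gradient descent on the sum of feature-point errors and stops once that sum satisfies the goal condition. The one-dimensional measurement model (observed offset = landmark − camera), the learning rate, and all numeric values are assumptions chosen for illustration, not part of the disclosed method:

```python
import numpy as np

def local_bundle_adjustment(cam_pos, landmarks, obs, n_iters=500, lr=0.1, tol=1e-10):
    """Jointly refine camera positions and landmark positions so that the sum
    of squared feature-point errors falls below the goal condition (tol).
    obs holds (camera index, landmark index, measured offset) tuples; the toy
    1-D measurement model is: offset = landmark - camera."""
    cam = cam_pos.astype(float).copy()
    lm = landmarks.astype(float).copy()
    total = 0.0
    for _ in range(n_iters):
        g_cam = np.zeros_like(cam)
        g_lm = np.zeros_like(lm)
        total = 0.0
        for i, j, z in obs:
            r = (lm[j] - cam[i]) - z      # error of this feature point
            total += r * r
            g_cam[i] += -2.0 * r          # gradient w.r.t. camera position
            g_lm[j] += 2.0 * r            # gradient w.r.t. landmark position
        if total < tol:                   # goal condition met: stop adjusting
            break
        g_cam[0] = 0.0                    # fix the first camera (gauge freedom)
        cam -= lr * g_cam
        lm -= lr * g_lm
    return cam, lm, total

# noisy initial guesses; the true cameras sit at 0 and 1, landmarks at 5 and 6
cams, lms, err = local_bundle_adjustment(
    np.array([0.0, 1.3]), np.array([4.6, 6.5]),
    [(0, 0, 5.0), (0, 1, 6.0), (1, 0, 4.0), (1, 1, 5.0)])
```

In a full system the residual would be a reprojection error over SE(3) poses and point depths, solved with a sparse Gauss-Newton or Levenberg-Marquardt method; the sketch only shows the adjust-until-goal-condition structure of the step.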
The process of obtaining the error of any feature point shown in steps 1 to 4 is similar to that in step 203, and the embodiment of the present invention does not repeat the acquisition process here. Step 206 and step 207 constitute the process of obtaining local information, which can be realized by a separate thread; for example, step 206 and step 207 can be executed based on a fourth thread.
In a possible implementation, after step 207, the electronic equipment can also perform relocalization detection to detect whether the equipment that collected the multiple image frames has returned to a position it has passed through before; relocalization detection is also referred to as loop closure detection. When the distance between the position of the equipment determined according to multiple continuous key frames and the position of the equipment at a history image frame is less than a distance threshold, the electronic equipment can merge the feature points of the multiple continuous key frames with the feature points of the history image frame. Specifically, the merging process can be: the electronic equipment averages the spatial positions of the matching feature points of the multiple continuous key frames and the history image frame to obtain the merged spatial position of the matching feature points. In a possible embodiment, the electronic equipment can execute the relocalization detection and merging steps based on a fifth thread, which may be different from the four threads described above; that is, the relocalization detection and merging process can be realized by a separate thread.
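As an illustrative sketch of the relocalization detection and merging step, assuming the fused spatial position is the average of the matched feature points' positions as described above (the distance threshold and all coordinates are assumed values):

```python
import numpy as np

def detect_and_merge_loop(current_pos, history_pos, kf_points, hist_points,
                          matches, dist_threshold=0.5):
    """Relocalization (loop closure) detection and merging: if the device has
    returned to within dist_threshold of a historical position, fuse each pair
    of matched feature points by averaging their spatial positions."""
    if np.linalg.norm(current_pos - history_pos) >= dist_threshold:
        return kf_points, hist_points, False      # no loop detected
    fused_kf = kf_points.copy()
    fused_hist = hist_points.copy()
    for kf_idx, hist_idx in matches:
        fused = (kf_points[kf_idx] + hist_points[hist_idx]) / 2.0
        fused_kf[kf_idx] = fused                  # both frames now share the
        fused_hist[hist_idx] = fused              # averaged spatial position
    return fused_kf, fused_hist, True

# demo: the device is 0.1 from a historical position, so the loop closes
fused_kf, fused_hist, looped = detect_and_merge_loop(
    np.zeros(3), np.array([0.1, 0.0, 0.0]),
    np.array([[0.0, 0.0, 0.0]]), np.array([[0.2, 0.0, 0.0]]), [(0, 0)])
```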
In a possible embodiment, the relocalization detection process can also be realized by a visual vocabulary tree, where the visual vocabulary tree can be established based on the image data obtained in the feature extraction step of step 202 and the binocular matching step. In a possible implementation, the feature matching in step 203 can also be realized based on the visual vocabulary tree, which can further speed up feature matching.
In a possible implementation, after the above relocalization merging has been performed, the spatial positions of the matching feature points of the multiple continuous key frames and the history image frame have been adjusted, so the camera poses of the multiple key frames determined according to the adjusted matching feature points may no longer be consistent with the relative relationships of the preceding and following image frames. The camera poses of the multiple key frames can therefore be corrected correspondingly, making the camera poses of the currently obtained key frames more accurate.
208: The electronic equipment creates, based on the second local virtual scenes corresponding to the multiple key frames, the initial virtual scene corresponding to the multiple key frames.
After obtaining the accurate second local virtual scenes, the electronic equipment can integrate the second local virtual scenes corresponding to the multiple key frames to create the initial virtual scene corresponding to the multiple key frames, which is the global virtual scene corresponding to the multiple image frames obtained in step 201. Similarly, the initial virtual scene may include at least one of an initial virtual map and an initial motion trajectory.
Specifically, the data bases included in the multiple second local virtual scenes corresponding to the above multiple key frames may include the spatial position of the same feature point or the camera pose of the same key frame, and these may be identical or different in different second local virtual scenes. Therefore, when the initial virtual scene is created, the above second local virtual scenes can also be subjected to integrated processing. This process can use average processing or other processing methods; the embodiment of the present invention does not limit this.
209: The electronic equipment optimizes the initial virtual scene of the multiple key frames to obtain the destination virtual scene.
Similarly, after obtaining the initial virtual scene, the electronic equipment also needs to integrate all the image data for overall optimization to obtain a complete and accurate destination virtual scene. The destination virtual scene may include at least one of a destination virtual map and a target motion trajectory.
The optimization process in step 209 is similar to those in step 207 and step 203. The differences are that step 209 optimizes the initial virtual scene corresponding to all key frames, whereas step 207 optimizes the local virtual scenes corresponding to part of the key frames and step 203 optimizes the image data corresponding to a single image frame; in addition, step 209 and step 207 adjust both the spatial positions of the feature points and the camera poses, whereas step 203 optimizes only the camera pose of the single image frame.
Specifically, the optimization process can be as follows: the electronic equipment obtains the sum of the errors of all feature points of the multiple key frames; based on this sum of errors, it adjusts the camera poses of the multiple key frames and the depths of all feature points until the sum of errors satisfies a goal condition, at which point the adjustment stops and the destination virtual scene corresponding to the multiple image frames is obtained.
The process of obtaining the error of any feature point shown in steps 1 to 4 is similar to that in step 203, and the embodiment of the present invention does not repeat the acquisition process here.
Step 208 and step 209 constitute the process of obtaining the destination virtual scene. In a possible implementation, this process can be realized by a separate thread; for example, step 208 and step 209 can be realized based on a sixth thread, which is not the same thread as the five threads described above.
It should be noted that steps 206 to 209 constitute the process of creating the destination virtual scene based on the multiple obtained key frames. In this process, the electronic equipment can execute at least one of the following step 1 and step 2:
Step 1: Based on the spatial positions of the feature points of the multiple obtained key frames, the electronic equipment obtains the spatial position of each landmark point in the destination virtual scene, obtaining the destination virtual map.
Step 2: Based on the camera poses of the multiple obtained key frames, the electronic equipment obtains the target motion trajectory of the equipment that collected the multiple image frames. Specifically, based on the camera poses of the multiple obtained key frames, the electronic equipment can obtain the position of the equipment at the moment each key frame was collected; based on the multiple key frames, the target motion trajectory of the equipment can then be obtained.
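The recovery of the target motion trajectory from per-key-frame camera poses in step 2 can be sketched as follows; the sketch assumes each pose is an (R, t) pair in the world-to-camera convention, so the device position in world coordinates is -Rᵀt (the convention and the demo poses are assumptions for illustration):

```python
import numpy as np

def trajectory_from_poses(poses):
    """Recover the device trajectory from per-key-frame camera poses. Each
    pose is an (R, t) pair in the world-to-camera convention, so the camera
    centre in world coordinates is -R.T @ t; the trajectory is the ordered
    sequence of these centres, one per key frame."""
    return np.array([-R.T @ t for R, t in poses])

# two key frames: identity pose at the origin, then a 1-unit move along +z
traj = trajectory_from_poses([
    (np.eye(3), np.zeros(3)),
    (np.eye(3), np.array([0.0, 0.0, -1.0])),  # camera centre at z = +1
])
```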
For example, the electronic equipment obtains multiple image frames, processes them, extracts multiple key frames, and creates the destination virtual scene based on the multiple key frames. Taking the application of this image processing method to the field of automatic driving as an example, the multiple image frames can be collected by image collection equipment installed on a vehicle; the image collection equipment sends the collected image frames to electronic equipment with an image processing function, which can perform the above processing on the image frames to create the destination virtual map, or position the vehicle to obtain the target motion trajectory, or both create the destination virtual map and obtain the target motion trajectory. Taking the case of both obtaining the target motion trajectory of the vehicle and creating the destination virtual map around the vehicle's driving path as an example, as shown in Fig. 6, the motion trajectory of the vehicle and a sparse point cloud map can be obtained in step 208; Fig. 6 shows four specific motion trajectories and a sparse point cloud map. Of course, this step can also output only the motion trajectory of the vehicle, or only the sparse point cloud map. Here the motion trajectory of the vehicle is the target motion trajectory, the sparse point cloud map is the destination virtual map, and together the motion trajectory and the sparse point cloud map constitute the destination virtual scene.
The detailed process of the image processing method is illustrated below by a specific example. As shown in Fig. 7, the electronic equipment can realize the above image processing method by six separate threads. When a new frame of image (an image frame) is obtained, the electronic equipment can extract feature points from the image frame through thread one, corresponding to steps 201 and 202; if the camera is a binocular camera, the electronic equipment can perform binocular matching through thread two, corresponding to the binocular matching process in step 201.
After the above feature extraction and binocular matching processes, the electronic equipment can execute the pose initial estimation step, that is, predict the camera pose, and then perform feature point matching under the guidance of the predicted camera pose, corresponding to step 203. After matching, the electronic equipment can perform single-frame pose optimization, that is, optimize the camera pose of the single image frame, and, based on the optimized data, determine whether the image frame serves as a key frame; this is the key frame selection process, corresponding to steps 204 and 205. The pose initial estimation, the guided feature point matching, the single-frame pose optimization, and the key frame selection process can be realized based on thread three.
The electronic equipment can perform local scene creation based on thread four and optimize the created local scene. It should be noted that the local scene, unlike the sliding window commonly used in real-time map creation and positioning systems, can adaptively adjust the size of the local problem, taking both efficiency and accuracy into account.
After the local scene optimization step, the electronic equipment can perform the relocalization detection and relocalization merging steps based on the visual vocabulary tree, that is, the content shown in step 207 above; the relocalization detection and merging process can be realized based on thread five. The electronic equipment can then, based on thread six, optimize all key frames, that is, optimize the global scene, to obtain the final destination virtual scene.
In the above image processing method, the camera pose estimation performed before feature matching and the key frame selection algorithm achieve better accuracy, which can increase the robustness of the image processing method. The module design is more careful: the feature extraction and binocular matching processes are made independent and realized by separate threads, which improves the speed and efficiency of the image processing, can effectively improve its real-time performance, and makes it convenient to extend to other sensors, so the applicability of the image processing method is better.
As shown in Fig. 8, the image processing method can collect video or consecutive images through front-end sensors and send them to the back end; the back end can perform the above image processing flow on the video or consecutive images to obtain at least one of the camera pose (motion trajectory) and the map, which can be used for front-end browser display, for back-end automatic driving control, or for virtual scene creation in the game and film industries, etc. The embodiment of the present invention does not limit the application scenarios of the image processing method.
The destination virtual scene obtained in step 209 consists of data under a target coordinate system, which takes the position of the equipment that collected the image frames at the first image frame as its origin. If the destination virtual scene under the world coordinate system, or under some specific coordinate system, needs to be obtained, the above destination virtual scene can be converted based on the relationship between the world coordinate system or the specific coordinate system and the target coordinate system. This conversion process can be configured according to demand by the relevant technical personnel; the embodiment of the present invention does not limit this.
In the embodiment of the present invention, when key frames are obtained from the multiple image frames, the judgment of whether any image frame is obtained as a key frame considers multiple items of position difference information about the feature points of the image frame matched with the previous key frame before the image frame, and the image frame is obtained as a key frame when any one of these items satisfies the accuracy requirement of scene creation. Enough key frames can thereby be obtained, avoiding the problem that failure to position the equipment that collected the image frames or the landmark points leads to an inaccurate created virtual scene. Therefore, the destination virtual scene obtained by the image processing method provided in the embodiment of the present invention has good accuracy.
All the above optional technical solutions can be combined arbitrarily to form optional embodiments of the present invention, which are not repeated here one by one.
Fig. 9 is a structural schematic diagram of an image processing apparatus provided in an embodiment of the present invention. Referring to Fig. 9, the apparatus includes:
an image acquisition module 901, for obtaining multiple image frames;
an information acquisition module 902, for obtaining, for any image frame, multiple items of position difference information corresponding to the image frame, the multiple items of position difference information being used to indicate the position difference of the matching feature points of the previous key frame of the image frame and the image frame;
the image acquisition module 901 being also used to obtain the image frame as a key frame when any one of the multiple items of position difference information satisfies the accuracy requirement of scene creation; and
a scene creation module 903, for creating the destination virtual scene based on the multiple obtained key frames, the destination virtual scene being used to represent the scene corresponding to the multiple image frames.
In a possible implementation, the information acquisition module 902 is used to execute at least two of the following:
for any image frame, obtaining the first quantity of matching feature points of the image frame and the previous key frame;
for any image frame, obtaining the ratio of the first quantity of matching feature points of the image frame and the previous key frame to a second quantity, the second quantity being the quantity of feature points of the previous key frame;
for any image frame, obtaining the score of the image frame, the score being determined based on the baseline length of the image frame and the previous key frame and the resolutions, in the two image frames respectively, of each matching feature point of the image frame and the previous key frame.
In a possible implementation, the apparatus further includes a determining module, used to: for any matching feature point in the at least one matching feature point of the image frame and the previous key frame, obtain the normal distribution value of the baseline length corresponding to the matching feature point; obtain the product of the minimum of the resolutions in the two image frames and the normal distribution value; and perform weighted summation over the products of the at least one matching feature point to obtain the score of the image frame.
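A minimal sketch of this score computation follows. It assumes a single baseline length shared by all matching feature points, illustrative Gaussian parameters (mu, sigma), and uniform weights; none of these parameter values are specified by the scheme above:

```python
import math

def keyframe_score(matches, baseline, mu=0.5, sigma=0.2, weight=1.0):
    """Score an image frame against the previous key frame: for each matched
    feature point, multiply the normal distribution value of the baseline
    length by the smaller of the point's resolutions in the two frames, then
    take a weighted sum of the products."""
    norm_value = (math.exp(-((baseline - mu) ** 2) / (2.0 * sigma ** 2))
                  / (sigma * math.sqrt(2.0 * math.pi)))
    score = 0.0
    for res_frame, res_key in matches:   # per-point resolutions in each frame
        score += weight * min(res_frame, res_key) * norm_value
    return score

score_one = keyframe_score([(2.0, 1.0)], baseline=0.5)
score_two = keyframe_score([(2.0, 1.0), (3.0, 4.0)], baseline=0.5)
score_far = keyframe_score([(2.0, 1.0)], baseline=2.0)  # baseline far from mu
```

The Gaussian term rewards baselines near the preferred length mu, while the min-resolution term rewards points that are well resolved in both frames.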
In a possible implementation, the image acquisition module 901 is also used to execute any one of the following:
when the first quantity of feature points of the image frame matched with the previous key frame is less than a quantity threshold, obtaining the image frame as a key frame;
when the ratio of the first quantity of matching feature points of the image frame and the previous key frame to the second quantity is greater than a ratio threshold, obtaining the image frame as a key frame;
when the score of the image frame is greater than a score threshold, obtaining the image frame as a key frame.
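Taken together, the three criteria can be sketched as follows; satisfying any one criterion makes the image frame a key frame, per the scheme above, and all three threshold values are assumptions for illustration:

```python
def is_key_frame(match_count, key_feature_count, score,
                 count_threshold=50, ratio_threshold=0.9, score_threshold=10.0):
    """Key frame decision: the image frame becomes a key frame when ANY one
    of the three position-difference criteria is satisfied."""
    too_few_matches = match_count < count_threshold
    ratio_high = (key_feature_count > 0
                  and match_count / key_feature_count > ratio_threshold)
    score_high = score > score_threshold
    return too_few_matches or ratio_high or score_high
```

Because the criteria are combined with a logical OR, a frame is retained as a key frame whenever any single item of position difference information meets the accuracy requirement, which is what guarantees enough key frames are collected.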
In a possible implementation, the apparatus further includes:
a feature extraction module, for performing feature extraction on the multiple image frames to obtain the feature points of each image frame; and
a feature matching module, for matching the feature points of each image frame with the feature points of the previous image frame of the image frame.
In a possible implementation, the feature extraction module and the feature matching module are also used to use different threads to respectively perform feature extraction on the image frames and match the feature points extracted from the image frames.
In a possible implementation, the apparatus further includes a binocular matching module, used to match, through a separate thread, the feature points of the two image frames respectively collected by the first camera and the second camera in a binocular camera, when the camera collecting the multiple image frames is a binocular camera.
In a possible implementation, the feature matching module is used to: for each image frame, predict the camera pose of the image frame according to the camera pose of the first history image frame before the image frame; and match, according to the camera pose of the image frame, the feature points of the image frame with the feature points of the previous image frame of the image frame to obtain the spatial positions of the feature points of the image frame.
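One common way to realize the pose prediction described here is a constant-velocity assumption over the history poses; this is an illustrative choice (the text above does not fix a particular prediction model), sketched on 4x4 homogeneous pose matrices:

```python
import numpy as np

def predict_pose(history_poses):
    """Predict the camera pose of the current image frame from history image
    frames, assuming constant velocity on 4x4 homogeneous pose matrices: the
    last inter-frame motion is applied once more to the latest pose."""
    if len(history_poses) < 2:
        return history_poses[-1].copy()
    motion = history_poses[-1] @ np.linalg.inv(history_poses[-2])
    return motion @ history_poses[-1]

T0 = np.eye(4)                 # first history pose: identity
T1 = np.eye(4)
T1[2, 3] = 1.0                 # second history pose: 1 unit along z
pred = predict_pose([T0, T1])  # constant velocity predicts 2 units along z
```

The predicted pose then guides the feature point matching by restricting the search region for each feature point's correspondence.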
In a possible implementation, the apparatus further includes an optimization module, used to: obtain the sum of the errors of all feature points of each image frame, the error of each feature point being used to indicate the gap between the spatial position of the feature point and the spatial position predicted based on the second history image frame, the predicted spatial position being determined based on the depth of the matching feature point of the feature point; and adjust, based on the sum of the errors of all feature points of each image frame, the camera pose of the image frame until the sum of errors satisfies a goal condition, at which point the adjustment stops and the optimized camera pose of the image frame is obtained.
In a possible implementation, the scene creation module 903 is used to: for any key frame in the multiple obtained key frames, create the first local virtual scene corresponding to the key frame based on the feature points of the key frame and the other key frames that include the feature points of the key frame; optimize the first local virtual scenes of the key frame and the other key frames to obtain the second local virtual scene corresponding to the key frame; create, based on the second local virtual scenes corresponding to the multiple key frames, the initial virtual scene corresponding to the multiple key frames; and optimize the initial virtual scene of the multiple key frames to obtain the destination virtual scene corresponding to the multiple image frames.
In a possible implementation, the scene creation module 903 is used to: obtain the sum of the errors of all feature points of the key frame and the other key frames, the error of each feature point being used to indicate the gap between the spatial position of the feature point and the spatial position predicted based on the second history image frame, the predicted spatial position being determined based on the depth of the matching feature point of the feature point; and adjust, based on this sum of errors, the camera poses of the key frame and the other key frames and the depths of all feature points until the sum of errors satisfies a goal condition, at which point the adjustment stops and the second local virtual scene corresponding to the key frame is obtained;
the scene creation module 903 being further used to: obtain the sum of the errors of all feature points of the multiple key frames; and adjust, based on this sum of errors, the camera poses of the multiple key frames and the depths of all feature points until the sum of errors satisfies a goal condition, at which point the adjustment stops and the destination virtual scene corresponding to the multiple image frames is obtained.
In a possible implementation, the scene creation module 903 is used to: for any feature point of the image frame, obtain, according to the depth of the matching feature point of the feature point in the second history image frame, the spatial position of the matching feature point in the camera coordinate system of the second history image frame, the second history image frame being the first image frame, among the multiple image frames, whose feature points match the feature point; obtain, according to the spatial position of the matching feature point in the camera coordinate system of the second history image frame, the camera pose of the second history image frame, and the relative camera pose of the image frame relative to the second history image frame, the predicted relative direction of the feature point relative to the camera position of the image frame; obtain the relative direction of the feature point relative to the camera position of the image frame; and take the gap between the projected positions of the predicted relative direction and the relative direction, respectively, on the plane perpendicular to the relative direction as the direction error of the feature point.
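A sketch of this direction error follows, assuming unit direction vectors and taking the observed relative direction as the normal of the projection plane, so that the observed direction itself projects to the origin of that plane (the vector values in the demo are assumptions):

```python
import numpy as np

def direction_error(predicted_dir, observed_dir):
    """Direction error of a feature point: project the predicted and observed
    relative directions onto the plane perpendicular to the observed relative
    direction and take the gap between the two projections (the observed
    direction projects to the origin by construction)."""
    d = observed_dir / np.linalg.norm(observed_dir)
    p = predicted_dir / np.linalg.norm(predicted_dir)
    proj_pred = p - np.dot(p, d) * d   # in-plane component of the prediction
    proj_obs = d - np.dot(d, d) * d    # zero vector: d is the plane normal
    return float(np.linalg.norm(proj_pred - proj_obs))

err_same = direction_error(np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, 1.0]))
err_orth = direction_error(np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0]))
```

The error is zero when the prediction agrees with the observation and grows with the angular deviation between the two directions.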
In a possible implementation, the scene creation module 903 is used to execute at least one of the following:
obtaining, based on the spatial positions of the feature points of the multiple obtained key frames, the spatial position of each landmark point in the destination virtual scene, obtaining the destination virtual map;
obtaining, based on the camera poses of the multiple obtained key frames, the target motion trajectory of the equipment that collected the multiple image frames.
When the apparatus provided in the embodiment of the present invention obtains key frames from the multiple image frames, the judgment of whether any image frame is obtained as a key frame considers multiple items of position difference information about the feature points of the image frame matched with the previous key frame before the image frame, and the image frame is obtained as a key frame when any one of these items satisfies the accuracy requirement of scene creation. Enough key frames can thereby be obtained, avoiding the problem that failure to position the equipment that collected the image frames or the landmark points leads to an inaccurate created virtual scene. Therefore, the destination virtual scene obtained by the image processing method provided in the embodiment of the present invention has good accuracy.
It should be understood that when the image processing apparatus provided in the above embodiment processes an image, the division into the above functional modules is only used as an example; in practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the electronic equipment is divided into different functional modules to complete all or part of the functions described above. In addition, the image processing apparatus provided in the above embodiment and the image processing method embodiment belong to the same concept; the specific implementation process is detailed in the method embodiment and is not repeated here.
The above electronic equipment may be provided as the terminal shown in Fig. 10 below, or may be provided as the server shown in Fig. 11 below; the embodiment of the present invention does not limit this.
Fig. 10 is a structural schematic diagram of a terminal provided in an embodiment of the present invention. The terminal 1000 may be: a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, or a desktop computer. The terminal 1000 may also be called user equipment, a portable terminal, a laptop terminal, a desktop terminal, or other names.
In general, the terminal 1000 includes a processor 1001 and a memory 1002.
The processor 1001 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 1001 can be realized by at least one of the following hardware forms: DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 1001 may also include a main processor and a coprocessor: the main processor is a processor for processing data in the awake state, also called the CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 1001 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 1001 may also include an AI (Artificial Intelligence) processor, for processing calculation operations related to machine learning.
The memory 1002 may include one or more computer-readable storage media, which may be non-transient. The memory 1002 may also include high-speed random access memory and nonvolatile memory, such as one or more disk storage devices and flash memory devices. In some embodiments, the non-transient computer-readable storage medium in the memory 1002 is used to store at least one instruction, the at least one instruction being executed by the processor 1001 to realize the image processing method provided in the method embodiments of the present invention.
In some embodiments, the terminal 1000 optionally further includes a peripheral device interface 1003 and at least one peripheral device. The processor 1001, the memory 1002, and the peripheral device interface 1003 can be connected by buses or signal lines. Each peripheral device can be connected to the peripheral device interface 1003 by a bus, a signal line, or a circuit board. Specifically, the peripheral devices include at least one of: a radio frequency circuit 1004, a touch display screen 1005, a camera 1006, an audio circuit 1007, a positioning component 1008, and a power supply 1009.
The peripheral device interface 1003 can be used to connect at least one I/O (Input/Output)-related peripheral device to the processor 1001 and the memory 1002. In some embodiments, the processor 1001, the memory 1002, and the peripheral device interface 1003 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1001, the memory 1002, and the peripheral device interface 1003 can be realized on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 1004 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1004 communicates with communication networks and other communication devices through electromagnetic signals. The radio frequency circuit 1004 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 1004 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 1004 can communicate with other terminals through at least one wireless communication protocol. The wireless communication protocol includes but is not limited to: metropolitan area networks, mobile communication networks of each generation (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1004 may also include NFC (Near Field Communication)-related circuits, which is not limited in the present invention.
Display screen 1005 is for showing UI (User Interface, user interface).The UI may include figure, text, Icon, video and its their any combination.When display screen 1005 is touch display screen, display screen 1005 also there is acquisition to exist The ability of the touch signal on the surface or surface of display screen 1005.The touch signal can be used as control signal and be input to place Reason device 1001 is handled.At this point, display screen 1005 can be also used for providing virtual push button and/or dummy keyboard, it is also referred to as soft to press Button and/or soft keyboard.In some embodiments, display screen 1005 can be one, and the front panel of terminal 1000 is arranged;Another In a little embodiments, display screen 1005 can be at least two, be separately positioned on the different surfaces of terminal 1000 or in foldover design; In still other embodiments, display screen 1005 can be flexible display screen, is arranged on the curved surface of terminal 1000 or folds On face.Even, display screen 1005 can also be arranged to non-rectangle irregular figure, namely abnormity screen.Display screen 1005 can be with Using LCD (Liquid Crystal Display, liquid crystal display), OLED (Organic Light-Emitting Diode, Organic Light Emitting Diode) etc. materials preparation.
The camera assembly 1006 is used to capture images or video. Optionally, the camera assembly 1006 includes a front camera and a rear camera. Generally, the front camera is arranged on the front panel of the terminal, and the rear camera is arranged on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera are fused to implement a background-blurring function, the main camera and the wide-angle camera are fused to implement panoramic shooting and VR (Virtual Reality) shooting functions, or other fused shooting functions are implemented. In some embodiments, the camera assembly 1006 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
The audio circuit 1007 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment, convert the sound waves into electrical signals, and input them to the processor 1001 for processing, or input them to the RF circuit 1004 to implement voice communication. For stereo collection or noise-reduction purposes, there may be multiple microphones, arranged at different parts of the terminal 1000. The microphone may also be an array microphone or an omnidirectional collection microphone. The speaker is used to convert electrical signals from the processor 1001 or the RF circuit 1004 into sound waves. The speaker may be a conventional diaphragm speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can not only convert electrical signals into sound waves audible to humans, but can also convert electrical signals into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 1007 may also include a headphone jack.
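The inaudible-sound-wave ranging mentioned above is typically a time-of-flight computation. A hedged sketch of that idea follows; the speed of sound and the round-trip assumption are generic physics, not details taken from the patent:

```python
def ultrasonic_distance(round_trip_seconds, speed_of_sound=343.0):
    """Estimate the distance to a reflector (in meters) from the echo
    round-trip time. The wave travels out and back, so the one-way
    distance is half the total path length."""
    return speed_of_sound * round_trip_seconds / 2.0

print(ultrasonic_distance(0.01))  # 10 ms round trip -> 1.715 m
```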
The positioning component 1008 is used to determine the current geographic location of the terminal 1000 to implement navigation or LBS (Location Based Service). The positioning component 1008 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 1009 is used to supply power to the various components in the terminal 1000. The power supply 1009 may be an alternating-current supply, a direct-current supply, a disposable battery, or a rechargeable battery. When the power supply 1009 includes a rechargeable battery, the rechargeable battery may support wired charging or wireless charging. The rechargeable battery may also be used to support fast-charging technology.
In some embodiments, the terminal 1000 further includes one or more sensors 1010. The one or more sensors 1010 include, but are not limited to: an acceleration sensor 1011, a gyroscope sensor 1012, a pressure sensor 1013, a fingerprint sensor 1014, an optical sensor 1015, and a proximity sensor 1016.
The acceleration sensor 1011 can detect the magnitude of acceleration along the three axes of the coordinate system established with the terminal 1000. For example, the acceleration sensor 1011 can be used to detect the components of gravitational acceleration along the three axes. The processor 1001 can control the touch display screen 1005 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signals collected by the acceleration sensor 1011. The acceleration sensor 1011 can also be used to collect motion data for games or for the user.
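As an illustrative sketch of the landscape/portrait decision described above, a processor can compare the gravity components along the screen axes. The axis convention (x to the right of the screen, y up the screen) and the 0.6 g threshold are assumptions for illustration, not specified by the patent:

```python
def choose_orientation(ax, ay, az, threshold=0.6):
    """Pick a display orientation from gravity components (in g units).

    A dominant x component means the device is held sideways
    (landscape); a dominant y component means it is upright (portrait);
    near-flat devices keep their current orientation.
    """
    if abs(ax) > abs(ay) and abs(ax) > threshold:
        return "landscape"
    if abs(ay) > threshold:
        return "portrait"
    return "unchanged"

print(choose_orientation(0.05, -0.98, 0.10))  # upright -> portrait
print(choose_orientation(0.95, 0.02, 0.20))   # sideways -> landscape
```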
The gyroscope sensor 1012 can detect the body direction and rotation angle of the terminal 1000, and can cooperate with the acceleration sensor 1011 to collect the user's 3D actions on the terminal 1000. Based on the data collected by the gyroscope sensor 1012, the processor 1001 can implement the following functions: motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 1013 may be arranged on the side frame of the terminal 1000 and/or on the lower layer of the touch display screen 1005. When the pressure sensor 1013 is arranged on the side frame of the terminal 1000, it can detect the user's grip signal on the terminal 1000, and the processor 1001 performs left-hand/right-hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 1013. When the pressure sensor 1013 is arranged on the lower layer of the touch display screen 1005, the processor 1001 controls the operable controls on the UI according to the user's pressure operation on the touch display screen 1005. The operable controls include at least one of a button control, a scroll-bar control, an icon control, and a menu control.
The fingerprint sensor 1014 is used to collect the user's fingerprint, and the processor 1001 identifies the user's identity according to the fingerprint collected by the fingerprint sensor 1014, or the fingerprint sensor 1014 identifies the user's identity according to the collected fingerprint. When the user's identity is identified as a trusted identity, the processor 1001 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 1014 may be arranged on the front, back, or side of the terminal 1000. When a physical button or a manufacturer logo is provided on the terminal 1000, the fingerprint sensor 1014 may be integrated with the physical button or the manufacturer logo.
The optical sensor 1015 is used to collect the ambient light intensity. In one embodiment, the processor 1001 can control the display brightness of the touch display screen 1005 according to the ambient light intensity collected by the optical sensor 1015. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 1005 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 1005 is decreased. In another embodiment, the processor 1001 can also dynamically adjust the shooting parameters of the camera assembly 1006 according to the ambient light intensity collected by the optical sensor 1015.
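A minimal sketch of the brightness control just described, mapping an ambient light reading to a display brightness level. The lux range and the linear clamped mapping are illustrative assumptions; the patent only states that brighter surroundings raise the brightness and darker surroundings lower it:

```python
def display_brightness(lux, min_lux=10.0, max_lux=1000.0,
                       min_level=0.1, max_level=1.0):
    """Map ambient light (lux) to a brightness level in
    [min_level, max_level], clamped at both ends and linear between."""
    if lux <= min_lux:
        return min_level
    if lux >= max_lux:
        return max_level
    frac = (lux - min_lux) / (max_lux - min_lux)
    return min_level + frac * (max_level - min_level)

print(display_brightness(5))     # dark room -> minimum brightness
print(display_brightness(2000))  # sunlight  -> maximum brightness
```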
The proximity sensor 1016, also referred to as a distance sensor, is generally arranged on the front panel of the terminal 1000. The proximity sensor 1016 is used to collect the distance between the user and the front of the terminal 1000. In one embodiment, when the proximity sensor 1016 detects that the distance between the user and the front of the terminal 1000 gradually decreases, the processor 1001 controls the touch display screen 1005 to switch from the screen-on state to the screen-off state; when the proximity sensor 1016 detects that the distance between the user and the front of the terminal 1000 gradually increases, the processor 1001 controls the touch display screen 1005 to switch from the screen-off state to the screen-on state.
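The approach/recede rule above can be reduced to a small state decision. This is an illustrative simplification (a real driver would add thresholds and debouncing, which the patent does not describe):

```python
def screen_state(prev_distance, distance, current_state):
    """Switch the screen per the proximity behavior above: a shrinking
    distance (user approaching) turns the screen off, a growing
    distance turns it on; an unchanged distance keeps the state."""
    if distance < prev_distance:
        return "off"
    if distance > prev_distance:
        return "on"
    return current_state

print(screen_state(10.0, 5.0, "on"))   # approaching -> screen off
print(screen_state(5.0, 10.0, "off"))  # receding -> screen on
```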
It will be understood by those skilled in the art that the structure shown in Figure 10 does not constitute a limitation on the terminal 1000, which may include more or fewer components than illustrated, combine certain components, or adopt a different component arrangement.
Figure 11 is a structural schematic diagram of a server provided by an embodiment of the present invention. The server 1100 may vary considerably depending on configuration or performance, and may include one or more processors (central processing units, CPUs) 1101 and one or more memories 1102, where at least one instruction is stored in the memory 1102, the at least one instruction being loaded and executed by the processor 1101 to implement the image processing methods provided by the above method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and the server may also include other components for implementing device functions, which will not be described in detail here.
In an exemplary embodiment, a computer-readable storage medium is also provided, for example a memory including instructions, where the instructions can be executed by a processor to complete the image processing method in the above embodiments. For example, the computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (15)

1. An image processing method, characterized in that the method comprises:
obtaining a plurality of image frames;
for any image frame, obtaining a plurality of items of position difference information corresponding to the image frame, the plurality of items of position difference information being used to indicate position differences of matched feature points between the image frame and a previous key frame of the image frame;
when any one of the plurality of items of position difference information meets the accuracy requirement of scene creation, obtaining the image frame as a key frame; and
creating a target virtual scene based on the obtained plurality of key frames, the target virtual scene being used to represent the scene corresponding to the plurality of image frames.
2. The method according to claim 1, characterized in that obtaining, for any image frame, the plurality of items of position difference information corresponding to the image frame comprises at least two of the following:
for any image frame, obtaining a first quantity of matched feature points between the image frame and the previous key frame;
for any image frame, obtaining the ratio of the first quantity of matched feature points between the image frame and the previous key frame to a second quantity, the second quantity being the quantity of feature points of the previous key frame; and
for any image frame, obtaining a score of the image frame, the score being determined based on the baseline length between the image frame and the previous key frame and on the resolutions, in the two image frames respectively, of the matched feature points of the image frame and the previous key frame.
3. The method according to claim 2, characterized in that the score being determined based on the baseline length between the image frame and the previous key frame and on the resolutions, in the two image frames respectively, of the matched feature points of the image frame and the previous key frame comprises:
for any matched feature point among at least one matched feature point between the image frame and the previous key frame, obtaining a normal distribution value of the baseline length corresponding to the matched feature point;
obtaining the product of the minimum of the resolutions in the two image frames and the normal distribution value; and
performing weighted summation on the products of the at least one matched feature point to obtain the score of the image frame.
4. The method according to claim 1, characterized in that obtaining the image frame as a key frame when any one of the plurality of items of position difference information meets the accuracy requirement of scene creation comprises any one of the following:
when the first quantity of matched feature points between the image frame and the previous key frame is less than a quantity threshold, obtaining the image frame as a key frame;
when the ratio of the first quantity of matched feature points between the image frame and the previous key frame to the second quantity is greater than a ratio threshold, obtaining the image frame as a key frame; and
when the score of the image frame is greater than a score threshold, obtaining the image frame as a key frame.
5. The method according to claim 1, characterized in that after obtaining the plurality of image frames, the method further comprises:
performing feature extraction on the plurality of image frames to obtain feature points of each image frame; and
matching the feature points of each image frame with the feature points of the previous image frame of each image frame.
6. The method according to claim 5, characterized in that the method further comprises:
using different threads to respectively perform feature extraction on image frames and match the feature points extracted from image frames.
7. The method according to claim 6, characterized in that after performing feature extraction on the plurality of image frames to obtain the feature points of each image frame, the method further comprises:
when the camera collecting the plurality of image frames is a binocular camera, matching, by a separate thread, the feature points of the two image frames respectively collected by a first camera and a second camera of the binocular camera.
8. The method according to claim 5, characterized in that matching the feature points of each image frame with the feature points of the previous image frame of each image frame comprises:
for each image frame, predicting the camera pose of the image frame according to the camera poses of first history image frames before the image frame; and
matching the feature points of the image frame with the feature points of the previous image frame of the image frame according to the camera pose of the image frame, to obtain the spatial positions of the feature points of the image frame.
9. The method according to claim 8, characterized in that after matching the feature points of each image frame with the feature points of the previous image frame of each image frame according to the camera pose of each image frame to obtain the spatial positions of the feature points of each image frame, the method further comprises:
obtaining the sum of the errors of all feature points of each image frame, the error of each feature point being used to indicate the gap between the spatial position of the feature point and a spatial position predicted based on a second history image frame, the spatial position predicted based on the second history image frame being determined based on the depth of the matched feature point of the feature point; and
adjusting the camera pose of each image frame based on the sum of the errors of all feature points of each image frame until the sum of errors meets a target condition, and then stopping the adjustment to obtain the optimized camera pose of each image frame.
10. The method according to claim 1, characterized in that creating the target virtual scene based on the obtained plurality of key frames comprises:
for any key frame among the obtained plurality of key frames, creating a first local virtual scene corresponding to the key frame based on the feature points of the key frame and on other key frames that include the feature points of the key frame;
optimizing the first local virtual scene of the key frame and the other key frames to obtain a second local virtual scene corresponding to the key frame;
creating an initial virtual scene corresponding to the plurality of key frames based on the second local virtual scenes corresponding to the plurality of key frames; and
optimizing the initial virtual scene of the plurality of key frames to obtain the target virtual scene corresponding to the plurality of image frames.
11. The method according to claim 10, characterized in that optimizing the first local virtual scene of the key frame and the other key frames to obtain the second local virtual scene corresponding to the key frame comprises:
obtaining the sum of the errors of all feature points of the key frame and the other key frames, the error of each feature point being used to indicate the gap between the spatial position of the feature point and a spatial position predicted based on a second history image frame, the spatial position predicted based on the second history image frame being determined based on the depth of the matched feature point of the feature point; and
adjusting the camera poses of the key frame and the other key frames and the depths of all feature points based on the sum of the errors of all feature points of the key frame and the other key frames, until the sum of errors meets a target condition, and then stopping the adjustment to obtain the second local virtual scene corresponding to the key frame; and
optimizing the initial virtual scene of the plurality of key frames to obtain the target virtual scene corresponding to the plurality of image frames comprises:
obtaining the sum of the errors of all feature points of the plurality of key frames; and
adjusting the camera poses of the plurality of key frames and the depths of all feature points based on the sum of the errors of all feature points of the plurality of key frames, until the sum of errors meets a target condition, and then stopping the adjustment to obtain the target virtual scene corresponding to the plurality of image frames.
12. The method according to claim 1, characterized in that creating the target virtual scene based on the obtained plurality of key frames comprises at least one of the following:
obtaining the spatial position of each landmark point in the target virtual scene based on the spatial positions of the feature points of the obtained plurality of key frames, to obtain a target virtual map; and
obtaining the target movement trajectory of the device collecting the plurality of image frames based on the camera poses of the obtained plurality of key frames.
13. An image processing apparatus, characterized in that the apparatus comprises:
an image obtaining module, configured to obtain a plurality of image frames;
an information obtaining module, configured to obtain, for any image frame, a plurality of items of position difference information corresponding to the image frame, the plurality of items of position difference information being used to indicate position differences of matched feature points between the image frame and a previous key frame of the image frame;
the image obtaining module being further configured to obtain the image frame as a key frame when any one of the plurality of items of position difference information meets the accuracy requirement of scene creation; and
a scene creation module, configured to create a target virtual scene based on the obtained plurality of key frames, the target virtual scene being used to represent the scene corresponding to the plurality of image frames.
14. An electronic device, characterized in that the electronic device comprises one or more processors and one or more memories, at least one instruction being stored in the one or more memories, the instruction being loaded and executed by the one or more processors to implement the operations performed by the image processing method according to any one of claims 1 to 12.
15. A computer-readable storage medium, characterized in that at least one instruction is stored in the computer-readable storage medium, the instruction being loaded and executed by a processor to implement the operations performed by the image processing method according to any one of claims 1 to 12.
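The key-frame test of claims 1 to 4 can be sketched as follows. This is a non-authoritative illustration: the Gaussian parameters, the three thresholds, and the uniform weights are assumptions, and `matches` is assumed to pair each matched feature point's baseline length with the minimum of its resolutions in the two frames:

```python
import math

def frame_score(matches, mu=0.5, sigma=0.2, weights=None):
    """Score per claim 3: a weighted sum, over matched feature points,
    of (normal-distribution value of the baseline length) times
    (minimum of the point's resolutions in the two frames)."""
    weights = weights or [1.0] * len(matches)
    score = 0.0
    for (baseline, min_resolution), w in zip(matches, weights):
        gauss = math.exp(-(baseline - mu) ** 2 / (2 * sigma ** 2)) / (
            sigma * math.sqrt(2 * math.pi))
        score += w * gauss * min_resolution
    return score

def is_key_frame(n_matches, n_prev_features, matches,
                 qty_thresh=50, ratio_thresh=0.9, score_thresh=1.0):
    """Claim 4: the frame is obtained as a key frame if ANY one of the
    position-difference criteria meets the accuracy requirement."""
    if n_matches < qty_thresh:                      # first-quantity criterion
        return True
    if n_matches / n_prev_features > ratio_thresh:  # ratio criterion
        return True
    return frame_score(matches) > score_thresh      # score criterion
```

A frame with few surviving matches, or an unusually high match ratio, or a high baseline/resolution score thus becomes a key frame; otherwise it is skipped.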
CN201910209109.9A 2019-03-19 2019-03-19 Image processing method, image processing device, electronic equipment and storage medium Active CN109947886B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910209109.9A CN109947886B (en) 2019-03-19 2019-03-19 Image processing method, image processing device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910209109.9A CN109947886B (en) 2019-03-19 2019-03-19 Image processing method, image processing device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109947886A true CN109947886A (en) 2019-06-28
CN109947886B CN109947886B (en) 2023-01-10

Family

ID=67010251

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910209109.9A Active CN109947886B (en) 2019-03-19 2019-03-19 Image processing method, image processing device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109947886B (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110561416A (en) * 2019-08-01 2019-12-13 深圳市银星智能科技股份有限公司 Laser radar repositioning method and robot
CN110648397A (en) * 2019-09-18 2020-01-03 Oppo广东移动通信有限公司 Scene map generation method and device, storage medium and electronic equipment
CN111104927A (en) * 2019-12-31 2020-05-05 维沃移动通信有限公司 Target person information acquisition method and electronic equipment
CN111325842A (en) * 2020-03-04 2020-06-23 Oppo广东移动通信有限公司 Map construction method, repositioning method and device, storage medium and electronic equipment
CN111491180A (en) * 2020-06-24 2020-08-04 腾讯科技(深圳)有限公司 Method and device for determining key frame
CN111723820A (en) * 2020-06-10 2020-09-29 中天智导科技有限公司 Image processing method and device
CN111739127A (en) * 2020-06-09 2020-10-02 广联达科技股份有限公司 Method and device for simulating associated motion in mechanical linkage process
CN112132940A (en) * 2020-09-16 2020-12-25 北京市商汤科技开发有限公司 Display method, display device and storage medium
CN112288816A (en) * 2020-11-16 2021-01-29 Oppo广东移动通信有限公司 Pose optimization method, pose optimization device, storage medium and electronic equipment
CN112492230A (en) * 2020-11-26 2021-03-12 北京字跳网络技术有限公司 Video processing method and device, readable medium and electronic equipment
CN113190120A (en) * 2021-05-11 2021-07-30 浙江商汤科技开发有限公司 Pose acquisition method and device, electronic equipment and storage medium
WO2021189784A1 (en) * 2020-03-23 2021-09-30 南京科沃斯机器人技术有限公司 Scenario reconstruction method, system and apparatus, and sweeping robot
WO2021237574A1 (en) * 2020-05-28 2021-12-02 深圳市大疆创新科技有限公司 Camera parameter determination method and apparatus, and readable storage medium
CN116258769A (en) * 2023-05-06 2023-06-13 亿咖通(湖北)技术有限公司 Positioning verification method and device, electronic equipment and storage medium

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090010507A1 (en) * 2007-07-02 2009-01-08 Zheng Jason Geng System and method for generating a 3d model of anatomical structure using a plurality of 2d images
CN102867288A (en) * 2011-07-07 2013-01-09 三星电子株式会社 Depth image conversion apparatus and method
CN105940674A (en) * 2014-12-31 2016-09-14 深圳市大疆创新科技有限公司 System and method for adjusting baseline of imaging system with microlens array
CN106062849A (en) * 2014-02-24 2016-10-26 日产自动车株式会社 Local location computation device and local location computation method
CN107004275A (en) * 2014-11-21 2017-08-01 Metaio有限公司 For determining that at least one of 3D in absolute space ratio of material object reconstructs the method and system of the space coordinate of part
CN107084710A (en) * 2014-05-05 2017-08-22 赫克斯冈技术中心 Camera model and measurement subsystem
CN107301402A (en) * 2017-06-30 2017-10-27 锐捷网络股份有限公司 A kind of determination method, device, medium and the equipment of reality scene key frame
CN107369183A (en) * 2017-07-17 2017-11-21 广东工业大学 Towards the MAR Tracing Registration method and system based on figure optimization SLAM
CN107527366A (en) * 2017-08-23 2017-12-29 上海视智电子科技有限公司 A kind of camera tracking towards depth camera
CN108615247A (en) * 2018-04-27 2018-10-02 深圳市腾讯计算机系统有限公司 Method for relocating, device, equipment and the storage medium of camera posture tracing process
CN108648270A (en) * 2018-05-12 2018-10-12 西北工业大学 Unmanned plane real-time three-dimensional scene reconstruction method based on EG-SLAM
CN108780577A (en) * 2017-11-30 2018-11-09 深圳市大疆创新科技有限公司 Image processing method and equipment
CN108898630A (en) * 2018-06-27 2018-11-27 清华-伯克利深圳学院筹备办公室 A kind of three-dimensional rebuilding method, device, equipment and storage medium
CN109211277A (en) * 2018-10-31 2019-01-15 北京旷视科技有限公司 The state of vision inertia odometer determines method, apparatus and electronic equipment


Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110561416A (en) * 2019-08-01 2019-12-13 深圳市银星智能科技股份有限公司 Laser radar repositioning method and robot
CN110648397A (en) * 2019-09-18 2020-01-03 Oppo广东移动通信有限公司 Scene map generation method and device, storage medium and electronic equipment
CN110648397B (en) * 2019-09-18 2023-05-16 Oppo广东移动通信有限公司 Scene map generation method and device, storage medium and electronic equipment
CN111104927A (en) * 2019-12-31 2020-05-05 维沃移动通信有限公司 Target person information acquisition method and electronic equipment
CN111104927B (en) * 2019-12-31 2024-03-22 维沃移动通信有限公司 Information acquisition method of target person and electronic equipment
CN111325842A (en) * 2020-03-04 2020-06-23 Oppo广东移动通信有限公司 Map construction method, repositioning method and device, storage medium and electronic equipment
CN111325842B (en) * 2020-03-04 2023-07-28 Oppo广东移动通信有限公司 Map construction method, repositioning method and device, storage medium and electronic equipment
WO2021189784A1 (en) * 2020-03-23 2021-09-30 南京科沃斯机器人技术有限公司 Scenario reconstruction method, system and apparatus, and sweeping robot
WO2021237574A1 (en) * 2020-05-28 2021-12-02 深圳市大疆创新科技有限公司 Camera parameter determination method and apparatus, and readable storage medium
CN111739127A (en) * 2020-06-09 2020-10-02 广联达科技股份有限公司 Method and device for simulating associated motion in mechanical linkage process
CN111723820A (en) * 2020-06-10 2020-09-29 中天智导科技有限公司 Image processing method and device
CN111491180A (en) * 2020-06-24 2020-08-04 腾讯科技(深圳)有限公司 Method and device for determining key frame
CN112132940A (en) * 2020-09-16 2020-12-25 北京市商汤科技开发有限公司 Display method, display device and storage medium
CN112288816A (en) * 2020-11-16 2021-01-29 Oppo广东移动通信有限公司 Pose optimization method, pose optimization device, storage medium and electronic equipment
CN112288816B (en) * 2020-11-16 2024-05-17 Oppo广东移动通信有限公司 Pose optimization method, pose optimization device, storage medium and electronic equipment
CN112492230A (en) * 2020-11-26 2021-03-12 北京字跳网络技术有限公司 Video processing method and device, readable medium and electronic equipment
CN113190120A (en) * 2021-05-11 2021-07-30 浙江商汤科技开发有限公司 Pose acquisition method and device, electronic equipment and storage medium
CN113190120B (en) * 2021-05-11 2022-06-24 浙江商汤科技开发有限公司 Pose acquisition method and device, electronic equipment and storage medium
CN116258769A (en) * 2023-05-06 2023-06-13 亿咖通(湖北)技术有限公司 Positioning verification method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN109947886B (en) 2023-01-10

Similar Documents

Publication Publication Date Title
CN109947886A (en) Image processing method, device, electronic equipment and storage medium
US11205282B2 (en) Relocalization method and apparatus in camera pose tracking process and storage medium
CN110148178B (en) Camera positioning method, device, terminal and storage medium
WO2020221012A1 (en) Method for determining motion information of image feature point, task execution method, and device
CN109829456A (en) Image-recognizing method, device and terminal
CN109815150B (en) Application testing method and device, electronic equipment and storage medium
CN109712224A (en) Rendering method, device and the smart machine of virtual scene
CN109978936A (en) Parallax picture capturing method, device, storage medium and equipment
CN108876854A (en) Method for relocating, device, equipment and the storage medium of camera posture tracing process
CN110222789A (en) Image-recognizing method and storage medium
CN110400304A (en) Object detecting method, device, equipment and storage medium based on deep learning
CN110064200A (en) Object construction method, device and readable storage medium storing program for executing based on virtual environment
CN111062981A (en) Image processing method, device and storage medium
CN113763228B (en) Image processing method, device, electronic equipment and storage medium
CN109886208B (en) Object detection method and device, computer equipment and storage medium
CN110276789A (en) Method for tracking target and device
CN114170349A (en) Image generation method, image generation device, electronic equipment and storage medium
CN110570460A (en) Target tracking method and device, computer equipment and computer readable storage medium
CN108682037A (en) Method for relocating, device, equipment and the storage medium of camera posture tracing process
CN109992685A (en) A kind of method and device of retrieving image
CN110290426A (en) Method, apparatus, equipment and the storage medium of showing resource
CN112308103B (en) Method and device for generating training samples
CN110152293A (en) Manipulate the localization method of object and the localization method and device of device, game object
CN110147796A (en) Image matching method and device
CN110263695A (en) Location acquiring method, device, electronic equipment and the storage medium at face position

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant