CN110298320A - Visual positioning method, device and storage medium - Google Patents

Visual positioning method, device and storage medium

Info

Publication number
CN110298320A
CN110298320A (application CN201910586511.9A)
Authority
CN
China
Prior art keywords
region
positioning
semantic
image
target object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910586511.9A
Other languages
Chinese (zh)
Other versions
CN110298320B (en)
Inventor
李照虎
张永杰
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority: CN201910586511.9A
Publication of CN110298320A
Application granted
Publication of CN110298320B
Legal status: Active (current)


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/35Categorising the entire scene, e.g. birthday party or wedding scene

Abstract

Embodiments of the present invention provide a visual positioning method, device, and storage medium. The method includes: collecting panorama data; inputting the panorama data as training samples into a classification model for classification, to obtain classification results; obtaining a positioning map based on semantic features according to the classification results; and inputting at least one piece of to-be-processed image data collected by a current target object into the classification model and, in combination with the positioning map, obtaining the orientation of the target object by positioning. With the embodiments of the present invention, accurate orientation positioning can be achieved with an existing magnetometer, while the hardware cost of upgrading the magnetometer is reduced.

Description

Visual positioning method, device and storage medium
Technical field
The present invention relates to the technical field of computer vision, and in particular to a visual positioning method, device, and storage medium.
Background art
One application scenario of visual positioning is as follows: the same target object (for example a building, a vehicle, a mobile phone terminal, a tree in the surroundings, or a street lamp at the roadside) is viewed from different perspectives (for example a left-to-right view, a right-to-left view, a top-down view, and so on). If the views look similar, it is difficult to determine the orientation (or heading) of the target object, and the target object needs to be positioned. For example, at a crossroads, a magnetometer may be used to determine the orientation of a target object in the current scene. However, a crossroads has many vehicles and many facilities such as traffic lights, all of which introduce strong electromagnetic interference, so the magnetometer used to detect the orientation of the target object suffers a large error, and the orientation of the target object cannot be determined accurately. At present, the only way to determine the orientation accurately in such a scene is to use a more advanced magnetometer, which inevitably increases cost. This problem has not yet been effectively solved.
Summary of the invention
Embodiments of the present invention provide a visual positioning method to solve one or more technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a visual positioning method, the method comprising:
collecting panorama data;
inputting the panorama data as training samples into a classification model for classification, to obtain classification results;
obtaining a positioning map based on semantic features according to the classification results; and
inputting at least one piece of to-be-processed image data collected by a current target object into the classification model and, in combination with the positioning map, obtaining the orientation of the target object by positioning.
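The four steps above can be sketched end to end. The following is a minimal toy sketch only: a trivial exact-match lookup stands in for the patent's trained classification model, and all names and data are illustrative assumptions.

```python
# Sketch of the four-step flow: build a positioning map from classified
# panorama regions, then locate a query's orientation against it.

def build_locating_map(classification_results):
    """Step 3: index semantic blocks by (semantic label, observation angle)."""
    locating_map = {}
    for block in classification_results:
        key = (block["label"], block["viewpoint"])
        locating_map.setdefault(key, []).append(block["coord"])
    return locating_map

def locate_direction(query_blocks, locating_map):
    """Step 4: match query blocks against the map and vote on an angle."""
    votes = {}
    for block in query_blocks:
        for (label, viewpoint), coords in locating_map.items():
            if block["label"] == label and block["coord"] in coords:
                votes[viewpoint] = votes.get(viewpoint, 0) + 1
    return max(votes, key=votes.get) if votes else None

# Toy "classification results" from panorama data: label + coord + angle.
panorama = [
    {"label": "building", "coord": (10, 2), "viewpoint": "east"},
    {"label": "street_lamp", "coord": (3, 7), "viewpoint": "west"},
]
query = [{"label": "building", "coord": (10, 2)}]

m = build_locating_map(panorama)
print(locate_direction(query, m))  # east
```

In a real system the exact-match test would be replaced by the model's classification and image similarity; the shape of the flow is what this sketch shows.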
In one embodiment, inputting the panorama data as training samples into the classification model for classification to obtain the classification results comprises:
in the classification model, performing image preprocessing on at least one piece of image data in the panorama data according to a semantic segmentation strategy to obtain preprocessing results, the preprocessing results being partial image regions in the at least one piece of image data;
classifying the preprocessing results to obtain semantic features corresponding to the partial image regions and coordinate information corresponding to the partial image regions; and
determining the semantic features and the coordinate information as the classification results.
In one embodiment, performing image preprocessing on at least one piece of image data in the panorama data according to the semantic segmentation strategy to obtain the preprocessing results comprises:
identifying, from the at least one piece of image data, objects that remain stationary within a specified time period;
taking the image regions corresponding to the objects as static information; and
taking the static information as the preprocessing results.
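The three sub-steps above (identify objects stationary over a time window, take their image regions as static information, use that as the preprocessing result) can be illustrated with tracked region centroids. This is a sketch under assumed inputs; the displacement threshold is not from the patent.

```python
# Sketch: keep only regions whose centroid is (near-)constant across the
# frames in a specified time window.

def static_regions(tracks, max_disp=1.0):
    """tracks: {object_id: [(x, y) per frame]} -> ids judged static."""
    static = []
    for obj_id, positions in tracks.items():
        xs = [p[0] for p in positions]
        ys = [p[1] for p in positions]
        disp = (max(xs) - min(xs)) + (max(ys) - min(ys))
        if disp <= max_disp:
            static.append(obj_id)
    return static

tracks = {
    "building": [(5.0, 5.0), (5.0, 5.1), (5.1, 5.0)],  # effectively still
    "car": [(0.0, 0.0), (4.0, 0.0), (9.0, 0.0)],       # moving
}
print(static_regions(tracks))  # ['building']
```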
In one embodiment, obtaining the positioning map based on semantic features according to the classification results comprises:
obtaining the semantic features and the coordinate information;
describing semantic block regions in a corresponding map according to the semantic features and the coordinate information;
configuring observation angles for the semantic block regions according to the coordinate information; and
obtaining, according to the semantic features, the coordinate information, and the observation angles, the positioning map composed of a plurality of semantic block regions.
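As an illustration of assembling such a map from (semantic feature, coordinate, observation angle) triples, the sketch below groups blocks by coordinate so that one landmark can carry several observation angles. The data-structure choice is an assumption, not the patent's representation.

```python
# Sketch: a positioning map as semantic blocks grouped per coordinate.
from dataclasses import dataclass

@dataclass(frozen=True)
class SemanticBlock:
    label: str      # semantic feature, e.g. "building"
    coord: tuple    # coordinate information of the image region
    viewpoint: str  # configured observation angle

def build_map(triples):
    """Group blocks by coordinate: one landmark, several angles."""
    by_coord = {}
    for label, coord, viewpoint in triples:
        by_coord.setdefault(coord, []).append(
            SemanticBlock(label, coord, viewpoint))
    return by_coord

triples = [
    ("building", (10, 2), "east"),
    ("building", (10, 2), "south"),
    ("lamp", (3, 7), "west"),
]
m = build_map(triples)
print(len(m[(10, 2)]))  # 2 observation angles for the same landmark
```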
In one embodiment, configuring the observation angles for the semantic block regions comprises:
configuring different observation angles according to the different positioning precisions corresponding to different object observation directions in the panorama data,
the observation angles including at least views in at least two of the east, south, west, and north directions.
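One way to read this configuration step: the positioning precision required for a given observation direction decides how many observation angles to configure. The precision-to-angle mapping below is an assumed heuristic for illustration only.

```python
# Sketch: choose observation angles per landmark from a required
# positioning precision (in degrees); coarser precision -> fewer angles.

def configure_viewpoints(precision_deg):
    cardinal = ["east", "south", "west", "north"]
    if precision_deg >= 180:
        return cardinal[:2]  # the minimum of "at least two directions"
    if precision_deg >= 90:
        return cardinal      # all four cardinal directions
    # finer than 90 degrees: add the intercardinal directions
    return cardinal + ["northeast", "southeast", "southwest", "northwest"]

print(len(configure_viewpoints(200)))  # 2
print(len(configure_viewpoints(100)))  # 4
print(len(configure_viewpoints(45)))   # 8
```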
In one embodiment, the method further comprises: dividing the observation directions in the horizontal direction for the panorama data; or,
dividing the observation directions in the pitch direction for the panorama data.
In one embodiment, inputting the at least one piece of to-be-processed image data collected by the current target object into the classification model and, in combination with the positioning map, obtaining the orientation of the target object by positioning comprises:
in the classification model, performing image preprocessing on the at least one piece of to-be-processed image data according to the semantic segmentation strategy, retaining the static information in the at least one piece of to-be-processed image data; and
obtaining the orientation of the target object by positioning through the positioning map according to the semantic features, coordinate information, and observation angles corresponding to the static information.
In one embodiment, obtaining the orientation of the target object by positioning through the positioning map according to the semantic features, coordinate information, and observation angles corresponding to the static information comprises:
performing image matching between the static information and the semantic block regions in the positioning map to obtain at least one target semantic block region having image similarity with the static information, the at least one target semantic block region corresponding to the same coordinate information;
when the at least one target semantic block region has an overlap of multiple observation angles, obtaining positioning parameters according to the multi-view overlap region; and
obtaining the orientation of the target object by positioning according to the positioning parameters.
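The matching-and-overlap logic above can be sketched as follows. The image-similarity test is reduced here to exact label/coordinate equality, and averaging overlapping observation angles into a single positioning parameter is an illustrative assumption, not the patent's formula.

```python
# Sketch: match static info against map blocks; when matches for one
# coordinate span several observation angles, combine the angles into
# a single positioning parameter.

ANGLE = {"north": 0.0, "east": 90.0, "south": 180.0, "west": 270.0}

def match_and_locate(static_label, static_coord, locating_map):
    """locating_map: {coord: [(label, viewpoint), ...]}"""
    hits = [vp for (label, vp) in locating_map.get(static_coord, [])
            if label == static_label]
    if not hits:
        return None
    # multiple overlapping viewpoints -> positioning parameter = mean angle
    return sum(ANGLE[v] for v in hits) / len(hits)

locating_map = {(10, 2): [("building", "east"), ("building", "south")]}
print(match_and_locate("building", (10, 2), locating_map))  # 135.0
```

(A robust implementation would average angles circularly; the arithmetic mean suffices for this toy data.)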
In a second aspect, an embodiment of the present invention provides a visual positioning device, the device comprising:
a collection unit configured to collect panorama data;
a classification unit configured to input the panorama data as training samples into a classification model for classification, to obtain classification results;
a map generation unit configured to obtain a positioning map based on semantic features according to the classification results; and
a positioning unit configured to input at least one piece of to-be-processed image data collected by a current target object into the classification model and, in combination with the positioning map, obtain the orientation of the target object by positioning.
In one embodiment, the classification unit further comprises:
a preprocessing subunit configured to perform, in the classification model, image preprocessing on at least one piece of image data in the panorama data according to a semantic segmentation strategy to obtain preprocessing results, the preprocessing results being partial image regions in the at least one piece of image data; and
a classification subunit configured to classify the preprocessing results to obtain semantic features corresponding to the partial image regions and coordinate information corresponding to the partial image regions,
and to determine the semantic features and the coordinate information as the classification results.
In one embodiment, the preprocessing subunit is further configured to:
identify, from the at least one piece of image data, objects that remain stationary within a specified time period;
take the image regions corresponding to the objects as static information; and
take the static information as the preprocessing results.
In one embodiment, the map generation unit further comprises:
an information obtaining subunit configured to obtain the semantic features and the coordinate information;
a region description subunit configured to describe semantic block regions in a corresponding map according to the semantic features and the coordinate information;
an angle configuration subunit configured to configure observation angles for the semantic block regions according to the coordinate information; and
a map generation subunit configured to obtain, according to the semantic features, the coordinate information, and the observation angles, the positioning map composed of a plurality of semantic block regions.
In one embodiment, the angle configuration subunit is further configured to:
configure different observation angles according to the different positioning precisions corresponding to different object observation directions in the panorama data,
the observation angles including at least views in at least two of the east, south, west, and north directions.
In one embodiment, the device further comprises a direction division unit configured to:
divide the observation directions in the horizontal direction for the panorama data; or,
divide the observation directions in the pitch direction for the panorama data.
In one embodiment, the positioning unit further comprises:
an image preprocessing subunit configured to perform, in the classification model, image preprocessing on the at least one piece of to-be-processed image data according to the semantic segmentation strategy, retaining the static information in the at least one piece of to-be-processed image data; and
an object positioning subunit configured to obtain the orientation of the target object by positioning through the positioning map according to the semantic features, coordinate information, and observation angles corresponding to the static information.
In one embodiment, the object positioning subunit is configured to:
perform image matching between the static information and the semantic block regions in the positioning map to obtain at least one target semantic block region having image similarity with the static information, the at least one target semantic block region corresponding to the same coordinate information;
when the at least one target semantic block region has an overlap of multiple observation angles, obtain positioning parameters according to the multi-view overlap region; and
obtain the orientation of the target object by positioning according to the positioning parameters.
In a third aspect, an embodiment of the present invention provides a visual positioning device whose functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In one possible design, the structure of the device includes a processor and a memory, the memory storing a program that supports the device in executing any of the above visual positioning methods, and the processor being configured to execute the program stored in the memory. The device may further include a communication interface for communicating with other devices or a communication network.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium for storing computer software instructions used by the visual positioning device, including a program for executing any of the above visual positioning methods.
One of the above technical solutions has the following advantages or beneficial effects:
In the embodiments of the present invention, panorama data is collected; the panorama data is input as training samples into a classification model for classification, to obtain classification results; a positioning map based on semantic features is obtained according to the classification results; and at least one piece of to-be-processed image data collected by a current target object is input into the classification model and, in combination with the positioning map, the orientation of the target object is obtained by positioning. With the embodiments of the present invention, for cases in which the orientation (or heading) of a target object is hard to determine, the collected panorama data is input as training samples into a classification model for classification, a positioning map based on semantic features is obtained from the classification results, and the classification model and positioning map are then applied to determine the orientation of the current target object. Because the orientation of the target object can be determined with the classification model and the positioning map, there is no need to change the current hardware; accurate orientation positioning can therefore be achieved with an existing magnetometer, while the hardware cost of upgrading the magnetometer is reduced.
The above summary is provided for purposes of description only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features of the present invention will be readily apparent by reference to the drawings and the following detailed description.
Brief description of the drawings
In the drawings, unless otherwise specified, the same reference numerals throughout the several drawings denote the same or similar components or elements. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments disclosed in accordance with the present invention and should not be regarded as limiting the scope of the present invention.
Fig. 1 shows a flowchart of a visual positioning method according to an embodiment of the present invention.
Fig. 2 shows a flowchart of a visual positioning method according to an embodiment of the present invention.
Fig. 3 shows a schematic diagram of a visual positioning scenario according to an embodiment of the present invention.
Fig. 4 shows a structural block diagram of a visual positioning device according to an embodiment of the present invention.
Fig. 5 shows a structural block diagram of a visual positioning device according to an embodiment of the present invention.
Detailed description
Hereinafter, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments may be modified in various different ways without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature rather than restrictive.
In the related art, in one application scenario, the same target object (for example a building, a vehicle, a mobile phone terminal, a tree in the surroundings, or a street lamp at the roadside) is viewed from different perspectives (for example a left-to-right view, a right-to-left view, a top-down view, and so on). If the views look similar, it is difficult to determine the orientation (or heading) of the target object, and the target object needs to be positioned. The orientation is usually determined with a magnetometer. However, in a scene such as a crossroads, there are various sources of interference nearby, for example the many passing vehicles (whose metal shells affect the magnetometer) and the utility poles and posts (whose metal affects the magnetometer); all of these introduce strong electromagnetic interference, so the magnetometer used to detect the orientation of the target object suffers a large error, and the orientation cannot be determined accurately.
A magnetometer, also called a geomagnetic or magnetic sensor, can be used to measure magnetic field strength and direction, and to locate the bearing of a target object (such as the current device). Its principle is similar to that of a compass: it measures the angle between the target object and the four cardinal directions. Thus, the gyroscope knows "the target object has turned around", the accelerometer knows "the target object has moved forward a few more meters", and the magnetometer knows "the target object is heading west". In practical applications, because errors need to be corrected and compensated, the magnetometer can be combined with a gyroscope and an accelerometer for positioning, exploiting the strengths of each sensor so that the final positioning result is more accurate; for example, the positioning result can be obtained by combining the magnetic field direction with the motion in that direction.
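The sensor-fusion idea described here (gyroscope for short-term rotation, magnetometer as the long-term heading reference) is commonly realized as a complementary filter. Below is a minimal sketch with an assumed filter constant; it illustrates the fusion principle only and is not the patent's method.

```python
# Sketch of a complementary filter fusing a (noisy) magnetometer heading
# with gyroscope integration.

def fuse_heading(mag_heading, gyro_rate, prev_heading, dt, alpha=0.98):
    """Trust the gyro short-term, the magnetometer long-term."""
    gyro_heading = prev_heading + gyro_rate * dt
    return alpha * gyro_heading + (1 - alpha) * mag_heading

h = 90.0
for _ in range(10):                      # ten 0.1 s steps, no rotation
    h = fuse_heading(95.0, 0.0, h, 0.1)  # magnetometer biased to 95 deg
print(round(h, 2))  # drifts slowly toward the magnetometer reading
```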
However, because of the electromagnetic interference, the only way to accurately determine the orientation of the target object in the above scene is to use a more advanced magnetometer, which inevitably increases hardware cost. Hardware cost therefore needs to be reduced while accurate positioning is achieved, and for this purpose the visual positioning processing of the embodiments of the present invention is proposed.
Fig. 1 shows a flowchart of a visual positioning method according to an embodiment of the present invention. As shown in Fig. 1, the flow includes:
Step 101: collect panorama data.
Step 102: input the panorama data as training samples into a classification model for classification, to obtain classification results.
In one example, the panorama data may be image data collected at a crossroads, including the pedestrians walking at the crossroads, vehicles, buildings, mobile phone terminals, a tree in the surroundings, a street lamp at the roadside, and so on. When collecting the panorama data, the same target object (such as a vehicle, a pedestrian, a building, a mobile phone terminal, a tree in the surroundings, or a street lamp at the roadside) may be collected from different perspectives (for example a left-to-right view, a right-to-left view, a top-down view, and so on), or different target objects may be collected from different perspectives.
Taking the panorama data obtained in the above collection manner as training samples, the training samples are input into the classification model for classification to obtain classification results. The classification results may be the image regions, in any image data of the panorama data, that distinguish where each target object is located; such an image region has a semantic feature and corresponding coordinate information. For example, through semantic classification it can be known which region in the image data is a vehicle, which region is a pedestrian, which region is a building, which region is a mobile phone terminal, which region is a tree in the surroundings, which region is a street lamp at the roadside, and so on.
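As a toy illustration of reading "which region is a vehicle, which region is a building" out of a semantic label map, the sketch below reduces each region's coordinate information to a bounding box. The class ids and the 3x3 label map are made up for illustration.

```python
# Sketch: derive per-class image regions (bounding boxes) from a
# per-pixel semantic label map.

CLASSES = {0: "road", 1: "vehicle", 2: "building"}

def regions_from_labels(label_map):
    """Return {class_name: (min_row, min_col, max_row, max_col)}."""
    boxes = {}
    for r, row in enumerate(label_map):
        for c, cls in enumerate(row):
            name = CLASSES[cls]
            b = boxes.get(name)
            boxes[name] = ((min(b[0], r), min(b[1], c),
                            max(b[2], r), max(b[3], c))
                           if b else (r, c, r, c))
    return boxes

label_map = [
    [2, 2, 0],
    [2, 2, 0],
    [0, 1, 1],
]
print(regions_from_labels(label_map)["building"])  # (0, 0, 1, 1)
```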
Step 103: obtain a positioning map based on semantic features according to the classification results.
In one example, the classification results include the image regions, in any image data of the panorama data, that distinguish where each target object is located; such an image region has a semantic feature and corresponding coordinate information. Observation angles are assigned to the image regions according to the corresponding coordinate information, and the same target object may correspond to at least two observation angles. A target object is a hexahedron in three-dimensional space; correspondingly, the observation angles may be divided in three-dimensional space. The division is of course not limited to this: it may also be performed in two-dimensional space, or performed in two-dimensional space and then mapped into three-dimensional space, and so on. Examples of observation angles include a left-to-right view, a right-to-left view, a top-down view, and so on.
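Treating a landmark as a hexahedron means each observation angle corresponds to the subset of its six faces visible from a viewing direction. The small sketch below illustrates this; the face-normal encoding and the visibility test are assumptions made for illustration.

```python
# Sketch: enumerate the six face directions of a hexahedral landmark
# and pick those facing the camera.

FACES = {
    "east": (1, 0, 0), "west": (-1, 0, 0),
    "north": (0, 1, 0), "south": (0, -1, 0),
    "top": (0, 0, 1), "bottom": (0, 0, -1),
}

def visible_faces(view_dir):
    """A face is visible when its outward normal opposes the viewing
    direction (negative dot product)."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return sorted(f for f, n in FACES.items() if dot(n, view_dir) < 0)

# Looking west and slightly downward: the east and top faces are seen.
print(visible_faces((-1, 0, -0.5)))  # ['east', 'top']
```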
Step 104: input at least one piece of to-be-processed image data collected by the current target object into the classification model and, in combination with the positioning map, obtain the orientation of the target object by positioning.
In one example, the target object may include a vehicle, a pedestrian, a building, a mobile phone terminal, a tree in the surroundings, a street lamp at the roadside, and so on. For example, the target object is a mobile phone terminal, and a user has taken scene images of the current location with the mobile phone terminal (the scene images may be images collected under different perspectives); these scene images can be the to-be-processed image data. Through the above steps 101-104, a classification model capable of classification is trained from the training samples composed of the input panorama data, and the corresponding classification results are obtained. Then, in practical applications, image data (such as the to-be-processed image data) is still input into the existing classification model and combined with the obtained positioning map; that is, using the same processing logic as in steps 101-104, the orientation of the mobile phone terminal is positioned directly. Of course, the pose of the person holding the mobile phone terminal can also be positioned with the same processing logic, or the pose can be derived by performing a preset relative transformation after the orientation of the mobile phone terminal has been positioned.
With the embodiments of the present invention, the above processing logic may reside on the terminal collection side or on a background server side; that is, the optimization of orientation positioning may be performed with this processing logic at the front end (a target object such as a mobile phone terminal or an in-vehicle terminal), or at a background server, for example in a cluster composed of a server cluster. For cases in which the orientation (or heading) of a target object is hard to determine, the embodiments of the present invention input the panorama data as training samples into a classification model for classification, obtain a positioning map based on semantic features, and then use the classification model and the positioning map to determine the orientation (or heading) corresponding to the current target object. Since the orientation of the target object can be determined by the classification model and the positioning map, there is no need to change the current hardware; therefore, accurate orientation positioning can be achieved with an existing magnetometer, while the hardware cost of upgrading the magnetometer is reduced.
Fig. 2 shows a flowchart of a visual positioning method according to an embodiment of the present invention. As shown in Fig. 2, the flow includes:
Step 201: collect panorama data.
Step 202: input the panorama data as training samples into a classification model; in the classification model, perform image preprocessing on at least one piece of image data in the panorama data according to a semantic segmentation strategy to obtain preprocessing results, the preprocessing results being partial image regions in the at least one piece of image data.
Step 203: classify the preprocessing results to obtain semantic features corresponding to the partial image regions and coordinate information corresponding to the partial image regions, and determine the semantic features and the coordinate information as the classification results.
Through the above steps 202-203, inputting the panorama data as training samples into the classification model for classification can be realized, and the obtained classification results include the semantic features and corresponding coordinate information of the partial image regions in at least one piece of image data. The partial image regions may be static information extracted from the images using semantic segmentation (image regions that do not change over a long period, such as buildings and signboards). Because static information is an image region that does not change over a long period, it provides data stability and operational reliability for the classification operation; therefore, extracting and classifying the static information achieves accurate classification results, and after the positioning map is subsequently obtained from these classification results, an accurate orientation positioning effect can be achieved. The partial image region may be a semantic block region in the positioning map, or correspond to a semantic block region in the positioning map.
In one example, the panorama data may be image data collected at a crossroads, including the pedestrians walking at the crossroads, vehicles, buildings, mobile phone terminals, a tree in the surroundings, a street lamp at the roadside, and so on. When collecting the panorama data, the same target object (such as a vehicle, a pedestrian, a building, a mobile phone terminal, a tree in the surroundings, or a street lamp at the roadside) may be collected from different perspectives (for example a left-to-right view, a right-to-left view, a top-down view, and so on), or different target objects may be collected from different perspectives.
Taking the panorama data obtained in the above collection manner as training samples, the training samples are input into the classification model for classification to obtain classification results. The classification results may be the image regions, in any image data of the panorama data, that distinguish where each target object is located; such an image region has a semantic feature and corresponding coordinate information. For example, through semantic classification it can be known which region in the image data is a vehicle, which region is a pedestrian, which region is a building, which region is a mobile phone terminal, which region is a tree in the surroundings, which region is a street lamp at the roadside, and so on.
Step 204: obtain a positioning map based on semantic features according to the classification results.
In one example, the classification results include the image regions, in any image data of the panorama data, that distinguish where each target object is located; such an image region has a semantic feature and corresponding coordinate information. Observation angles are assigned to the image regions according to the corresponding coordinate information, and the same target object may correspond to at least two observation angles. A target object is a hexahedron in three-dimensional space; correspondingly, the observation angles may be divided in three-dimensional space. The division is of course not limited to this: it may also be performed in two-dimensional space, or performed in two-dimensional space and then mapped into three-dimensional space, and so on. Examples of observation angles include a left-to-right view, a right-to-left view, a top-down view, and so on.
At least one image data to be processed of current target object acquisition is inputted the disaggregated model, knot by step 205 The positioning map is closed, positioning obtains the direction of the target object.
In one example, target object may include: vehicle, pedestrian, building, mobile phone terminal, one in surrounding enviroment Tree or the street lamp of curbside etc..For example, target object is mobile phone terminal, user has clapped a current location using the mobile phone terminal Under scene image (scene image can be the acquisition image under different perspectives), which can be the image to be processed Data.Due to 201-205 through the above steps, the training sample constituted by inputting panoramic view data, trained obtain can With the disaggregated model of classification, and available corresponding classification results.So, in practical applications, still input image data (such as image data to be processed) in conjunction with obtained positioning map, that is, utilizes and step 201-205 into existing disaggregated model In the same processing logic, be directly positioned to the direction of mobile phone terminal, naturally it is also possible to using same processing logic positioning It obtains holding pose of people of the mobile phone terminal etc., it can also be by being positioned to set phase to the laggard line position in the direction of mobile phone terminal The pose is derived to transformation.
With this embodiment of the present invention, the above processing logic may reside on the terminal (collection) side or on a background server side; that is, orientation positioning may be optimized at the front end (a target object such as a mobile phone terminal or a vehicle terminal), or at a background server, for example in a cluster constituted by a server cluster. For cases in which the orientation (or heading) of a target object is difficult to determine, this embodiment of the present invention inputs panoramic data as training samples into a classification model for classification, obtains a positioning map based on semantic features, and then uses the classification model and the positioning map to locate the current orientation (or heading) of the target object. Since the orientation of the target object can be located by the classification model and the positioning map alone, no change to the current hardware is needed; accurate orientation positioning can thus be achieved with the existing magnetometer, while avoiding the hardware cost of upgrading the magnetometer.
In one embodiment, performing image preprocessing on at least one item of image data in the panoramic data according to the semantic segmentation strategy to obtain preprocessing results includes: identifying, from the at least one item of image data, objects that remain stationary within a specified period (for example buildings, signboards, and other objects that do not move for a long time); taking the image regions corresponding to those objects as static information; and taking the static information as the preprocessing results.
In one embodiment, obtaining the positioning map based on semantic features according to the classification results includes: obtaining the semantic features and the coordinate information; describing the corresponding semantic block regions in the map according to the semantic features and the coordinate information; configuring observation viewing angles for the semantic block regions according to the coordinate information; and obtaining, from the semantic features, the coordinate information, and the observation viewing angles, the positioning map constituted by multiple semantic block regions.
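The map-building steps above can be illustrated with a minimal Python sketch. The names (`SemanticBlock`, `build_positioning_map`) and the quadrant-based angle assignment are hypothetical illustrations under this embodiment's description, not the disclosed implementation: each classification result contributes a semantic block region carrying its semantic feature, coordinate information, and assigned observation viewing angles.

```python
from dataclasses import dataclass, field

@dataclass
class SemanticBlock:
    """One semantic block region of the positioning map."""
    semantic_feature: str          # e.g. "building", "signboard"
    coordinates: tuple             # capture coordinates from the panoramic data
    viewing_angles: list = field(default_factory=list)  # assigned observation angles

def build_positioning_map(classification_results, angle_for):
    """Assemble the semantic-feature positioning map from classification results.

    classification_results: iterable of (semantic_feature, coordinates) pairs.
    angle_for: callable mapping coordinates -> list of viewing-angle labels,
               i.e. the 'configure viewing angles from coordinates' step.
    """
    return [SemanticBlock(feature, coords, list(angle_for(coords)))
            for feature, coords in classification_results]

# Hypothetical assignment rule: pick cardinal directions from the sign of
# the capture coordinates (a stand-in for the real coordinate-based rule).
def cardinal_angles(coords):
    x, y = coords
    return [("east" if x >= 0 else "west"), ("north" if y >= 0 else "south")]

pmap = build_positioning_map(
    [("building", (3.0, 4.0)), ("signboard", (-1.0, -2.0))],
    cardinal_angles,
)
```

Each resulting block then plays the role of one semantic block region stored in the map database, ready to be matched against query images.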
In one example, panoramic data collected for a map is used to construct a map carrying semantic features and inertial measurement unit (IMU) information; in other words, a positioning map based on semantic features is constructed from the panoramic data, which solves the problem of orienting a target object. The following describes how to construct this semantic-feature-based positioning map and how to perform visual positioning from the images uploaded by a target object.
The construction of the semantic map is detailed as follows:
1. Panoramic data from general map collection carries relatively accurate location information related to position and orientation, such as GPS, IMU, and magnetometer data. The capture coordinates and orientation of every panoramic image (i.e., the shooting direction of each part of the panorama) therefore already exist and are fairly accurate.
2. Semantic segmentation is used to extract the static information in the images (for example image regions such as buildings and signboards that do not change for a long time) and classify it. Each region of the sampled image thus has semantic information and coordinate information.
3. In a database such as a map database, each semantic block region is assigned one observation viewing angle or observation direction (for example one of the four cardinal directions, or a specific viewing angle), and the observation viewing angles are finally stored, so as to build the positioning map based on semantic features (the semantic map for short). The semantic-feature-based positioning map contains multiple semantic regions, together with the semantic feature, coordinate information, and orientation information used for positioning of each semantic region.
In one embodiment, configuring observation viewing angles for the semantic block regions includes: configuring different observation viewing angles according to the different positioning precisions corresponding to the observation directions of different objects in the panoramic data, where the observation viewing angles include at least the viewing angles of at least two of the directions east, south, west, and north. For the panoramic data, the observation directions may be divided in the horizontal direction; alternatively, they may be divided in the pitch direction. For example, if only a low-precision positioning need is to be met (such as determining the four cardinal directions), the observation directions may be divided at a coarse granularity — for instance, dividing the 360-degree panoramic data into the four cardinal observation viewing angles and assigning each semantic block region the viewing angle corresponding to its direction. To achieve higher positioning precision, the panoramic data may of course be divided into more observation viewing angles. In addition, the panoramic data may be divided into observation directions not only horizontally but also in the pitch direction.
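The granularity trade-off described above — four cardinal sectors for coarse positioning versus a finer division for higher precision — can be sketched as a simple azimuth quantization. This is a hypothetical illustration; the function name and the zero-azimuth convention are assumptions, not part of the disclosure.

```python
def viewing_angle_sector(azimuth_deg: float, num_sectors: int = 4) -> int:
    """Quantize an azimuth in degrees into one of `num_sectors` equal
    sectors of the 360-degree panorama. Fewer sectors correspond to a
    coarser positioning precision (e.g. 4 sectors = cardinal directions)."""
    width = 360.0 / num_sectors
    return int((azimuth_deg % 360.0) // width)

# Coarse precision: 95 degrees falls into sector 1 of 4 (the 90-180 sector).
coarse = viewing_angle_sector(95, num_sectors=4)
# Finer precision: the same azimuth falls into sector 4 of 16 (90-112.5).
fine = viewing_angle_sector(95, num_sectors=16)
```

The same quantization could be applied independently to a pitch angle when the panorama is also divided in the pitch direction.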
In one embodiment, inputting the at least one item of to-be-processed image data collected by the current target object into the classification model and, in combination with the positioning map, locating the orientation of the target object includes: in the classification model, performing image preprocessing on the at least one item of to-be-processed image data according to the semantic segmentation strategy, retaining the static information in the at least one item of to-be-processed image data; and locating the orientation of the target object through the positioning map according to the semantic feature, coordinate information, and observation viewing angle corresponding to the static information.
In one embodiment, the above classification model and positioning map are used to locate the orientation of the target object; put simply, one looks at where fan-shaped regions overlap. A fan-shaped region is one example of a localization region within a semantic region in this embodiment of the invention and does not limit the concrete shape of the region. Locating the orientation of the target object through the positioning map according to the semantic feature, coordinate information, and observation viewing angle corresponding to the static information includes: performing image matching between the static information and the semantic block regions in the positioning map to obtain at least one target semantic block region having image similarity to the static information, the at least one target semantic block region corresponding to the same coordinate information; when the at least one target semantic block region has overlapping observation viewing angles, obtaining positioning parameters from the multi-view overlap region; and locating the orientation of the target object according to the positioning parameters.
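The match-then-overlap step above can be sketched as voting over angular sectors. This is a deliberate simplification under stated assumptions: real image matching would compare feature descriptors rather than labels, and `locate_direction`, the sector sets, and the similarity threshold are all hypothetical.

```python
from collections import Counter

def locate_direction(static_regions, semantic_map, similarity, threshold=0.8):
    """Match static image regions against the map's semantic blocks, then
    return the sector where the matched blocks' viewing ranges overlap most.

    static_regions: region descriptors extracted from the query image.
    semantic_map:   list of (descriptor, sectors) pairs, where `sectors` is
                    the set of angular sector indices from which the block
                    is observable (its fan-shaped region, discretized).
    similarity:     callable(a, b) -> value in [0, 1].
    """
    votes = Counter()
    for region in static_regions:
        for descriptor, sectors in semantic_map:
            if similarity(region, descriptor) >= threshold:
                votes.update(sectors)   # each matched block votes for its sectors
    if not votes:
        return None                     # nothing matched: orientation unknown
    # The most densely overlapped sector is the estimated viewing direction.
    return votes.most_common(1)[0][0]
```

The densest overlap plays the role of the "most intensely intersected fan-shaped region" in the embodiment; with pitch sectors added, the same voting extends to 3D.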
How to construct the positioning map based on semantic features has been introduced above. How visual positioning is performed from the images uploaded by the target object — that is, how the above classification model and positioning map are used to locate the orientation of the target object — is detailed as follows:
1. Semantic segmentation is performed on the uploaded to-be-processed collected image, retaining the static information in the image.
2. Each semantic region in the collected image is matched against the semantic image regions in the map database, yielding many semantic-level matches. A semantic region may also be called a semantic block.
3. Since each semantic region was given an observation range when the positioning map was constructed (for example one or more corresponding observation viewing angles), each match generates a fan-shaped region in the 2D plane. The region where these fans intersect most densely yields a rough orientation or pose, i.e., the observation viewing angle of the collected image. If the observation viewing angles also include pitch angles, the fan-shaped regions lie in 3D space rather than in the 2D plane.
4. To obtain a more accurate orientation or pose, structure-from-motion (SFM) technology is used to build the positioning map into point cloud data, and 2D-to-3D matching is then performed in addition to the matching described above.
SFM includes at least: a feature extraction step (generally using the SIFT operator, because of its scale and rotation invariance); a step of matching and establishing image tracks (such as a track list), for example matching image pairs by Euclidean distance; a step of initializing an image pair, to find the image pair with the largest relative baseline of the collection device (such as a camera); a step of relative orientation of the initial image pair; a sparse-reconstruction SFM step; and so on. It should be pointed out that, besides obtaining the pose information of the target object, the overlap of multiple observation viewing angles in the above positioning process can also be used to back-infer the observer's (user's) position, so that the location of the user when the image was shot can be obtained as well, rather than just the pose information of the target object (the current device). It should also be pointed out that the panoramic data in this embodiment of the invention may be a point cloud map, so that more accurate pose information and positions can be obtained. With this embodiment of the invention, the orientation of a current device such as a mobile phone terminal — and even its specific position — can be determined without a magnetometer of higher hardware cost, in scenes where orientation is hard to determine, such as a crossroad.
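The back-inference of the observer's position from overlapping observations can be illustrated, in its simplest 2D form, as intersecting two bearing rays anchored at known map points. This is a geometric sketch only: the function name and coordinate conventions are assumptions, and a real system would combine many noisy bearings (for example by least squares) rather than exactly two.

```python
import math

def intersect_bearings(p1, theta1, p2, theta2):
    """Back-infer a position as the intersection of two bearing rays:
    from known point p_i, direction theta_i (radians from the +x axis).
    Returns (x, y), or None if the bearings are parallel."""
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    # Solve p1 + t*d1 = p2 + s*d2 for t (Cramer's rule on a 2x2 system).
    denom = d1[0] * (-d2[1]) - d1[1] * (-d2[0])
    if abs(denom) < 1e-12:
        return None  # parallel bearings give no intersection
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    t = (rx * (-d2[1]) - ry * (-d2[0])) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

# Two map points at (0, 0) and (2, 0) observed under 45- and 135-degree
# bearings intersect at (1, 1) — the back-inferred observer position.
observer = intersect_bearings((0.0, 0.0), math.pi / 4, (2.0, 0.0), 3 * math.pi / 4)
```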
Application example:
Fig. 3 shows a schematic diagram of a visual positioning scenario according to an embodiment of the present invention. As shown in Fig. 3, panoramic data (collected images) is collected and the collected images are input into the classification model; the images may be digitized before input. In the process of inputting the collected images into the classification model as training samples for classification, the collected images are first preprocessed to retain the static information in the images (buildings, signboards, etc.), classification based on semantic features is performed in the classification model, the semantic features and coordinate information of the image regions where the static information is located are obtained, and the semantic features and coordinate information corresponding to those image regions are output as the classification results. The collected images may be image data collected at a crossroad, including pedestrians walking at the crossroad, vehicles, buildings, mobile phone terminals, trees in the surrounding environment, roadside street lamps, and the like.

The collected images may capture the same target object (such as a vehicle, a pedestrian, a building, a mobile phone terminal, a tree in the surrounding environment, or a roadside street lamp) from different viewing angles (such as a left-to-right viewing angle, a right-to-left viewing angle, or a top-down overhead viewing angle), or may capture different target objects from different viewing angles. Through semantic classification, it can be known which region of the image data is a vehicle, which region is a pedestrian, which region is a building, which region is a mobile phone terminal, which region is a tree in the surrounding environment, which region is a roadside street lamp, and so on. The positioning map based on semantic features is obtained from the semantic features, the coordinate information, and the at least one assigned observation viewing angle. Observation viewing angles may be assigned to the image regions according to the coordinate information, and the same target object may correspond to at least two observation viewing angles. A target object occupies a hexahedron in three-dimensional space, so the observation viewing angles may accordingly be divided in three-dimensional space; this is not limiting, as they may also be divided in two-dimensional space, or divided in two-dimensional space and then mapped into three-dimensional space. Examples of observation viewing angles include a left-to-right viewing angle, a right-to-left viewing angle, and a top-down overhead viewing angle. Finally, the semantic-feature-based positioning map and the classification model are used to perform orientation positioning on the target object.
Fig. 4 shows a structural block diagram of a visual positioning apparatus according to an embodiment of the present invention. The apparatus includes: a collection unit 31 for collecting panoramic data; a classification unit 32 for inputting the panoramic data as training samples into a classification model for classification to obtain classification results; a map generation unit 33 for obtaining a positioning map based on semantic features according to the classification results; and a positioning unit 34 for inputting at least one item of to-be-processed image data collected by the current target object into the classification model and, in combination with the positioning map, locating the orientation of the target object.
In one embodiment, the classification unit further includes: a preprocessing subunit for performing, in the classification model, image preprocessing on at least one item of image data in the panoramic data according to a semantic segmentation strategy to obtain preprocessing results, the preprocessing results being partial image regions in the at least one item of image data;
a classification subunit for classifying the preprocessing results to obtain semantic features corresponding to the partial image regions and coordinate information of the corresponding partial image regions, the semantic features and the coordinate information being determined as the classification results.
In one embodiment, the preprocessing subunit is further configured to: identify, from the at least one item of image data, objects that remain stationary within a specified period; take the image regions corresponding to those objects as static information; and take the static information as the preprocessing results.
In one embodiment, the map generation unit further includes: an information acquisition subunit for obtaining the semantic features and the coordinate information; a region description subunit for describing the corresponding semantic block regions in the map according to the semantic features and the coordinate information; a viewing-angle configuration subunit for configuring observation viewing angles for the semantic block regions according to the coordinate information; and a map generation subunit for obtaining, from the semantic features, the coordinate information, and the observation viewing angles, the positioning map constituted by multiple semantic block regions.
In one embodiment, the viewing-angle configuration subunit is further configured to configure different observation viewing angles according to the different positioning precisions corresponding to the observation directions of different objects in the panoramic data, the observation viewing angles including at least the viewing angles of at least two of the directions east, south, west, and north.
In one embodiment, the apparatus further includes a direction division unit configured to: for the panoramic data, divide the observation directions in the horizontal direction; or, for the panoramic data, divide the observation directions in the pitch direction.
In one embodiment, the positioning unit further includes: an image preprocessing subunit for performing, in the classification model, image preprocessing on the at least one item of to-be-processed image data according to the semantic segmentation strategy, retaining the static information in the at least one item of to-be-processed image data; and an object positioning subunit for locating the orientation of the target object through the positioning map according to the semantic feature, coordinate information, and observation viewing angle corresponding to the static information.
In one embodiment, the object positioning subunit is configured to: perform image matching between the static information and the semantic block regions in the positioning map to obtain at least one target semantic block region having image similarity to the static information, the at least one target semantic block region corresponding to the same coordinate information; when the at least one target semantic block region has overlapping observation viewing angles, obtain positioning parameters from the multi-view overlap region; and locate the orientation of the target object according to the positioning parameters.
For the functions of the modules in the apparatuses of the embodiments of the present invention, reference may be made to the corresponding descriptions in the above method, which are not repeated here.
Fig. 5 shows a structural block diagram of an information processing apparatus according to an embodiment of the present invention. As shown in Fig. 5, the apparatus includes a memory 910 and a processor 920; the memory 910 stores a computer program executable on the processor 920, and the processor 920 implements the vision positioning method of the above embodiments when executing the computer program. There may be one or more memories 910 and processors 920.

The apparatus further includes a communication interface 930 for communicating with external devices for data exchange.

The memory 910 may include a high-speed RAM memory, and may also include a non-volatile memory, such as at least one magnetic disk memory.

If the memory 910, the processor 920, and the communication interface 930 are implemented independently, they may be connected to one another via a bus to complete communication among them. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in Fig. 5, but this does not mean that there is only one bus or one type of bus.

Optionally, in a specific implementation, if the memory 910, the processor 920, and the communication interface 930 are integrated on one chip, they may complete communication with one another through an internal interface.

An embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements any of the methods in the above embodiments.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", "some examples", and the like means that a specific feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, without mutual contradiction, those skilled in the art may combine the features of different embodiments or examples described in this specification.

In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means two or more, unless otherwise clearly and specifically limited.

Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code including one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed — including substantially simultaneously or in the reverse order, depending on the functions involved — as should be understood by those skilled in the art to which the embodiments of the present invention belong.

The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be considered an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from an instruction execution system, apparatus, or device). For the purposes of this specification, a "computer-readable medium" may be any apparatus that can contain, store, communicate, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include: an electrical connection (electronic device) with one or more wirings, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, since the program can be obtained electronically — for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary — and then stored in a computer memory.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one of the following technologies known in the art, or a combination thereof, may be used: a discrete logic circuit having logic gate circuits for implementing logic functions on data signals, an application-specific integrated circuit with suitable combinational logic gate circuits, a programmable gate array (PGA), a field programmable gate array (FPGA), and so on.

Those of ordinary skill in the art can understand that all or part of the steps carried by the methods of the above embodiments can be completed by instructing relevant hardware through a program, which can be stored in a computer-readable storage medium and which, when executed, includes one of or a combination of the steps of the method embodiments.

In addition, each functional unit in each embodiment of the present invention may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The above integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.

The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any person familiar with the art can easily conceive of various changes or replacements within the technical scope disclosed by the present invention, and these should be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (18)

1. A vision positioning method, characterized in that the method includes:
collecting panoramic data;
inputting the panoramic data as training samples into a classification model for classification to obtain classification results;
obtaining a positioning map based on semantic features according to the classification results;
inputting at least one item of to-be-processed image data collected by a current target object into the classification model and, in combination with the positioning map, locating the orientation of the target object.
2. The method according to claim 1, characterized in that inputting the panoramic data as training samples into the classification model for classification to obtain the classification results includes:
in the classification model, performing image preprocessing on at least one item of image data in the panoramic data according to a semantic segmentation strategy to obtain preprocessing results, the preprocessing results being partial image regions in the at least one item of image data;
classifying the preprocessing results to obtain semantic features corresponding to the partial image regions and coordinate information of the corresponding partial image regions;
determining the semantic features and the coordinate information as the classification results.
3. The method according to claim 2, characterized in that performing image preprocessing on the at least one item of image data in the panoramic data according to the semantic segmentation strategy to obtain the preprocessing results includes:
identifying, from the at least one item of image data, objects that remain stationary within a specified period;
taking image regions corresponding to the objects as static information;
taking the static information as the preprocessing results.
4. The method according to claim 2, characterized in that obtaining the positioning map based on semantic features according to the classification results includes:
obtaining the semantic features and the coordinate information;
describing corresponding semantic block regions in the map according to the semantic features and the coordinate information;
configuring observation viewing angles for the semantic block regions according to the coordinate information;
obtaining, from the semantic features, the coordinate information, and the observation viewing angles, the positioning map constituted by multiple semantic block regions.
5. The method according to claim 4, characterized in that configuring the observation viewing angles for the semantic block regions includes:
configuring different observation viewing angles according to different positioning precisions corresponding to the observation directions of different objects in the panoramic data;
the observation viewing angles including at least the viewing angles of at least two of the directions east, south, west, and north.
6. The method according to claim 5, characterized in that the method further includes: for the panoramic data, dividing the observation directions in the horizontal direction; or,
for the panoramic data, dividing the observation directions in the pitch direction.
7. The method according to any one of claims 1 to 6, characterized in that inputting the at least one item of to-be-processed image data collected by the current target object into the classification model and, in combination with the positioning map, locating the orientation of the target object includes:
in the classification model, performing image preprocessing on the at least one item of to-be-processed image data according to the semantic segmentation strategy, retaining static information in the at least one item of to-be-processed image data;
locating the orientation of the target object through the positioning map according to a semantic feature, coordinate information, and an observation viewing angle corresponding to the static information.
8. The method according to claim 7, characterized in that locating the orientation of the target object through the positioning map according to the semantic feature, the coordinate information, and the observation viewing angle corresponding to the static information includes:
performing image matching between the static information and the semantic block regions in the positioning map to obtain at least one target semantic block region having image similarity to the static information, the at least one target semantic block region corresponding to the same coordinate information;
when the at least one target semantic block region has overlapping observation viewing angles, obtaining positioning parameters from a multi-view overlap region;
locating the orientation of the target object according to the positioning parameters.
9. A vision positioning apparatus, characterized in that the apparatus includes:
a collection unit for collecting panoramic data;
a classification unit for inputting the panoramic data as training samples into a classification model for classification to obtain classification results;
a map generation unit for obtaining a positioning map based on semantic features according to the classification results;
a positioning unit for inputting at least one item of to-be-processed image data collected by a current target object into the classification model and, in combination with the positioning map, locating the orientation of the target object.
10. The apparatus according to claim 9, wherein the classification unit further comprises:
a preprocessing subunit configured to perform, in the classification model, image preprocessing on at least one image in the panoramic data according to a semantic segmentation strategy to obtain a preprocessing result, the preprocessing result being a partial image region in the at least one image;
a classification subunit configured to classify the preprocessing result to obtain a semantic feature corresponding to the partial image region and coordinate information of the corresponding partial image region, and
to determine the semantic feature and the coordinate information as the classification result.
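The classification subunit's output — one semantic feature plus coordinate information per partial image region — can be sketched as follows (illustrative only; using a class name as the "semantic feature" and the region's pixel centroid as its "coordinate information" is an assumption of this sketch):

```python
def classify_regions(label_mask, class_names):
    """From a semantic-segmentation label mask (2-D list of class ids),
    build the classification result of the claim: for each partial image
    region, its semantic feature (here: class name) and its coordinate
    information (here: the pixel centroid of the region)."""
    sums = {}  # class id -> [pixel count, row sum, column sum]
    for r, row in enumerate(label_mask):
        for c, cls in enumerate(row):
            acc = sums.setdefault(cls, [0, 0, 0])
            acc[0] += 1
            acc[1] += r
            acc[2] += c
    return {class_names[cls]: (acc[1] / acc[0], acc[2] / acc[0])
            for cls, acc in sums.items()}
```

For a 2x2 mask whose top row is "road" and bottom row is "building", this yields centroids (0.0, 0.5) and (1.0, 0.5) respectively.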
11. The apparatus according to claim 10, wherein the preprocessing subunit is further configured to:
identify, from the at least one image, an object that remains static within a specified time period;
take an image region corresponding to the object as static information; and
take the static information as the preprocessing result.
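One way to realize "an object that remains static within a specified time period" is to track object centroids across frames and keep those that barely move. The sketch below is an assumption, not the patent's method; in particular the `max_drift` tolerance and the centroid representation are invented for illustration:

```python
def static_regions(frames, max_drift=2.0):
    """Identify objects that stay put over the observed time period.
    `frames` is a sequence of dicts mapping object id -> (x, y) centroid,
    one dict per frame.  An object counts as static when it appears in
    every frame and its centroid drifts less than `max_drift` pixels."""
    ids = set(frames[0])
    for f in frames[1:]:
        ids &= set(f)  # keep only objects present in every frame
    static = []
    for oid in ids:
        xs = [f[oid][0] for f in frames]
        ys = [f[oid][1] for f in frames]
        drift = ((max(xs) - min(xs)) ** 2 + (max(ys) - min(ys)) ** 2) ** 0.5
        if drift <= max_drift:
            static.append(oid)
    return sorted(static)
```

A traffic sign whose centroid wobbles by half a pixel would survive this filter, while a passing car would not.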
12. The apparatus according to claim 10, wherein the map generation unit further comprises:
an information acquisition subunit configured to obtain the semantic feature and the coordinate information;
a region description subunit configured to describe, in a map, a corresponding semantic block region according to the semantic feature and the coordinate information;
a view angle configuration subunit configured to configure, according to the coordinate information, an observation view angle for the semantic block region; and
a map generation subunit configured to obtain, according to the semantic feature, the coordinate information and the observation view angle, the positioning map composed of multiple semantic block regions.
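The assembly performed by the map generation subunit — one semantic block region per classified region, each carrying feature, coordinates, and a view angle derived from the coordinates — can be sketched as below. The dict layout and the `view_angle_of` callback are illustrative assumptions:

```python
def build_positioning_map(classification_result, view_angle_of):
    """Assemble the positioning map of claim 12: one semantic block region
    per classified partial image region, each carrying its semantic
    feature, its coordinate information, and an observation view angle
    configured from the coordinates."""
    return [{"feature": feature,
             "coord": coord,
             "view_angle": view_angle_of(coord)}
            for feature, coord in classification_result.items()]
```

A caller supplies the angle-configuration rule, e.g. `lambda c: "E" if c[0] >= 0 else "W"` to split regions east/west by their x coordinate.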
13. The apparatus according to claim 12, wherein the view angle configuration subunit is further configured to:
configure different observation view angles according to different positioning accuracies corresponding to different object observation directions in the panoramic data,
wherein the observation view angles comprise at least view angles in at least two of the east, south, west and north directions.
14. The apparatus according to claim 13, wherein the apparatus further comprises a direction division unit configured to:
divide, for the panoramic data, the observation direction in a horizontal direction; or
divide, for the panoramic data, the observation direction in a pitch direction.
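The two division modes of claim 14 amount to quantizing a viewing direction either by azimuth (into the cardinal sectors of claim 13) or by elevation. A minimal sketch, assuming 90-degree cardinal bins and a fixed pitch band size, neither of which the patent specifies:

```python
def horizontal_sector(azimuth_deg):
    """Divide the observation direction in the horizontal plane into the
    four cardinal sectors (illustrative 90-degree bins centered on
    north, east, south, west)."""
    a = azimuth_deg % 360
    return ["N", "E", "S", "W"][int((a + 45) % 360 // 90)]

def pitch_band(pitch_deg, band_size=30):
    """Divide the observation direction in the pitch direction into
    fixed-size elevation bands (band size is an assumed parameter)."""
    return int(pitch_deg // band_size)
```

So an azimuth of 200 degrees falls in the south sector, and a pitch of 45 degrees lands in band 1 of 30-degree bands.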
15. The apparatus according to any one of claims 9-14, wherein the positioning unit further comprises:
an image preprocessing subunit configured to perform, in the classification model, image preprocessing on at least one to-be-processed image according to a semantic segmentation strategy and retain static information in the at least one to-be-processed image;
an object positioning subunit configured to position, through the positioning map, the orientation of the target object according to a semantic feature, coordinate information and an observation view angle corresponding to the static information.
16. The apparatus according to claim 15, wherein the object positioning subunit is configured to:
perform image matching between the static information and semantic block regions in the positioning map to obtain at least one target semantic block region having image similarity with the static information, wherein the at least one target semantic block region corresponds to the same coordinate information;
when the at least one target semantic block region has an overlap of multiple observation view angles, obtain a positioning parameter according to the multi-view overlap region; and
position, according to the positioning parameter, the orientation of the target object.
17. A visual positioning apparatus, wherein the apparatus comprises:
one or more processors; and
a storage device configured to store one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 8.
18. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 8.
CN201910586511.9A 2019-07-01 2019-07-01 Visual positioning method, device and storage medium Active CN110298320B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910586511.9A CN110298320B (en) 2019-07-01 2019-07-01 Visual positioning method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910586511.9A CN110298320B (en) 2019-07-01 2019-07-01 Visual positioning method, device and storage medium

Publications (2)

Publication Number Publication Date
CN110298320A true CN110298320A (en) 2019-10-01
CN110298320B CN110298320B (en) 2021-06-22

Family

ID=68029799

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910586511.9A Active CN110298320B (en) 2019-07-01 2019-07-01 Visual positioning method, device and storage medium

Country Status (1)

Country Link
CN (1) CN110298320B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111758118A (en) * 2020-05-26 2020-10-09 蜂图科技有限公司 Visual positioning method, apparatus and device, and readable storage medium
CN113744352A (en) * 2021-09-14 2021-12-03 北京观海科技发展有限责任公司 Visual space calibration method, device and storage medium

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106997466A (en) * 2017-04-12 2017-08-01 百度在线网络技术(北京)有限公司 Method and apparatus for detecting a road
CN206460481U (en) * 2017-01-12 2017-09-01 刘曼 Multi-angle-based video system for recognizing abnormal behavior
CN107833236A (en) * 2017-10-31 2018-03-23 中国科学院电子学研究所 Visual positioning system and method combined with semantics in a dynamic environment
CN108596974A (en) * 2018-04-04 2018-09-28 清华大学 Robot localization and mapping system and method for dynamic scenes
WO2018213739A1 (en) * 2017-05-18 2018-11-22 TuSimple System and method for image localization based on semantic segmentation
CN109061703A (en) * 2018-06-11 2018-12-21 百度在线网络技术(北京)有限公司 Method, apparatus, device and computer-readable storage medium for positioning
CN109117718A (en) * 2018-07-02 2019-01-01 东南大学 Three-dimensional semantic map construction and storage method for road scenes
CN109186586A (en) * 2018-08-23 2019-01-11 北京理工大学 Simultaneous localization and hybrid map construction method for dynamic parking environments
CN109272554A (en) * 2018-09-18 2019-01-25 北京云迹科技有限公司 Method and system for coordinate-system positioning of recognized targets and semantic map construction
US20190043203A1 (en) * 2018-01-12 2019-02-07 Intel Corporation Method and system of recurrent semantic segmentation for image processing
CN109357679A (en) * 2018-11-16 2019-02-19 济南浪潮高新科技投资发展有限公司 Indoor positioning method based on salient feature recognition
CN109461211A (en) * 2018-11-12 2019-03-12 南京人工智能高等研究院有限公司 Vision-based point cloud semantic vector map construction method and apparatus, and electronic device
CN109584302A (en) * 2018-11-27 2019-04-05 北京旷视科技有限公司 Camera pose optimization method and apparatus, electronic device, and computer-readable medium
CN109637177A (en) * 2018-12-19 2019-04-16 斑马网络技术有限公司 Vehicle positioning method, apparatus, device and storage medium
US20190130573A1 (en) * 2017-10-30 2019-05-02 Rakuten, Inc. Skip architecture neural network machine and method for improved semantic segmentation
CN109724603A (en) * 2019-01-08 2019-05-07 北京航空航天大学 Indoor robot navigation method based on environmental feature detection
CN109920055A (en) * 2019-03-08 2019-06-21 视辰信息科技(上海)有限公司 Construction method and apparatus of a 3D visual map, and electronic device

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN206460481U (en) * 2017-01-12 2017-09-01 刘曼 Multi-angle-based video system for recognizing abnormal behavior
CN106997466A (en) * 2017-04-12 2017-08-01 百度在线网络技术(北京)有限公司 Method and apparatus for detecting a road
WO2018213739A1 (en) * 2017-05-18 2018-11-22 TuSimple System and method for image localization based on semantic segmentation
US20190130573A1 (en) * 2017-10-30 2019-05-02 Rakuten, Inc. Skip architecture neural network machine and method for improved semantic segmentation
CN107833236A (en) * 2017-10-31 2018-03-23 中国科学院电子学研究所 Visual positioning system and method combined with semantics in a dynamic environment
US20190043203A1 (en) * 2018-01-12 2019-02-07 Intel Corporation Method and system of recurrent semantic segmentation for image processing
CN108596974A (en) * 2018-04-04 2018-09-28 清华大学 Robot localization and mapping system and method for dynamic scenes
CN109061703A (en) * 2018-06-11 2018-12-21 百度在线网络技术(北京)有限公司 Method, apparatus, device and computer-readable storage medium for positioning
CN109117718A (en) * 2018-07-02 2019-01-01 东南大学 Three-dimensional semantic map construction and storage method for road scenes
CN109186586A (en) * 2018-08-23 2019-01-11 北京理工大学 Simultaneous localization and hybrid map construction method for dynamic parking environments
CN109272554A (en) * 2018-09-18 2019-01-25 北京云迹科技有限公司 Method and system for coordinate-system positioning of recognized targets and semantic map construction
CN109461211A (en) * 2018-11-12 2019-03-12 南京人工智能高等研究院有限公司 Vision-based point cloud semantic vector map construction method and apparatus, and electronic device
CN109357679A (en) * 2018-11-16 2019-02-19 济南浪潮高新科技投资发展有限公司 Indoor positioning method based on salient feature recognition
CN109584302A (en) * 2018-11-27 2019-04-05 北京旷视科技有限公司 Camera pose optimization method and apparatus, electronic device, and computer-readable medium
CN109637177A (en) * 2018-12-19 2019-04-16 斑马网络技术有限公司 Vehicle positioning method, apparatus, device and storage medium
CN109724603A (en) * 2019-01-08 2019-05-07 北京航空航天大学 Indoor robot navigation method based on environmental feature detection
CN109920055A (en) * 2019-03-08 2019-06-21 视辰信息科技(上海)有限公司 Construction method and apparatus of a 3D visual map, and electronic device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIU, ZHIJIE et al.: "Semantic simultaneous localization and mapping method based on convolutional neural networks", Science Technology and Engineering *
SONG, WEIGANG: "Visual positioning technology based on registration of street-view and aerial images", China Masters' Theses Full-text Database, Information Science and Technology *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111758118A (en) * 2020-05-26 2020-10-09 蜂图科技有限公司 Visual positioning method, apparatus and device, and readable storage medium
WO2021237443A1 (en) * 2020-05-26 2021-12-02 蜂图志科技控股有限公司 Visual positioning method and apparatus, device and readable storage medium
JP7446643B2 (en) 2020-05-26 2024-03-11 マプサス テクノロジー ホールディング リミテッド Visual positioning methods, devices, equipment and readable storage media
CN111758118B (en) * 2020-05-26 2024-04-16 蜂图志科技控股有限公司 Visual positioning method, device, equipment and readable storage medium
CN113744352A (en) * 2021-09-14 2021-12-03 北京观海科技发展有限责任公司 Visual space calibration method, device and storage medium
CN113744352B (en) * 2021-09-14 2022-07-29 北京观海科技发展有限责任公司 Visual space calibration method, device and storage medium

Also Published As

Publication number Publication date
CN110298320B (en) 2021-06-22

Similar Documents

Publication Publication Date Title
Merkle et al. Exploring the potential of conditional adversarial networks for optical and SAR image matching
EP2491529B1 (en) Providing a descriptor for at least one feature of an image
US20200401617A1 (en) Visual positioning system
CN102567449B (en) Vision system and method of analyzing an image
CN109978755A (en) Panoramic image synthesis method, device, equipment and storage medium
Li et al. Cross-layer attention network for small object detection in remote sensing imagery
Šegvić et al. A computer vision assisted geoinformation inventory for traffic infrastructure
Brejcha et al. State-of-the-art in visual geo-localization
CN109636854A Augmented reality three-dimensional tracking registration method based on LINE-MOD template matching
Houshiar et al. A study of projections for key point based registration of panoramic terrestrial 3D laser scan
Tran et al. On-device scalable image-based localization via prioritized cascade search and fast one-many ransac
CN106503248A Map generation method and map generation device
Vishal et al. Accurate localization by fusing images and GPS signals
CN109596121A Automatic target detection and spatial positioning method for a mobile station
CN110298320A Visual positioning method, device and storage medium
You et al. Joint 2-D–3-D traffic sign landmark data set for geo-localization using mobile laser scanning data
Gupta et al. Augmented reality system using lidar point cloud data for displaying dimensional information of objects on mobile phones
CN112215964A (en) Scene navigation method and device based on AR
Tschopp et al. Superquadric object representation for optimization-based semantic SLAM
Wang et al. iNavigation: an image based indoor navigation system
CN109034214A (en) Method and apparatus for generating label
CN111758118B (en) Visual positioning method, device, equipment and readable storage medium
Arth et al. Full 6dof pose estimation from geo-located images
Máttyus et al. Aerial image sequence geolocalization with road traffic as invariant feature
Pham et al. 3D motion matching algorithm using signature feature descriptor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant