CN113947751A - Multi-scale scene recognition device and method based on deep learning direction features

Multi-scale scene recognition device and method based on deep learning direction features

Info

Publication number
CN113947751A
Authority
CN
China
Prior art keywords
deep learning
map
test
node
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111162828.3A
Other languages
Chinese (zh)
Inventor
王相龙
刘春
严忠贞
叶志伟
叶方平
崔宇晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei University of Technology
Original Assignee
Hubei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei University of Technology
Priority to CN202111162828.3A
Publication of CN113947751A
Legal status: Pending

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/38 Electronic maps specially adapted for navigation; Updating thereof
    • G01C21/3804 Creation or updating of map data
    • G01C21/3833 Creation or updating of map data characterised by the source of data
    • G01C21/3841 Data obtained from two or more sources, e.g. probe vehicles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Remote Sensing (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Automation & Control Theory (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Navigation (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-scale scene recognition method based on deep learning direction features. A map acquisition vehicle and a test vehicle travel the same path, determining a node at fixed distance intervals and acquiring node information; the deep learning direction features of the forward-looking images are extracted and stored into the corresponding node information; the set of map nodes forms the map set, and the set of test nodes forms the test point set; map nodes are matched with test nodes using GPS information, screening out the map nodes within a certain distance of a given test node; this matching is performed for every test node, and all qualifying map nodes form the candidate map node set; scene recognition is achieved by matching the deep learning direction features of the candidate map node set against those of the test nodes. The device and method of the invention achieve scene recognition with low cost, high precision, strong robustness and high efficiency.

Description

Multi-scale scene recognition device and method based on deep learning direction features
Technical Field
The invention relates to the field of mapping, in particular to a multi-scale scene recognition device and method based on deep learning direction features.
Background
With the continuous development of robot technology, the demands on environment perception keep rising, and positioning is the foundation of a robot's environment-perception capability. Scene recognition is an important step in achieving localization, and because images contain a large amount of information, image-based robot localization techniques have received a great deal of attention. Current image-based robot positioning techniques characterize road scenes using manually extracted features. Road scenes, particularly outdoor road scenes, are susceptible to illumination and weather changes, and manually extracted features, limited by a fixed extraction scheme, cannot stably characterize road scenes in a changing environment.
Some achievements in scene recognition already exist. For example, granted patent CN106960591B (granted on November 30, 2018), entitled "A vehicle high-precision positioning device and method based on road fingerprints", discloses an intelligent vehicle positioning method based on road fingerprint features; it achieves scene recognition by extracting road scene features at different viewing angles, but relies on local ORB image features at those viewing angles. Published application CN111091099A (filed on December 20, 2019), entitled "Method for constructing a scene recognition model, and method and apparatus for scene recognition", discloses a scene recognition method and apparatus based on scene semantic features; that method requires training a scene recognition model.
Disclosure of Invention
The technical problem to be solved by the invention is to extract deep learning direction features with higher representation capability and stronger robustness.
The technical scheme adopted by the invention is as follows: a multi-scale scene recognition method based on deep learning direction features comprises the following steps:
s1, data acquisition:
a map acquisition vehicle, carrying an inertial navigation and RTK-GPS module and a first image sensor, and a test vehicle, carrying a consumer-grade GPS module and a second image sensor, travel the same path, determining a node at fixed distance intervals and acquiring node information; the nodes acquired by the map acquisition vehicle are called map nodes, and the map node information comprises the RTK-GPS information and the forward-looking image of the corresponding node; the nodes acquired by the test vehicle are called test nodes, and the test node information comprises the GPS information and the forward-looking image of the corresponding node;
s2, constructing a map set and a test point set:
the deep learning direction features of the forward-looking images are respectively extracted and stored into the corresponding node information; the set of map nodes forms the map set; the set of test nodes forms the test point set;
s3, matching of the map node information and the test node information:
the map nodes are matched with the test nodes using GPS information, screening out the map nodes within a certain distance of a given test node;
this matching is performed for each test node, and all qualifying map nodes form the candidate map node set;
s4, scene recognition based on deep learning direction features:
scene recognition is achieved by matching the candidate map node set with the deep learning direction characteristics of the test nodes;
the deep learning direction feature of the forward-looking image is extracted as follows: the forward-looking image is loaded into a deep learning network, the parameters of a specific layer are extracted and vectorized, and the vectorized specific-layer parameters are mapped into the direction feature model to obtain the deep learning direction feature.
According to the method, the deep learning direction features of the forward-looking image are obtained as follows:
(1) direction feature model construction
The forward-looking image is loaded into a pre-trained deep learning network, the specific-layer parameter p of the network is extracted, and p is vectorized to obtain the vectorized specific-layer parameter f:
f = flatten(p), size(f) = 1 × n
where n is the number of neurons in the deep learning network;
and (3) loading the forward-looking images into a deep learning network one by one, and forming a matrix F:
F=(f1 f2 … fK)T,size(F)=K×n
wherein K is the number of the map set images;
The covariance matrix of F is then decomposed to obtain the eigenvalue vector λ and the eigenvector matrix V:
D = (1/K) F^T F
DV = λV
λ = (λ_1 λ_2 … λ_n)
V = (v_1 v_2 … v_n)^T
where D is the symmetric matrix of the matrix F, K is the number of map set images, V is the eigenvector matrix, λ_i is the i-th eigenvalue of the decomposition, and v_i is the i-th eigenvector of the decomposition;
The entries of the eigenvalue vector λ are arranged from large to small by eigenvalue magnitude, extracting the main features of the matrix F and yielding the direction feature model V′:
λ′ = (λ′_1 λ′_2 … λ′_n)
V′ = (v′_1 v′_2 … v′_n)^T
where λ′_1 … λ′_n are the eigenvalues sorted from large to small and v′_1 … v′_n the corresponding eigenvectors;
When the cumulative information of the first k* directions reaches a fraction β of the cumulative information of all directions, the first k* directions are taken as the principal directions:
(λ′_1 + λ′_2 + … + λ′_{k*}) / (λ′_1 + λ′_2 + … + λ′_n) ≥ β
The eigenvectors corresponding to the principal directions form the matrix V′_main, the principal-direction subspace of V:
V′_main = (v′_1 v′_2 … v′_{k*})^T
(2) Deep learning direction feature extraction
The forward-looking image is loaded into the deep learning network and the parameters of the specific layer are extracted and vectorized to obtain Q; the feature Q_conv of the first k* directions is obtained according to step (1), and Q_conv is mapped into the matrix V′_main to obtain the deep learning direction feature p:
p = Q_conv × V′_main
According to the method, the certain distance in S3 is a Euclidean distance or a cosine distance.
According to the method, the certain distance has a certain threshold, and the threshold H is specifically as follows:
Figure BDA0003290842820000033
in the formula, SGPSMaximum error for GPS positioning, dinterThe stand-off distance of adjacent nodes is located for GPS.
According to the method, the S4 specifically comprises the following steps:
and finding the candidate map node closest to the test node by calculating the distance between the candidate map node set and the deep learning direction characteristic of the test node, so as to realize scene recognition of the test node.
According to the method, the distance between the candidate map node set and the deep learning direction feature of a certain test node is the Euclidean distance or the cosine distance.
A device for realizing the deep learning direction feature-based multi-scale scene recognition method comprises a map acquisition vehicle and a test vehicle;
the map acquisition vehicle comprises a mobile platform; an inertial navigation and RTK-GPS module and a first image sensor for shooting forward-looking images are fixed at the front end of the mobile platform; a first data transmission module, a first signal conversion module, a first data processing module and a power supply module are fixed inside the mobile platform by a rigid fixing module; the outputs of the inertial navigation and RTK-GPS module and of the first image sensor are connected through the first data transmission module to the first signal conversion module and then to the first data processing module; the power supply module supplies power;
the test vehicle has the same structure as the map acquisition vehicle, except that the inertial navigation and RTK-GPS module is replaced by the consumer-grade GPS module;
and multi-scale scene recognition is carried out using the first data processing module in the map acquisition vehicle and the second data processing module in the test vehicle.
According to the device, the mobile platform is also provided with a display unit.
The invention has the following beneficial effects: a map is first built using inertial navigation, an RTK-GPS and an image sensor; in the scene recognition stage, road information is collected with a consumer-grade GPS and an image sensor, and deep learning direction features are extracted for matching, realizing multi-scale scene recognition. Compared with traditional methods, the scene recognition device and method based on deep learning direction features achieve scene recognition with low cost, high precision, strong robustness and high efficiency.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a diagram of a mapping and testing apparatus.
Fig. 2 is a flowchart of a deep learning direction feature construction method.
FIG. 3 is a schematic view of a nodal map set.
Fig. 4 is a flow chart of a multi-scale scene recognition method.
In the figure: 1-a map acquisition vehicle, 1-1-a first image sensor, 1-2-an inertial navigation and RTK-GPS module, 1-3-a first data transmission module, 1-4-a first signal conversion module, 1-5-a first power supply module, 1-6-a first rigid fixing module, 1-7-a first data processing and displaying module and 1-8-a first mobile platform;
2-a test vehicle, 2-1-a second image sensor, 2-2-a GPS module, 2-3-a second data transmission module, 2-4-a second signal conversion module, 2-5-a second power supply module, 2-6-a second rigid fixing module, 2-7-a second data processing and display module and 2-8-a second mobile platform.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, the invention provides a device for implementing the multi-scale scene recognition method based on deep learning direction features, comprising a map acquisition vehicle 1 and a test vehicle 2. The map acquisition vehicle 1 comprises a first mobile platform 1-8; an inertial navigation and RTK-GPS module 1-2 and a first image sensor 1-1 for shooting forward-looking images are fixed at the front end of the first mobile platform 1-8; a first data transmission module 1-3, a first signal conversion module 1-4, a first data processing and display module 1-7 and a first power supply module 1-5 are fixed inside the first mobile platform 1-8 by a first rigid fixing module 1-6; the outputs of the inertial navigation and RTK-GPS module 1-2 and the first image sensor 1-1 are connected through the first data transmission module 1-3 to the first signal conversion module 1-4 and then to the first data processing and display module 1-7; the first power supply module 1-5 supplies power.
The test vehicle has the same structure as the map acquisition vehicle, except that the inertial navigation and RTK-GPS module is replaced by the consumer-grade GPS module. The test vehicle 2 comprises a second mobile platform 2-8; a GPS module 2-2 and a second image sensor 2-1 for shooting forward-looking images are fixed at the front end of the second mobile platform 2-8; a second data transmission module 2-3, a second signal conversion module 2-4, a second data processing and display module 2-7 and a second power supply module 2-5 are fixed inside the second mobile platform 2-8 by a second rigid fixing module 2-6; the outputs of the GPS module 2-2 and the second image sensor 2-1 are connected through the second data transmission module 2-3 to the second signal conversion module 2-4 and then to the second data processing and display module 2-7; the second power supply module 2-5 supplies power.
Multi-scale scene recognition is carried out using the first data processing module in the map acquisition vehicle 1 and the second data processing module in the test vehicle 2.
The invention also provides a multi-scale scene recognition method based on the deep learning direction characteristics, as shown in fig. 4, the method comprises the following steps:
s1, data acquisition:
a map acquisition vehicle, carrying an inertial navigation and RTK-GPS module and a first image sensor, and a test vehicle, carrying a consumer-grade GPS module and a second image sensor, travel the same path, determining a node at fixed distance intervals and acquiring node information; the nodes acquired by the map acquisition vehicle are called map nodes, and, as shown in fig. 3, the map node information comprises the RTK-GPS information and the forward-looking image of the corresponding node; the nodes acquired by the test vehicle are called test nodes, and the test node information comprises the GPS information and the forward-looking image of the corresponding node.
S2, constructing the map set and the test point set: the deep learning direction features of the forward-looking images are respectively extracted and stored into the corresponding node information; the set of map nodes forms the map set; the set of test nodes forms the test point set.
The deep learning direction feature of the forward-looking image is extracted as follows: the forward-looking image is loaded into a deep learning network, the parameters of a specific layer are extracted and vectorized, and the vectorized specific-layer parameters are mapped into the direction feature model to obtain the deep learning direction feature.
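For illustration, the following is a minimal sketch of this extraction step in PyTorch; the framework, the VGG-16 backbone and the choice of layer features[28] are all assumptions, since the patent names only "a pre-trained deep learning network" and "a specific layer".

```python
import torch
from torchvision import models, transforms

# Pre-trained network (VGG-16 is an assumed stand-in for the patent's
# unspecified "pre-trained deep learning network").
net = models.vgg16(weights=models.VGG16_Weights.DEFAULT).eval()

captured = {}
def hook(module, inputs, output):
    captured["p"] = output.detach()          # specific-layer parameter p

# Hook an assumed deep convolutional layer.
net.features[28].register_forward_hook(hook)

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def vectorised_layer_response(image):
    """f = flatten(p), size(f) = 1 x n, for one forward-looking image (PIL)."""
    with torch.no_grad():
        net(preprocess(image).unsqueeze(0))
    return captured["p"].flatten().unsqueeze(0)      # shape (1, n)
```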
The deep learning direction features of the forward-looking image are obtained as follows:
(1) direction feature model construction
The forward-looking image is loaded into a pre-trained deep learning network, the specific-layer parameter p of the network is extracted, and p is vectorized to obtain the vectorized specific-layer parameter f:
f = flatten(p), size(f) = 1 × n
where n is the number of neurons in the deep learning network;
The forward-looking images are loaded into the deep learning network one by one, forming a matrix F:
F = (f_1 f_2 … f_K)^T, size(F) = K × n
where K is the number of map set images;
The covariance matrix of F is then decomposed to obtain the eigenvalue vector λ and the eigenvector matrix V:
D = (1/K) F^T F
DV = λV
λ = (λ_1 λ_2 … λ_n)
V = (v_1 v_2 … v_n)^T
where D is the symmetric matrix of the matrix F, K is the number of map set images, V is the eigenvector matrix, λ_i is the i-th eigenvalue of the decomposition, and v_i is the i-th eigenvector of the decomposition;
The entries of the eigenvalue vector λ are arranged from large to small by eigenvalue magnitude, extracting the main features of the matrix F and yielding the direction feature model V′:
λ′ = (λ′_1 λ′_2 … λ′_n)
V′ = (v′_1 v′_2 … v′_n)^T
where λ′_1 … λ′_n are the eigenvalues sorted from large to small and v′_1 … v′_n the corresponding eigenvectors;
When the cumulative information of the first k* directions reaches a fraction β of the cumulative information of all directions, the first k* directions are taken as the principal directions:
(λ′_1 + λ′_2 + … + λ′_{k*}) / (λ′_1 + λ′_2 + … + λ′_n) ≥ β
The eigenvectors corresponding to the principal directions form the matrix V′_main, the principal-direction subspace of V:
V′_main = (v′_1 v′_2 … v′_{k*})^T
(2) Deep learning direction feature extraction
The forward-looking image is loaded into the deep learning network and the parameters of the specific layer are extracted and vectorized to obtain Q; the feature Q_conv of the first k* directions is obtained according to step (1), and Q_conv is mapped into the matrix V′_main to obtain the deep learning direction feature p:
p = Q_conv × V′_main
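As a concrete illustration of steps (1) and (2), a NumPy sketch of the direction feature model and the projection follows; the 1/K covariance scaling and the value β = 0.9 are assumptions, since the patent specifies neither.

```python
import numpy as np

def build_direction_model(F, beta=0.9):
    """Direction feature model V'_main from the K x n matrix F of
    vectorised layer responses; keeps the first k* principal directions
    whose eigenvalues sum to a fraction beta of the total."""
    K, n = F.shape
    D = F.T @ F / K                        # symmetric matrix of F (assumed scaling)
    lam, V = np.linalg.eigh(D)             # solves D v_i = lambda_i v_i
    order = np.argsort(lam)[::-1]          # eigenvalues from large to small
    lam, V = lam[order], V[:, order]
    k_star = int(np.searchsorted(np.cumsum(lam) / np.sum(lam), beta)) + 1
    return V[:, :k_star]                   # principal-direction subspace

def direction_feature(Q, V_main):
    """p = Q_conv x V'_main: project a vectorised response Q (1 x n)
    onto the principal directions, giving a 1 x k* direction feature."""
    return Q @ V_main
```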
The deep learning direction feature p of the forward-looking image is extracted, and the node information d_i is then stored:
D = [d_1 d_2 … d_n]
d_i = [p_i g_i]
where d_i is the i-th node and p_i, g_i are respectively the deep learning direction feature and the RTK-GPS information of the i-th node. The other nodes are constructed in the same way, finally yielding the map set.
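A small sketch of the node record d_i = [p_i g_i] and of map-set assembly, reusing direction_feature from the sketch above; the container and argument names are illustrative:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Node:
    p: np.ndarray    # deep learning direction feature p_i
    g: tuple         # positioning fix g_i: RTK-GPS (map node) or consumer GPS (test node)

def build_map_set(layer_vectors, rtk_fixes, V_main):
    # D = [d_1 d_2 ... d_n]: one node per equidistantly sampled point.
    return [Node(p=direction_feature(Q, V_main), g=g)
            for Q, g in zip(layer_vectors, rtk_fixes)]
```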
S3, matching the map node information with the test node information: the map nodes are matched with the test nodes using GPS information, screening out the map nodes within a certain distance of a given test node; this matching is performed for each test node, and all qualifying map nodes form the candidate map node set.
First, the GPS information of the test node is matched against that of the map nodes, screening out the map nodes close to the test node:
M_c = { m_i | dis(g_i, G) ≤ H }
where g_i is the RTK-GPS information of the i-th map node m_i and G is the GPS information of the test node. The distance threshold H is related to the maximum GPS positioning error and the spacing of adjacent nodes:
[formula given as an image in the original: H expressed in terms of S_GPS and d_inter]
where H is the distance threshold for candidate map nodes, S_GPS is the maximum error of GPS positioning, and d_inter is the spacing of adjacent GPS-located nodes. After GPS screening, the candidate map node set M_c is obtained:
M_c = (m_1 m_2 … m_N)
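A sketch of this GPS screening step; the equirectangular distance approximation is an assumption, as the patent only requires a distance dis and a threshold H:

```python
import math

def gps_distance(g1, g2):
    """Approximate ground distance in metres between two (lat, lon) fixes."""
    R = 6371000.0                                       # mean Earth radius, metres
    dlat = math.radians(g2[0] - g1[0])
    dlon = math.radians(g2[1] - g1[1])
    x = dlon * math.cos(math.radians((g1[0] + g2[0]) / 2.0))
    return R * math.hypot(dlat, x)

def candidate_map_nodes(G, map_set, H):
    """M_c = { m_i | dis(g_i, G) <= H }: map nodes within H of the test fix G."""
    return [m for m in map_set if gps_distance(m.g, G) <= H]
```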
S4, scene recognition based on deep learning direction features: scene recognition is achieved by matching the candidate map node set with the deep learning direction characteristics of the test nodes.
Scene recognition is achieved by matching the deep learning direction features of the candidate map node set M_c and of the test node M_q. Specifically, by computing the distance dis (Euclidean distance, cosine distance, correlation coefficient, etc.) between the deep learning direction features of the candidate map node set M_c and of the test node M_q, the candidate map node closest to the test node is found, realizing scene recognition:
m* = argmin_{m_i ∈ M_c} dis(p_q, p_i)
where p_q is the deep learning direction feature of the test node and p_i that of candidate map node m_i.
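A sketch of this matching step using the Euclidean distance (cosine distance or a correlation coefficient would slot in the same way):

```python
import numpy as np

def recognise_scene(p_q, candidates):
    """Return the candidate map node m* whose direction feature is
    closest to the test node's feature p_q (the argmin over dis)."""
    distances = [np.linalg.norm(m.p - p_q) for m in candidates]
    return candidates[int(np.argmin(distances))]
```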
The method first builds a map using an RTK-GPS, inertial navigation and an image sensor; in the scene recognition stage, road information is collected with a consumer-grade GPS and an image sensor, and deep learning direction features are extracted for matching, realizing multi-scale scene recognition. Compared with traditional methods, the scene recognition device and method based on deep learning direction features achieve scene recognition with low cost, high precision, strong robustness and high efficiency. The method requires only one image sensor; the extracted deep learning direction features are highly robust; scene recognition is achieved using only a pre-trained deep learning network; and the method suits environments with large and frequent changes and with rapid-deployment requirements.
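Tying the sketches above together, an assumed end-to-end flow might read as follows; map_images, rtk_fixes, test_image, test_gps_fix and the value H = 10.0 m are all placeholders:

```python
import numpy as np

# Offline: map acquisition vehicle drive.
F = np.vstack([vectorised_layer_response(img).numpy() for img in map_images])
V_main = build_direction_model(F, beta=0.9)
map_set = build_map_set(F, rtk_fixes, V_main)

# Online: one node collected by the test vehicle.
p_q = direction_feature(vectorised_layer_response(test_image).numpy(), V_main)
M_c = candidate_map_nodes(test_gps_fix, map_set, H=10.0)
match = recognise_scene(p_q, M_c)
```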
The innovations of the invention are: 1. A forward-looking image characterization method based on deep learning direction features, providing a matching basis for scene recognition. 2. A node map construction method that combines deep learning direction features with RTK-GPS information. 3. A multi-scale scene recognition method based on deep learning direction features. 4. A corresponding set of data acquisition and scene recognition devices.
It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.

Claims (8)

1. A multi-scale scene recognition method based on deep learning direction features is characterized by comprising the following steps:
s1, data acquisition:
a map acquisition vehicle, carrying an inertial navigation and RTK-GPS module and a first image sensor, and a test vehicle, carrying a consumer-grade GPS module and a second image sensor, travel the same path, determining a node at fixed distance intervals and acquiring node information; the nodes acquired by the map acquisition vehicle are called map nodes, and the map node information comprises the RTK-GPS information and the forward-looking image of the corresponding node; the nodes acquired by the test vehicle are called test nodes, and the test node information comprises the GPS information and the forward-looking image of the corresponding node;
s2, constructing a map set and a test point set:
the deep learning direction features of the forward-looking images are respectively extracted and stored into the corresponding node information; the set of map nodes forms the map set; the set of test nodes forms the test point set;
s3, matching of the map node information and the test node information:
the map nodes are matched with the test nodes using GPS information, screening out the map nodes within a certain distance of a given test node;
this matching is performed for each test node, and all qualifying map nodes form the candidate map node set;
s4, scene recognition based on deep learning direction features:
scene recognition is achieved by matching the candidate map node set with the deep learning direction characteristics of the test nodes;
the deep learning direction feature of the forward-looking image is extracted as follows: the forward-looking image is loaded into a deep learning network, the parameters of a specific layer are extracted and vectorized, and the vectorized specific-layer parameters are mapped into the direction feature model to obtain the deep learning direction feature.
2. The method for multi-scale scene recognition based on deep learning direction features according to claim 1, wherein the deep learning direction features of the forward-looking image are obtained as follows:
(1) direction feature model construction
The forward-looking image is loaded into a pre-trained deep learning network, the specific-layer parameter p of the network is extracted, and p is vectorized to obtain the vectorized specific-layer parameter f:
f = flatten(p), size(f) = 1 × n
where n is the number of neurons in the deep learning network;
The forward-looking images are loaded into the deep learning network one by one, forming a matrix F:
F = (f_1 f_2 … f_K)^T, size(F) = K × n
where K is the number of map set images;
The covariance matrix of F is then decomposed to obtain the eigenvalue vector λ and the eigenvector matrix V:
D = (1/K) F^T F
DV = λV
λ = (λ_1 λ_2 … λ_n)
V = (v_1 v_2 … v_n)^T
where D is the symmetric matrix of the matrix F, K is the number of map set images, V is the eigenvector matrix, λ_i is the i-th eigenvalue of the decomposition, and v_i is the i-th eigenvector of the decomposition;
The entries of the eigenvalue vector λ are arranged from large to small by eigenvalue magnitude, extracting the main features of the matrix F and yielding the direction feature model V′:
λ′ = (λ′_1 λ′_2 … λ′_n)
V′ = (v′_1 v′_2 … v′_n)^T
where λ′_1 … λ′_n are the eigenvalues sorted from large to small and v′_1 … v′_n the corresponding eigenvectors;
When the cumulative information of the first k* directions reaches a fraction β of the cumulative information of all directions, the first k* directions are taken as the principal directions:
(λ′_1 + λ′_2 + … + λ′_{k*}) / (λ′_1 + λ′_2 + … + λ′_n) ≥ β
The eigenvectors corresponding to the principal directions form the matrix V′_main, the principal-direction subspace of V:
V′_main = (v′_1 v′_2 … v′_{k*})^T
(2) Deep learning direction feature extraction
The forward-looking image is loaded into the deep learning network and the parameters of the specific layer are extracted and vectorized to obtain Q; the feature Q_conv of the first k* directions is obtained according to step (1), and Q_conv is mapped into the matrix V′_main to obtain the deep learning direction feature p:
p = Q_conv × V′_main
3. The method for multi-scale scene recognition based on deep learning direction features according to claim 1, wherein the certain distance in S3 is a Euclidean distance or a cosine distance.
4. The method for multi-scale scene recognition based on deep learning direction features according to claim 3, wherein the certain distance has a threshold H, specified as:
[formula given as an image in the original: H expressed in terms of S_GPS and d_inter]
where S_GPS is the maximum error of GPS positioning and d_inter is the spacing of adjacent GPS-located nodes.
5. The method for multi-scale scene recognition based on deep learning direction features according to claim 1, wherein S4 specifically comprises:
finding the candidate map node closest to the test node by computing the distance between the deep learning direction features of the candidate map node set and of the test node, thereby achieving scene recognition of the test node.
6. The method of claim 5, wherein the distance between the deep learning direction features of the candidate map node set and of a test node is a Euclidean distance or a cosine distance.
7. An apparatus for implementing the deep learning direction feature-based multi-scale scene recognition method according to any one of claims 1 to 6, wherein: the device comprises a map collecting vehicle and a testing vehicle;
the map acquisition vehicle comprises a mobile platform; an inertial navigation and RTK-GPS module and a first image sensor for shooting forward-looking images are fixed at the front end of the mobile platform; a first data transmission module, a first signal conversion module, a first data processing module and a power supply module are fixed inside the mobile platform by a rigid fixing module; the outputs of the inertial navigation and RTK-GPS module and of the first image sensor are connected through the first data transmission module to the first signal conversion module and then to the first data processing module; the power supply module supplies power;
the test vehicle has the same structure as the map acquisition vehicle, except that the inertial navigation and RTK-GPS module is replaced by the consumer-grade GPS module;
and multi-scale scene recognition is carried out using the first data processing module in the map acquisition vehicle and the second data processing module in the test vehicle.
8. The apparatus of claim 7, wherein the mobile platform further comprises a display unit.
CN202111162828.3A 2021-09-30 2021-09-30 Multi-scale scene recognition device and method based on deep learning direction features Pending CN113947751A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111162828.3A CN113947751A (en) 2021-09-30 2021-09-30 Multi-scale scene recognition device and method based on deep learning direction features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111162828.3A CN113947751A (en) 2021-09-30 2021-09-30 Multi-scale scene recognition device and method based on deep learning direction features

Publications (1)

Publication Number Publication Date
CN113947751A (en) 2022-01-18

Family

ID=79329707

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111162828.3A Pending CN113947751A (en) 2021-09-30 2021-09-30 Multi-scale scene recognition device and method based on deep learning direction features

Country Status (1)

Country Link
CN (1) CN113947751A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114900256A (en) * 2022-05-20 2022-08-12 中国电信股份有限公司 Communication scene recognition method and device
CN114900256B (en) * 2022-05-20 2024-03-01 中国电信股份有限公司 Communication scene recognition method and device


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination