CN116883966A - Vehicle-mounted camera pose calculation method and device and computer readable storage medium - Google Patents

Vehicle-mounted camera pose calculation method and device and computer readable storage medium

Info

Publication number
CN116883966A
CN116883966A
Authority
CN
China
Prior art keywords
vehicle
actual
mounted camera
key
matched
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310904859.4A
Other languages
Chinese (zh)
Inventor
童文超 (Tong Wenchao)
罗小平 (Luo Xiaoping)
曾峰 (Zeng Feng)
张轩宇 (Zhang Xuanyu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Longhorn Automotive Electronic Equipment Co Ltd
Original Assignee
Shenzhen Longhorn Automotive Electronic Equipment Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Longhorn Automotive Electronic Equipment Co Ltd filed Critical Shenzhen Longhorn Automotive Electronic Equipment Co Ltd
Priority application: CN202310904859.4A
Publication: CN116883966A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/42 Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

An embodiment of the application provides a vehicle-mounted camera pose calculation method and device and a computer-readable storage medium. The method comprises: obtaining a current image frame and a global descriptor of the current image frame; screening a pre-stored scene database to select a reference scene image matched with the global descriptor; up-sampling the current image frame with each feature extraction layer of a feature extraction network model to correspondingly obtain features to be fused, adding and fusing the features to be fused, and processing the result with a nonlinear activation function to obtain an actual dense high-dimensional feature map of the current image frame; determining a region of interest in the reference scene image, and calculating matching feature points in the actual dense high-dimensional feature map that correspond one-to-one with the key feature points in the region of interest; and calculating a current rotation-translation matrix of the vehicle-mounted camera based on a local odometry map construction principle and a P3P pose estimation algorithm. The embodiment can effectively improve calculation precision.

Description

Vehicle-mounted camera pose calculation method and device and computer readable storage medium
Technical Field
The embodiments of the application relate to the technical field of vehicle-mounted camera image processing, and in particular to a vehicle-mounted camera pose calculation method, a vehicle-mounted camera pose calculation device, and a computer-readable storage medium.
Background
At present, visual positioning (also called pose estimation or pose calculation) of a vehicle-mounted camera based on image feature matching is a common processing approach. Image feature matching is the key step: it effectively solves the data association problem in visual positioning, and its accuracy largely determines the quality of the visual positioning result.
The existing image feature matching methods are mainly the Scale-Invariant Feature Transform (SIFT) algorithm and the ORB (Oriented FAST and Rotated BRIEF) algorithm, whose main steps comprise feature extraction, feature description and feature matching. However, the inventors have found that these methods exhibit several drawbacks in practice. First, the reconstruction results obtained by such feature-point-based methods are sparse, and it is difficult to detect a sufficient number of feature points in real environments. Second, when applied to pose calculation for a vehicle-mounted camera, the calculation accuracy is relatively poor, which hinders image-based visual localization of obstacles.
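For context, the conventional sparse pipeline can be sketched in a few lines of OpenCV. This is an illustrative baseline only; the file names and parameter values are placeholders rather than settings taken from any particular prior-art system.

```python
import cv2

# Classic sparse matching: ORB feature extraction, description and
# brute-force matching between two frames (placeholder file names).
img_prev = cv2.imread("frame_prev.png", cv2.IMREAD_GRAYSCALE)
img_curr = cv2.imread("frame_curr.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img_prev, None)
kp2, des2 = orb.detectAndCompute(img_curr, None)

# Binary ORB descriptors are compared with the Hamming norm; crossCheck
# keeps only mutually best matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# In weakly textured road scenes the surviving match set is often too
# sparse for reliable pose estimation, which is the drawback noted above.
print(f"{len(matches)} tentative ORB matches")
```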
Disclosure of Invention
The technical problem to be solved by the embodiments of the application is to provide a vehicle-mounted camera pose calculation method that can effectively improve calculation precision.
A further technical problem to be solved by the embodiments of the application is to provide a vehicle-mounted camera pose calculation device that can effectively improve calculation precision.
A further technical problem to be solved by the embodiments of the application is to provide a computer-readable storage medium storing a computer program that can effectively improve the pose calculation precision of a vehicle-mounted camera.
To solve the above technical problems, the embodiments of the application provide the following technical scheme: a vehicle-mounted camera pose calculation method, comprising the following steps:
extracting a current image frame from an original image of the motor vehicle's surroundings, acquired and transmitted in real time by a vehicle-mounted camera, and sequentially processing the current image frame through a pre-stored feature extraction network model and a NetVLAD network model to obtain a global descriptor of the current image frame, wherein the feature extraction network model is formed by sequentially connecting a plurality of feature extraction layers;
screening a reference scene image matched with the global descriptor from a pre-stored scene database, wherein the scene database is pre-constructed scene map data of the motor vehicle's actual driving environment;
up-sampling the current image frame with each feature extraction layer of the feature extraction network model to correspondingly obtain features to be fused, adding and fusing the features to be fused, and then processing with a nonlinear activation function to obtain an actual dense high-dimensional feature map of the current image frame;
determining a region of interest in the reference scene image, and calculating matching feature points in the actual dense high-dimensional feature map that correspond one-to-one with the key feature points in the region of interest, wherein each key feature point's actual dot product with its correspondingly matched feature point is larger than its actual dot product with every other feature point of the actual dense high-dimensional feature map, and its actual distance to the correspondingly matched feature point is smaller than its actual distance to every other feature point of the actual dense high-dimensional feature map; and
calculating the current rotation-translation matrix of the vehicle-mounted camera from the mutually matched key feature points and matching feature points, based on a local odometry map construction principle and a P3P pose estimation algorithm.
Further, screening the reference scene image matched with the global descriptor from the pre-stored scene database specifically comprises:
calculating the actual Hamming distance between the global descriptor and the VLAD vector of each key frame in the scene database, and screening, based on a decision tree algorithm model, key frames meeting a preset preliminary screening condition from the scene database, wherein the preset preliminary screening condition is that the key frame's rank, when the key frames are sorted by actual Hamming distance from smallest (most similar) to largest, is smaller than or equal to a preset rank; and
calculating the actual similarity between the current image frame and each image frame adjacent to the key frames meeting the preset preliminary screening condition, and determining the key frame with the maximum actual similarity among those meeting the condition as the reference scene image.
Further, the global descriptor is first subjected to dimension reduction based on a principal component analysis algorithm, and the actual Hamming distance is then calculated between the dimension-reduced global descriptor and the VLAD vector of each key frame in the scene database.
Further, calculating the current rotation-translation matrix of the vehicle-mounted camera from the mutually matched key feature points and matching feature points, based on the local odometry map construction principle and the P3P pose estimation algorithm, specifically comprises:
constructing an actual three-dimensional space of the driving scene of the motor vehicle based on the local odometry map construction principle;
projecting the mutually matched key feature points and matching feature points into the actual three-dimensional space to correspondingly generate key three-dimensional points and matching three-dimensional points, respectively; and
calculating the current rotation-translation matrix of the vehicle-mounted camera from three non-collinear, mutually matched pairs of key three-dimensional points and matching three-dimensional points, based on the P3P pose estimation algorithm.
Further, the method further comprises:
calculating, based on a random sampling algorithm model, the actual projection error of the rotation-translation matrix over the remaining mutually matched key three-dimensional points and matching three-dimensional points; and
calculating the minimum value of the actual projection error based on an error optimization algorithm model, and recalculating and updating the rotation-translation matrix from the mutually matched key three-dimensional points and matching three-dimensional points that satisfy the minimum value.
Further, the region of interest is determined by judging whether the actual number of corner points in each region of the reference scene image is greater than a predetermined number.
On the other hand, in order to solve the above technical problems, the embodiment of the present application provides the following technical solutions: a vehicle-mounted camera pose calculation device connected with a vehicle-mounted camera of a motor vehicle, comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the vehicle-mounted camera pose calculation method according to any one of the above when executing the computer program.
On the other hand, in order to solve the above technical problems, the embodiment of the present application provides the following technical solutions: a computer readable storage medium comprising a stored computer program, wherein the computer program, when run, controls a device in which the computer readable storage medium is located to perform the method for calculating the pose of the vehicle-mounted camera according to any one of the above.
After the above technical scheme is adopted, the embodiments of the application have at least the following beneficial effects. First, the current image frame acquired by the vehicle-mounted camera is processed through the feature extraction network model and the NetVLAD network model to obtain the global descriptor, and the reference scene image matched with the global descriptor is determined by screening the scene database; since the scene database is pre-constructed scene map data of the motor vehicle's actual driving environment, the global descriptor is matched to the actual driving environment of the motor vehicle as closely as possible. Second, each feature extraction layer of the feature extraction network model up-samples the current image frame to correspondingly obtain the features to be fused, which are added, fused and then processed with a nonlinear activation function to obtain the actual dense high-dimensional feature map; this map not only fuses features of different scales, but the nonlinear activation also improves the nonlinear description of key points by the high-dimensional features, improving the generalization capability of the actual dense high-dimensional feature map. Third, the matching feature points in the actual dense high-dimensional feature map that correspond one-to-one with the key feature points in the region of interest are calculated, with screening and judgment by dot product and actual distance, so that the cross-correlation between each key feature point and the feature points of the actual dense high-dimensional feature map is determined and the matching feature point is screened out on the maximum-correlation principle. Finally, the current rotation-translation matrix of the vehicle-mounted camera is calculated from the mutually matched key feature points and matching feature points based on the local odometry map construction principle and the P3P pose estimation algorithm, realizing pose calculation of the vehicle-mounted camera with high calculation precision.
Drawings
Fig. 1 is a flowchart illustrating steps of an alternative embodiment of the method for calculating the pose of a vehicle-mounted camera according to the present application.
Fig. 2 is a specific flowchart of step S2 of an alternative embodiment of the method for calculating the pose of the vehicle-mounted camera according to the present application.
Fig. 3 is a flowchart showing an alternative embodiment of the method for calculating the pose of the vehicle-mounted camera according to the present application in step S5.
Fig. 4 is a schematic block diagram of an alternative embodiment of the vehicle-mounted camera pose calculation device of the present application.
Fig. 5 is a functional block diagram of an alternative embodiment of the vehicle-mounted camera pose calculation device of the present application.
Detailed Description
The application will be described in further detail with reference to the drawings and the specific examples. It should be understood that the following exemplary embodiments and descriptions are only for the purpose of illustrating the application and are not to be construed as limiting the application, and that the embodiments and features of the embodiments of the application may be combined with one another without conflict.
As shown in fig. 1, an alternative embodiment of the present application provides a vehicle-mounted camera pose calculating method, which includes the following steps:
S1: extracting a current image frame from an original image of the motor vehicle's surroundings, acquired and transmitted in real time by the vehicle-mounted camera 1, and sequentially processing the current image frame through a pre-stored feature extraction network model and a NetVLAD network model to obtain a global descriptor of the current image frame, wherein the feature extraction network model is formed by sequentially connecting a plurality of feature extraction layers;
S2: screening a reference scene image matched with the global descriptor from a pre-stored scene database, wherein the scene database is pre-constructed scene map data of the motor vehicle's actual driving environment;
S3: up-sampling the current image frame with each feature extraction layer of the feature extraction network model to correspondingly obtain features to be fused, adding and fusing the features to be fused, and then processing with a nonlinear activation function to obtain an actual dense high-dimensional feature map of the current image frame;
S4: determining a region of interest in the reference scene image, and calculating matching feature points in the actual dense high-dimensional feature map that correspond one-to-one with the key feature points in the region of interest, wherein each key feature point's actual dot product with its correspondingly matched feature point is larger than its actual dot product with every other feature point of the actual dense high-dimensional feature map, and its actual distance to the correspondingly matched feature point is smaller than its actual distance to every other feature point of the actual dense high-dimensional feature map; and
S5: calculating the current rotation-translation matrix of the vehicle-mounted camera 1 from the mutually matched key feature points and matching feature points, based on a local odometry map construction principle and a P3P pose estimation algorithm.
In the embodiment of the application, the current image frame acquired by the vehicle-mounted camera 1 is first processed through the feature extraction network model and the NetVLAD network model to obtain the global descriptor, and the reference scene image matched with the global descriptor is then determined by screening the scene database; since the scene database is pre-constructed scene map data of the motor vehicle's actual driving environment, the global descriptor is matched to the actual driving environment of the motor vehicle as closely as possible. Further, each feature extraction layer of the feature extraction network model up-samples the current image frame to correspondingly obtain the features to be fused, which are added, fused and then processed with a nonlinear activation function to obtain the actual dense high-dimensional feature map; this map not only fuses features of different scales, but the nonlinear activation also improves the nonlinear description of key points by the high-dimensional features, improving the generalization capability of the actual dense high-dimensional feature map. Further, the matching feature points in the actual dense high-dimensional feature map that correspond one-to-one with the key feature points in the region of interest are calculated, with screening and judgment by dot product and actual distance, so that the cross-correlation between each key feature point and the feature points of the actual dense high-dimensional feature map is determined and the matching feature point is screened out on the maximum-correlation principle. Finally, the current rotation-translation matrix of the vehicle-mounted camera 1 is calculated from the mutually matched key feature points and matching feature points based on the local odometry map construction principle and the P3P pose estimation algorithm, realizing pose calculation of the vehicle-mounted camera 1 with high calculation precision.
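By way of a non-limiting sketch, the multi-scale fusion of step S3 might look as follows in PyTorch; the three-layer backbone, the channel width of 128 and the choice of ReLU as the nonlinear activation function are assumptions for illustration, not details fixed by the application.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseFeatureFusion(nn.Module):
    """Toy feature extraction network of three sequentially connected layers:
    each layer's output is up-sampled back to the input resolution, the
    features to be fused are added together, and a nonlinear activation
    yields the actual dense high-dimensional feature map."""

    def __init__(self, dim: int = 128):
        super().__init__()
        self.layer1 = nn.Conv2d(3, dim, 3, stride=2, padding=1)
        self.layer2 = nn.Conv2d(dim, dim, 3, stride=2, padding=1)
        self.layer3 = nn.Conv2d(dim, dim, 3, stride=2, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        feats = []
        for layer in (self.layer1, self.layer2, self.layer3):
            x = layer(x)
            feats.append(F.interpolate(x, size=(h, w), mode="bilinear",
                                       align_corners=False))
        return F.relu(torch.stack(feats).sum(dim=0))  # add-fuse, then activate

# One 128-dimensional descriptor per pixel of a 480x640 frame; this tensor
# plays the role of dhc_q in formula 1 below.
dhc_q = DenseFeatureFusion()(torch.randn(1, 3, 480, 640))
```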
In implementation, let the actual dense high-dimensional feature map be dhc_q and the high-dimensional features of the region of interest in the reference scene image be shc_r; the dot product of the two can then be expressed as:

HC_cross = shc_r · dhc_q (formula 1)

where HC_cross represents the dot product of the high-dimensional features of the region of interest of the reference scene image and the actual dense high-dimensional feature map.
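A minimal NumPy sketch of this maximum-dot-product search follows; the array shapes are assumptions, and a complete implementation would also enforce the distance criterion of step S4.

```python
import numpy as np

def match_feature_points(shc_r: np.ndarray, dhc_q: np.ndarray) -> np.ndarray:
    """shc_r: (N, C) descriptors of the N key feature points in the region of
    interest; dhc_q: (C, H, W) actual dense high-dimensional feature map.
    For each key feature point, return the (row, col) location whose dot
    product HC_cross (formula 1) is maximal."""
    C, H, W = dhc_q.shape
    hc_cross = shc_r @ dhc_q.reshape(C, H * W)  # (N, H*W) dot products
    best = hc_cross.argmax(axis=1)              # maximum-correlation principle
    return np.stack(np.unravel_index(best, (H, W)), axis=1)

# Toy usage with random data.
rng = np.random.default_rng(0)
pts = match_feature_points(rng.standard_normal((5, 128)),
                           rng.standard_normal((128, 48, 64)))
```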
In an alternative embodiment of the present application, as shown in fig. 2, the step S2 specifically includes:
S21: calculating the actual Hamming distance between the global descriptor and the VLAD vector of each key frame in the scene database, and screening, based on a decision tree algorithm model, key frames meeting a preset preliminary screening condition from the scene database, wherein the preset preliminary screening condition is that the key frame's rank, when the key frames are sorted by actual Hamming distance from smallest (most similar) to largest, is smaller than or equal to a preset rank; and
S22: calculating the actual similarity between the current image frame and each image frame adjacent to the key frames meeting the preset preliminary screening condition, and determining the key frame with the maximum actual similarity among those meeting the condition as the reference scene image.
In this embodiment, the actual Hamming distance between the global descriptor and the VLAD vector of each key frame in the scene database is calculated first; the smaller the actual Hamming distance, the higher the similarity between the global descriptor and the key frame, so the key frames are ranked by their actual Hamming distances and the most similar key frames are screened out. Then, based on the principle that the adjacent frames (i.e., the previous and next frames) of the true reference scene image should be more similar to the current image frame than those of a false reference scene image, the actual similarity between the current image frame and the image frames adjacent to each key frame meeting the preset preliminary screening condition is evaluated, so that the true reference scene image is accurately determined in the further screening.
In a specific implementation, the preset rank can be set to 10, for example: the key frames are ranked by actual Hamming distance, and the key frames ranked 1 to 10 (the ten most similar) are screened out.
In an alternative embodiment of the present application, the global descriptor is first subjected to dimension reduction based on a principal component analysis algorithm (PCA, Principal Component Analysis), and the actual Hamming distance is then calculated between the dimension-reduced global descriptor and the VLAD vectors of the key frames in the scene database. In this embodiment, performing dimension reduction on the global descriptor based on the principal component analysis algorithm effectively reduces the amount of calculation on the global descriptor and improves calculation efficiency.
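A short scikit-learn sketch of this dimension-reduction step follows; the descriptor sizes, the 256-dimensional target and the sign-based binarization are assumptions for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in data: 1000 keyframe descriptors of dimension 4096 (NetVLAD-sized).
rng = np.random.default_rng(0)
database_desc = rng.standard_normal((1000, 4096)).astype(np.float32)
query_desc = rng.standard_normal((1, 4096)).astype(np.float32)

pca = PCA(n_components=256)  # principal component analysis dimension reduction
keyframe_bin = (pca.fit_transform(database_desc) > 0).astype(np.uint8)
v_pca = (pca.transform(query_desc)[0] > 0).astype(np.uint8)  # k-dim binary vector
```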
In a specific implementation, the actual Hamming distance is calculated as:

D_hamming = Σ_{j=1}^{k} ( V_pca(j) ⊕ M_i(j) ) (formula 2)

where D_hamming represents the actual Hamming distance, V_pca the k-dimensional binary vector of the dimension-reduced global descriptor, M_i the k-dimensional binary global descriptor corresponding to the i-th key frame of the scene database, and ⊕ the bitwise exclusive-or.
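The following NumPy sketch implements formula 2 with a bitwise exclusive-or and applies the preliminary screening by rank; the demo vectors are random, and the preset rank of 10 matches the example given above.

```python
import numpy as np

def hamming_distance(v_pca: np.ndarray, m_i: np.ndarray) -> int:
    """Formula 2: the number of differing bits (bitwise XOR) between two
    k-dimensional binary vectors."""
    return int(np.count_nonzero(v_pca ^ m_i))

def prescreen(v_pca: np.ndarray, keyframe_bin: np.ndarray, top_k: int = 10):
    """Rank all keyframes by actual Hamming distance (most similar first)
    and keep those whose rank is within the preset rank top_k."""
    d = np.count_nonzero(keyframe_bin ^ v_pca, axis=1)
    return np.argsort(d)[:top_k]

# Toy usage: 100 random 256-bit keyframe descriptors and one query.
rng = np.random.default_rng(1)
demo_db = rng.integers(0, 2, size=(100, 256), dtype=np.uint8)
demo_q = rng.integers(0, 2, size=256, dtype=np.uint8)
print(prescreen(demo_q, demo_db))
```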
In an alternative embodiment of the present application, as shown in fig. 3, the step S5 specifically includes:
S51: constructing an actual three-dimensional space of the driving scene of the motor vehicle based on the local odometry map construction principle;
S52: projecting the mutually matched key feature points and matching feature points into the actual three-dimensional space to correspondingly generate key three-dimensional points and matching three-dimensional points, respectively; and
S53: calculating the current rotation-translation matrix of the vehicle-mounted camera 1 from three non-collinear, mutually matched pairs of key three-dimensional points and matching three-dimensional points, based on the P3P pose estimation algorithm.
In this embodiment, an actual three-dimensional space of the driving scene of the motor vehicle is first constructed based on the local odometry map construction principle, the mutually matched key feature points and matching feature points are then projected into this space in sequence, and the current rotation-translation matrix of the vehicle-mounted camera 1 is finally calculated rapidly from the projected key three-dimensional points and matching three-dimensional points with the P3P pose estimation algorithm, realizing pose calculation with a relatively simple procedure.
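An illustrative OpenCV sketch of this P3P step follows. The three matched pairs and the intrinsic matrix K are fabricated for demonstration: the image points are exact projections of the key three-dimensional points under an identity pose, so one recovered candidate should be R = I, t = 0.

```python
import cv2
import numpy as np

# Three non-collinear key 3D points from the local odometry map (assumed).
object_pts = np.array([[0.0, 0.0, 5.0],
                       [1.0, 0.0, 6.0],
                       [0.0, 1.5, 7.0]])
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
proj = (K @ object_pts.T).T
image_pts = proj[:, :2] / proj[:, 2:]           # matching feature point pixels

# P3P yields up to four rotation/translation candidates; a fourth matched
# pair (or the projection-error check described below) disambiguates them.
n, rvecs, tvecs = cv2.solveP3P(object_pts, image_pts, K, np.zeros(4),
                               flags=cv2.SOLVEPNP_P3P)
for rvec, tvec in zip(rvecs, tvecs):
    R, _ = cv2.Rodrigues(rvec)                  # candidate rotation matrix
    print(R.round(3), tvec.ravel().round(3))
```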
In an alternative embodiment of the application, the method further comprises:
calculating, based on a random sampling algorithm model, the actual projection error of the rotation-translation matrix over the remaining mutually matched key three-dimensional points and matching three-dimensional points; and
calculating the minimum value of the actual projection error based on an error optimization algorithm model, and recalculating and updating the rotation-translation matrix from the mutually matched key three-dimensional points and matching three-dimensional points that satisfy the minimum value.
In this embodiment, a random sampling algorithm model is used to calculate the actual projection error of the rotation-translation matrix over the remaining mutually matched key three-dimensional points and matching three-dimensional points; the minimum value of the actual projection error is then calculated with an error optimization algorithm model, and the rotation-translation matrix is recalculated and updated from the mutually matched key three-dimensional points and matching three-dimensional points that satisfy the minimum value, thereby optimizing the rotation-translation matrix and improving calculation precision.
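As a sketch under stated assumptions, OpenCV's solvePnPRansac and solvePnPRefineLM can stand in for the random sampling algorithm model and the error optimization algorithm model described here; the reprojection threshold is an assumed value, and the application's own models may differ.

```python
import cv2
import numpy as np

def estimate_pose(object_pts: np.ndarray, image_pts: np.ndarray, K: np.ndarray):
    """object_pts: (N, 3) key 3D points; image_pts: (N, 2) matching points,
    N >= 4, both float64; K: 3x3 intrinsic matrix."""
    # Random sampling: minimal P3P sets are drawn repeatedly and each
    # candidate rotation-translation matrix is scored by the actual
    # projection error of the remaining matched pairs.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        object_pts, image_pts, K, None,
        reprojectionError=2.0, flags=cv2.SOLVEPNP_P3P)
    if not ok:
        return None
    # Error optimization: Levenberg-Marquardt refinement recomputes the pose
    # from the inlier pairs by minimizing the projection error.
    idx = inliers.ravel()
    rvec, tvec = cv2.solvePnPRefineLM(object_pts[idx], image_pts[idx],
                                      K, None, rvec, tvec)
    return cv2.Rodrigues(rvec)[0], tvec
```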
In an alternative embodiment of the application, the region of interest is determined by judging whether the actual number of corner points in each region of the reference scene image is greater than a predetermined number. The more corner points a region contains, the more features it offers for image matching; determining the region of interest from the corner count of each region of the reference scene image therefore rests on a simple criterion.
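An illustrative sketch of such a corner-count test, using Shi-Tomasi corners in OpenCV, is given below; the grid layout, the corner-detector parameters and the predetermined number are assumptions.

```python
import cv2
import numpy as np

def find_regions_of_interest(ref_gray: np.ndarray,
                             grid=(4, 4), predetermined_number=30):
    """Split the reference scene image into grid regions and keep those whose
    actual corner count exceeds the predetermined number."""
    h, w = ref_gray.shape
    rois = []
    for gy in range(grid[0]):
        for gx in range(grid[1]):
            y0, y1 = gy * h // grid[0], (gy + 1) * h // grid[0]
            x0, x1 = gx * w // grid[1], (gx + 1) * w // grid[1]
            corners = cv2.goodFeaturesToTrack(ref_gray[y0:y1, x0:x1],
                                              maxCorners=500,
                                              qualityLevel=0.01,
                                              minDistance=5)
            if corners is not None and len(corners) > predetermined_number:
                rois.append((x0, y0, x1, y1))  # region retained as ROI
    return rois
```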
On the other hand, as shown in fig. 4, an embodiment of the present application provides a vehicle-mounted camera pose calculation device 3 connected to a vehicle-mounted camera 1 of a motor vehicle, including a processor 30, a memory 32, and a computer program stored in the memory and configured to be executed by the processor 30, wherein the processor 30 implements the vehicle-mounted camera pose calculation method according to the above embodiment when executing the computer program.
Illustratively, the computer program may be partitioned into one or more modules/units that are stored in the memory 32 and executed by the processor 30 to accomplish the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions for describing the execution of the computer program in the in-vehicle camera pose calculation device 3. For example, the computer program may be divided into functional modules in the in-vehicle camera pose calculation apparatus 3 as illustrated in fig. 5, wherein the image acquisition and descriptor extraction module 41, the scene graph screening module 42, the dense feature calculation module 43, the feature point matching module 44, and the pose calculation module 45 respectively perform the above steps S1 to S5 correspondingly.
The vehicle-mounted camera pose calculation device 3 can be a desktop computer, a notebook computer, a palmtop computer, a cloud server or other computing equipment. The vehicle-mounted camera pose calculation device 3 may include, but is not limited to, a processor 30 and a memory 32. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of the vehicle-mounted camera pose calculation device 3 and does not constitute a limitation of it; the device may include more or fewer components than illustrated, combine certain components, or use different components; for example, the vehicle-mounted camera pose calculation device 3 may further include input and output devices, network access devices, buses, etc.
The processor 30 may be a central processing unit (Central Processing Unit, CPU), another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. The processor 30 is the control center of the vehicle-mounted camera pose calculation device 3 and connects the various parts of the entire device using various interfaces and lines.
The memory 32 may be used to store the computer program and/or modules, and the processor 30 implements the various functions of the vehicle-mounted camera pose calculation device 3 by running or executing the computer program and/or modules stored in the memory 32 and invoking the data stored in the memory 32. The memory 32 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and the application programs required for at least one function (such as a pattern recognition function, a pattern layering function, etc.), and the data storage area may store data created according to the use of the vehicle-mounted camera pose calculation device 3 (such as graphic data). In addition, the memory 32 may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The functionality of the embodiments of the present application, if implemented in the form of software functional modules or units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the method of the foregoing embodiments may also be implemented by a computer program instructing the related hardware; the computer program may be stored in a computer-readable storage medium, and when executed by the processor 30 it implements the steps of each of the foregoing method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer-readable medium may be adjusted according to the requirements of legislation and patent practice in each jurisdiction; for example, in certain jurisdictions, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
In another aspect, an embodiment of the present application provides a computer readable storage medium, including a stored computer program, where when the computer program runs, a device where the computer readable storage medium is controlled to execute the method for calculating the pose of the vehicle-mounted camera according to the above embodiment.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present application and the scope of the claims, which are all within the scope of the present application.

Claims (8)

1. A vehicle-mounted camera pose calculation method, characterized by comprising the following steps:
extracting a current image frame from an original image of the motor vehicle's surroundings, acquired and transmitted in real time by a vehicle-mounted camera, and sequentially processing the current image frame through a pre-stored feature extraction network model and a NetVLAD network model to obtain a global descriptor of the current image frame, wherein the feature extraction network model is formed by sequentially connecting a plurality of feature extraction layers;
screening a reference scene image matched with the global descriptor from a pre-stored scene database, wherein the scene database is pre-constructed scene map data of the motor vehicle's actual driving environment;
up-sampling the current image frame with each feature extraction layer of the feature extraction network model to correspondingly obtain features to be fused, adding and fusing the features to be fused, and then processing with a nonlinear activation function to obtain an actual dense high-dimensional feature map of the current image frame;
determining a region of interest in the reference scene image, and calculating matching feature points in the actual dense high-dimensional feature map that correspond one-to-one with the key feature points in the region of interest, wherein each key feature point's actual dot product with its correspondingly matched feature point is larger than its actual dot product with every other feature point of the actual dense high-dimensional feature map, and its actual distance to the correspondingly matched feature point is smaller than its actual distance to every other feature point of the actual dense high-dimensional feature map; and
calculating the current rotation-translation matrix of the vehicle-mounted camera from the mutually matched key feature points and matching feature points, based on a local odometry map construction principle and a P3P pose estimation algorithm.
2. The method for calculating the pose of the vehicle-mounted camera according to claim 1, wherein the step of screening the reference scene image matched with the global descriptor from the pre-stored scene database specifically comprises:
calculating the actual Hamming distance between the global descriptor and the VLAD vector of each key frame in the scene database, and screening, based on a decision tree algorithm model, key frames meeting a preset preliminary screening condition from the scene database, wherein the preset preliminary screening condition is that the key frame's rank, when the key frames are sorted by actual Hamming distance from smallest (most similar) to largest, is smaller than or equal to a preset rank; and
calculating the actual similarity between the current image frame and each image frame adjacent to the key frames meeting the preset preliminary screening condition, and determining the key frame with the maximum actual similarity among those meeting the condition as the reference scene image.
3. The method for calculating the pose of the vehicle-mounted camera according to claim 2, wherein the global descriptor is first subjected to dimension reduction based on a principal component analysis algorithm, and the actual Hamming distance is then calculated between the dimension-reduced global descriptor and the VLAD vector of each key frame in the scene database.
4. The method for calculating the pose of the vehicle-mounted camera according to claim 1, wherein calculating the current rotation-translation matrix of the vehicle-mounted camera from the mutually matched key feature points and matching feature points, based on the local odometry map construction principle and the P3P pose estimation algorithm, specifically comprises:
constructing an actual three-dimensional space of the driving scene of the motor vehicle based on the local odometry map construction principle;
projecting the mutually matched key feature points and matching feature points into the actual three-dimensional space to correspondingly generate key three-dimensional points and matching three-dimensional points, respectively; and
calculating the current rotation-translation matrix of the vehicle-mounted camera from three non-collinear, mutually matched pairs of key three-dimensional points and matching three-dimensional points, based on the P3P pose estimation algorithm.
5. The vehicle-mounted camera pose calculation method according to claim 4, wherein the method further comprises:
calculating, based on a random sampling algorithm model, the actual projection error of the rotation-translation matrix over the remaining mutually matched key three-dimensional points and matching three-dimensional points; and
calculating the minimum value of the actual projection error based on an error optimization algorithm model, and recalculating and updating the rotation-translation matrix from the mutually matched key three-dimensional points and matching three-dimensional points that satisfy the minimum value.
6. The vehicle-mounted camera pose calculation method according to claim 1, wherein the region of interest is determined by judging whether an actual number of corner points of each region in the reference scene image is greater than a predetermined number.
7. A vehicle-mounted camera pose calculation device connected to a vehicle-mounted camera of a motor vehicle, characterized in that it comprises a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the vehicle-mounted camera pose calculation method according to any of claims 1 to 6 when executing the computer program.
8. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored computer program, wherein the computer program when run controls a device in which the computer readable storage medium is located to perform the vehicle-mounted camera pose calculation method according to any one of claims 1 to 6.
CN202310904859.4A (priority and filing date 2023-07-21): Vehicle-mounted camera pose calculation method and device and computer readable storage medium. Published as CN116883966A (en); status: Pending.

Priority Applications (1)

Application Number: CN202310904859.4A
Priority Date: 2023-07-21
Filing Date: 2023-07-21
Title: Vehicle-mounted camera pose calculation method and device and computer readable storage medium

Applications Claiming Priority (1)

Application Number: CN202310904859.4A
Priority Date: 2023-07-21
Filing Date: 2023-07-21
Title: Vehicle-mounted camera pose calculation method and device and computer readable storage medium

Publications (1)

Publication Number: CN116883966A
Publication Date: 2023-10-13

Family

Family ID: 88256570

Family Applications (1)

Application Number: CN202310904859.4A
Title: Vehicle-mounted camera pose calculation method and device and computer readable storage medium
Priority Date: 2023-07-21
Filing Date: 2023-07-21

Country Status (1)

Country: CN
Link: CN116883966A (en)


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination