CN115984346A - Hardware acceleration circuit, chip and robot based on correlation scanning matching algorithm - Google Patents

Publication number
CN115984346A
CN115984346A CN202111202200.1A CN202111202200A CN115984346A CN 115984346 A CN115984346 A CN 115984346A CN 202111202200 A CN202111202200 A CN 202111202200A CN 115984346 A CN115984346 A CN 115984346A
Authority
CN
China
Prior art keywords
point cloud
index
discrete
adder
multiplier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111202200.1A
Other languages
Chinese (zh)
Inventor
王珂
肖刚军
包敏杰
马宝腾
周和文
许登科
孙明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Zhuhai Amicro Semiconductor Co Ltd
Original Assignee
Harbin Institute of Technology
Zhuhai Amicro Semiconductor Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology and Zhuhai Amicro Semiconductor Co Ltd
Priority to CN202111202200.1A
Publication of CN115984346A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Image Processing (AREA)

Abstract

The invention discloses a hardware acceleration circuit, a chip and a robot based on a correlation scanning matching algorithm. The hardware acceleration circuit comprises a memory module, a point cloud processing module, a grid index module, a state machine module and an interconnection bus. Under the control of the state machine module, the point cloud processing module reads the currently stored point cloud from the memory module and controls the read point cloud to execute a rotation transformation; it then reads the result of the current rotation transformation, discretizes that result to obtain a discrete point cloud, and stores the discrete point cloud into the memory module through the interconnection bus. Also under the control of the state machine module, the grid index module acquires the discrete point cloud from the memory module, calculates the index value at which the discrete point cloud maps into the map storage space according to a preset coordinate offset value and a search step length, and sets the index value as the read address of the occupation probability of the corresponding grid point.

Description

Hardware accelerating circuit, chip and robot based on correlation scanning matching algorithm
Technical Field
The invention relates to the technical field of computer accelerators, in particular to a hardware acceleration circuit, a chip and a robot based on a correlation scanning matching algorithm.
Background
At present, SLAM algorithms for laser-radar-based mobile robots generally use a correlation scanning matching algorithm to transform the point cloud data acquired by the laser radar and match it to a grid map. The correlation scanning matching algorithm is a probability-based inter-frame point cloud matching algorithm. A window of a set size is used as a search template in the currently processed grid map; the search starts from a predicted pose and traverses all possible grid poses. The point cloud is transformed by a transformation matrix from the scanning frame coordinate system (laser coordinate system) to the coordinate system of the currently processed grid map, and the occupation probabilities of the candidate grid positions are calculated to obtain score values for those positions. Accurate positioning of the robot is then completed based on the score values.
The Correlation Scanning Matching (CSM) algorithm is currently one of the most widely applied scan matching algorithms. However, real-time scan matching involves a large amount of loop computation, and the resulting computational complexity is very high. This makes the algorithm difficult to deploy on embedded computing platforms with limited computing power, where it becomes a bottleneck for real-time robot positioning.
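The nested search described in the background can be illustrated with a minimal Python sketch. It is a generic brute-force illustration written for this explanation, not the circuit disclosed here: the function names `score_pose` and `best_pose`, the window parameters `n_lin`, `n_ang` and `ang_step`, the row-major list-of-lists grid, and the use of `math.floor` for the cell lookup are all assumptions.

```python
import math

def score_pose(points, pose, grid, resolution):
    """Sum the occupation probabilities hit by the scan at one candidate pose."""
    px, py, theta = pose
    c, s = math.cos(theta), math.sin(theta)
    total = 0.0
    for x, y in points:
        # Laser frame -> map frame: rotate, then translate.
        wx = x * c - y * s + px
        wy = x * s + y * c + py
        col = math.floor(wx / resolution)
        row = math.floor(wy / resolution)
        if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
            total += grid[row][col]
    return total

def best_pose(points, predicted, grid, resolution, n_lin, n_ang, ang_step):
    """Traverse every candidate pose in a window around the predicted pose."""
    px, py, pt = predicted
    best_score, best = -1.0, predicted
    for k in range(-n_ang, n_ang + 1):
        for i in range(-n_lin, n_lin + 1):
            for j in range(-n_lin, n_lin + 1):
                pose = (px + i * resolution, py + j * resolution, pt + k * ang_step)
                score = score_pose(points, pose, grid, resolution)
                if score > best_score:
                    best_score, best = score, pose
    return best_score, best

# Hypothetical 5x5 map with a single occupied cell at row 2, column 3.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][3] = 0.9
score, pose = best_pose([(1.0, 0.0)], (1.5, 2.5, 0.0), grid,
                        resolution=1.0, n_lin=1, n_ang=0, ang_step=0.1)
```

The three nested loops over the angle and both translation axes, each of which re-transforms every point, are the loop computation that makes a pure software version costly.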
Disclosure of Invention
When the correlation scanning matching algorithm is executed at the software level, the point cloud set generated by a scan of the robot's laser radar must undergo a transformation process with a large amount of computation. To address this problem, the invention uses hardware acceleration to speed up the calculation that yields the probability value of the corresponding grid position. The specific technical scheme is as follows:
A hardware acceleration circuit based on a correlation scanning matching algorithm is disclosed, wherein the hardware acceleration circuit is electrically connected with a laser radar; the hardware acceleration circuit comprises a memory module, a point cloud processing module, a grid index module, a state machine module and an interconnection bus; the memory module, the point cloud processing module, the grid index module and the state machine module establish a data transmission relation through the interconnection bus; the point cloud processing module is used for reading the currently stored point cloud from the memory module under the control of the state machine module and then controlling the read point cloud to execute rotation transformation; the point cloud processing module is also used for reading the result of the current rotation transformation under the control of the state machine module, controlling the result of the current rotation transformation to execute discretization, setting the discretization result as a discrete point cloud, and storing the discrete point cloud into the memory module through the interconnection bus; the grid index module is used for acquiring the discrete point cloud from the memory module under the control of the state machine module, calculating an index value of the discrete point cloud mapped to a map storage space according to a preset coordinate offset value and a search step length, and setting the index value as a reading address of the occupation probability of the corresponding grid point; the memory module is provided with the map storage space and is used for storing the grid map to be searched transmitted by the bus interface; the memory module is also used for storing one revolution of point cloud collected by the laser radar and preset trigonometric function values; the currently stored point cloud is one revolution of point cloud collected by the laser radar; the occupation probability of the corresponding grid point is the probability value of the grid point matched in the grid map to be searched after the point cloud is processed by the point cloud processing module; and in the grid map to be searched, the occupation probability of each grid point has a matching index value.
In this technical scheme, a hardware acceleration circuit controlled by a state machine implements the correlation scanning matching algorithm. Specifically, calculating the index value of the occupation probability of a grid point in the grid map to be searched is divided into two steps: first the read point cloud is rotationally transformed, and then the rotation result is discretized. This realizes scan matching between the current frame of point cloud and the grid map to be searched, so that the discrete point cloud coordinates can be converted into the index value, within the map storage space, of the probability value of the matched grid point. The CPU (central processing unit) can then conveniently read the occupation probability of the corresponding grid point from the map storage space according to the index value. The occupation probability serves as a necessary parameter of the SLAM algorithm, which improves the efficiency and precision of robot positioning.
In addition, when the point cloud processing module is designed as a pipeline structure for coordinate system transformation and the grid index module is designed as a pipeline structure for indexing the occupation probability, the state machine can schedule the point cloud processing module's reading of the latest rotationally transformed point cloud and its discretization of the original point cloud to execute in parallel in a given clock cycle. Alternatively, it can schedule the horizontal coordinate and the vertical coordinate of the point cloud to be transformed simultaneously, yielding the discretization result of the horizontal coordinate and that of the vertical coordinate together. This further ensures that the indexed occupation probability is shared on the interconnection bus in real time so that it can be read by the CPU.
As a technical solution, the point cloud processing module includes a point cloud rotation sub-module; the point cloud rotation submodule comprises a first register, a second register, a first pipeline structure and a second pipeline structure; the first register and the second register are connected with a first pipeline structure, and the first pipeline structure is used for reading the abscissa of the currently stored point cloud from the memory module under the control of the state machine module and then controlling the read abscissa of the point cloud to execute rotation transformation; the first register and the second register are connected with a second pipeline structure, and the second pipeline structure is used for reading the vertical coordinate of the currently stored point cloud from the memory module under the control of the state machine module and then controlling the read vertical coordinate of the point cloud to execute rotation transformation; wherein the interconnection bus transmits the abscissa of the point cloud to a first register; the abscissa of the point cloud transmitted by the interconnection bus to the first register is derived from the abscissa of the currently stored point cloud; wherein the interconnection bus caches a vertical coordinate of the point cloud to a second register; the vertical coordinate of the point cloud transmitted to the second register by the interconnection bus is derived from the vertical coordinate of the currently stored point cloud; wherein the first and second pipeline structures are parallel pipeline structures.
Further, the first pipeline structure comprises a first point cloud multiplier, a second point cloud multiplier and a point cloud subtracter; the first point cloud multiplier and the second point cloud multiplier belong to multipliers, and the point cloud subtracter belongs to a subtracter; the first input end of the first point cloud multiplier is connected with the output end of the first register; the second input end of the first point cloud multiplier is used for receiving the cosine function value corresponding to the rotation angle reached by the current rotation transformation, wherein the rotation angle reached by the current rotation transformation is the sum of the angle search step length and the rotation angle reached by the last rotation transformation; the first input end of the second point cloud multiplier is connected with the output end of the second register; the second input end of the second point cloud multiplier is used for receiving the sine function value corresponding to the rotation angle reached by the current rotation transformation; the first input end of the point cloud subtracter is connected with the output end of the first point cloud multiplier, and the second input end of the point cloud subtracter is connected with the output end of the second point cloud multiplier; the point cloud subtracter is used for transmitting its output difference to the point cloud discretization sub-module, and the difference output by the point cloud subtracter is set as the abscissa result of the current rotation transformation output by the first pipeline structure.
Further, the second pipeline structure comprises a third point cloud multiplier, a fourth point cloud multiplier and a point cloud adder; the third point cloud multiplier and the fourth point cloud multiplier belong to multipliers, and the point cloud adder belongs to an adder; a first input end of the third point cloud multiplier is connected with an output end of the first register; a second input end of the third point cloud multiplier, configured to receive a sine function value corresponding to a rotation angle reached by the current rotation transformation; the rotation angle reached by the current rotation transformation is the sum of the angle search step length and the rotation angle reached by the last rotation transformation; the first input end of the fourth point cloud multiplier is connected with the output end of the second register; a second input end of the fourth point cloud multiplier, configured to receive a cosine function value corresponding to the rotation angle reached by the current rotation transformation; the first input end of the point cloud adder is connected with the output end of the third point cloud multiplier, and the second input end of the point cloud adder is connected with the output end of the fourth point cloud multiplier; and the point cloud adder is used for transmitting the output sum value to the point cloud discretization submodule and setting the sum value output by the point cloud adder as a vertical coordinate result converted from the current rotation output by the second pipeline structure.
The two technical schemes above take the rotation matrix as the basic operation framework and use only adders, subtracters and multipliers to construct a parallel pipeline structure that rotationally transforms the abscissa and the ordinate of the point cloud simultaneously. The parallel pipeline structure receives and processes the trigonometric function values, calculated in advance by software, that correspond to each stepped rotation angle.
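The arithmetic performed by the first and second pipeline structures can be sketched in Python as follows. This is a software model for illustration only, not the hardware itself: the names `precompute_trig` and `rotate_point` are hypothetical, and the assumption that the k-th stepped angle equals k times the angle search step (that is, the rotation starts from zero) is not stated in the text.

```python
import math

def precompute_trig(angle_step, num_steps):
    """Software-side table of (cos, sin) per stepped rotation angle.

    Assumption: the k-th angle is k * angle_step, starting from zero.
    """
    return [(math.cos(k * angle_step), math.sin(k * angle_step))
            for k in range(1, num_steps + 1)]

def rotate_point(x, y, cos_t, sin_t):
    # First pipeline structure: two multipliers feeding the point cloud subtracter.
    x_rot = x * cos_t - y * sin_t
    # Second pipeline structure: two multipliers feeding the point cloud adder.
    y_rot = x * sin_t + y * cos_t
    return x_rot, y_rot

# Rotate the point (1, 0) by a single step of 90 degrees.
table = precompute_trig(math.pi / 2, 1)
cos_t, sin_t = table[0]
x_rot, y_rot = rotate_point(1.0, 0.0, cos_t, sin_t)
```

Because the sine and cosine values arrive as precomputed inputs, the model needs no trigonometric evaluation per point, which mirrors the text's use of only adders, subtracters and multipliers.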
As a technical solution, the point cloud processing module further comprises a point cloud discrete sub-module; the point cloud discretization sub-module comprises a third pipeline structure; and the third pipeline structure is used for reading the abscissa result output by the first pipeline structure and converted from the current rotation under the control of the state machine module, controlling the abscissa result converted from the current rotation to execute discretization, setting the discretization result as the abscissa value of the discrete point cloud, and storing the abscissa value of the discrete point cloud into the memory module through the interconnection bus.
Further, the third pipeline structure comprises a first discrete adder, a first discrete subtracter and a first discrete multiplier; the first discrete adder belongs to the adder, the first discrete subtracter belongs to the subtracter, and the first discrete multiplier belongs to the multiplier; the first input end of the first discrete adder is connected with the output end of the point cloud subtracter in the first pipeline structure and is used for receiving the abscissa result of the current rotation transformation; the second input end of the first discrete adder is used for receiving the abscissa of the robot position transmitted by the interconnection bus, wherein the abscissa of the robot position is a pre-calculated abscissa value of the robot in the world coordinate system; the first discrete adder is used for adding the abscissa result of the current rotation transformation to the abscissa of the robot position and outputting the sum, so that the sum becomes the horizontal axis coordinate value of the point cloud converted to the world coordinate system; the first input end of the first discrete subtracter is connected with the output end of the first discrete adder; the second input end of the first discrete subtracter is used for receiving the maximum abscissa value of the map transmitted by the interconnection bus; the first input end of the first discrete multiplier is connected with the output end of the first discrete subtracter; the second input end of the first discrete multiplier is used for receiving the reciprocal of the map resolution transmitted by the interconnection bus; the first discrete multiplier is used for outputting the product value obtained by the multiplication to the memory module through the interconnection bus; the product value output by the first discrete multiplier is configured as the abscissa value of the discrete point cloud, at which point the discretization of the abscissa is determined to be completed and the abscissa value of the currently stored point cloud has been aligned to the coordinate system of the grid map to be searched; meanwhile, the product value output by the first discrete multiplier is set as the discretization result output by the third pipeline structure.
Further, the point cloud discretization sub-module further comprises a fourth pipeline structure; and the fourth pipeline structure is used for reading the vertical coordinate result output by the second pipeline structure and converted by the current rotation under the control of the state machine module, controlling the vertical coordinate result converted by the current rotation to execute discretization, setting the discretization result as the vertical coordinate value of the discrete point cloud, and storing the vertical coordinate value of the discrete point cloud into the memory module through the interconnection bus.
Further, the fourth pipeline structure comprises a second discrete adder, a second discrete subtracter and a second discrete multiplier; the second discrete adder belongs to the adder, the second discrete subtracter belongs to the subtracter, and the second discrete multiplier belongs to the multiplier; the first input end of the second discrete adder is connected with the output end of the point cloud adder of the second pipeline structure and is used for receiving the ordinate result of the current rotation transformation; the second input end of the second discrete adder is used for receiving the vertical coordinate of the robot position transmitted by the interconnection bus, wherein the vertical coordinate of the robot position is a pre-calculated vertical coordinate value of the robot in the world coordinate system; the second discrete adder is used for adding the ordinate result of the current rotation transformation to the ordinate of the robot position and outputting the sum, so that the sum becomes the ordinate value of the point cloud converted to the world coordinate system; the first input end of the second discrete subtracter is connected with the output end of the second discrete adder; the second input end of the second discrete subtracter is used for receiving the maximum longitudinal coordinate value of the map transmitted by the interconnection bus; the first input end of the second discrete multiplier is connected with the output end of the second discrete subtracter; the second input end of the second discrete multiplier is used for receiving the reciprocal of the map resolution transmitted by the interconnection bus; the second discrete multiplier is used for outputting the product value obtained by the multiplication to the memory module through the interconnection bus; the product value output by the second discrete multiplier is configured as the longitudinal coordinate value of the discrete point cloud, at which point the discretization of the longitudinal coordinate is determined to be completed and the longitudinal coordinate value of the currently stored point cloud has been aligned to the coordinate system of the grid map to be searched; meanwhile, the product value output by the second discrete multiplier is set as the discretization result output by the fourth pipeline structure.
According to the technical scheme of the third and fourth pipeline structures, the horizontal coordinate and the vertical coordinate of each point in the point cloud are read from the storage space in which the point cloud is stored and input to the point cloud discretization sub-module. An adder and a subtracter offset the horizontal coordinate and the vertical coordinate in the same way along the corresponding coordinate axes, converting them from the laser coordinate system to the coordinate system of the grid map to be searched, wherein the coordinate axes of the point cloud are exchanged to realize the coordinate system conversion. The results are then multiplied by the reciprocal of the map resolution, so that the discretization of the horizontal and vertical coordinates of the point cloud is completed by a hardware pipeline.
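The adder, subtracter and multiplier chain of the third and fourth pipeline structures can be modeled per axis with the short Python sketch below. The function name `discretize_axis` and the sample values are hypothetical. The operand order of the subtraction (adder output minus map maximum) is an assumption inferred from the order in which the inputs are listed, and the sketch applies no rounding and does not model the axis exchange, since the text does not specify either.

```python
def discretize_axis(rotated, robot_pos, map_max, inv_resolution):
    """Model one discretization pipeline (abscissa or ordinate)."""
    world = rotated + robot_pos        # discrete adder: point in the world frame
    shifted = world - map_max          # discrete subtracter (operand order assumed)
    return shifted * inv_resolution    # discrete multiplier: scale by 1 / resolution

# Hypothetical values: rotated abscissa 1.0 m, robot at 2.0 m, map maximum 5.0 m,
# and a reciprocal map resolution of 20 cells per metre.
x_disc = discretize_axis(1.0, 2.0, 5.0, 20.0)
y_disc = discretize_axis(0.5, 1.0, 5.0, 20.0)
```

Multiplying by the reciprocal of the resolution, rather than dividing, keeps the stage within the adder, subtracter and multiplier set that the text describes.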
As a technical solution, the grid index module includes a fifth pipeline structure; the fifth pipeline structure comprises a first index subtracter and a first index adder, wherein the first index subtracter belongs to the subtracter, and the first index adder belongs to the adder; the first input end of the first index subtracter is used for receiving the abscissa of the discrete point cloud transmitted by the interconnection bus; the second input end of the first index subtracter is used for receiving the horizontal axis coordinate deviation value transmitted by the interconnection bus; the preset coordinate deviation value comprises a horizontal axis coordinate deviation value; the first input end of the first index adder is connected with the output end of the first index subtracter; the second input end of the first index adder is used for receiving the horizontal axis coordinate searching step length transmitted by the interconnection bus; wherein the search step comprises a horizontal axis coordinate search step; the sum value output by the first index adder is configured as the index value of the direction of the horizontal axis of the discrete point cloud mapped to the map storage space and is set as the result output by the fifth pipeline structure.
Further, the grid index module includes a sixth pipeline structure; the sixth pipeline structure comprises a second index subtracter and a second index adder; the second index subtracter belongs to the subtracter, and the second index adder belongs to the adder; the first input end of the second index subtracter is used for receiving the vertical coordinate of the discrete point cloud transmitted by the interconnection bus, wherein the vertical coordinate of the discrete point cloud transmitted by the interconnection bus to the second index subtracter comes from the memory module; the second input end of the second index subtracter is used for receiving the vertical axis coordinate offset value transmitted by the interconnection bus, wherein the preset coordinate offset value comprises the vertical axis coordinate offset value; the first input end of the second index adder is connected with the output end of the second index subtracter; the second input end of the second index adder is used for receiving the longitudinal axis coordinate search step length transmitted by the interconnection bus, wherein the search step comprises the longitudinal axis coordinate search step; the sum value output by the second index adder is configured as the longitudinal axis direction index value of the discrete point cloud mapped to the map storage space and is set as the result output by the sixth pipeline structure.
The fifth pipeline structure and the sixth pipeline structure are parallel in the same grid index module, and by using an addition and subtraction operation combined structure, a horizontal axis direction index value and a vertical axis direction index value mapped into a map storage space are respectively calculated, so that the horizontal coordinate and the vertical coordinate of the discrete point cloud are parallelly converted into the discrete index information of the grid points currently participating in searching the grid map to be searched.
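The subtract-then-add structure shared by the fifth and sixth pipeline structures can be written as one small Python function applied once per axis. The name `axis_index` and the integer sample values are hypothetical, and the sketch assumes the subtracter computes the discrete coordinate minus the offset.

```python
def axis_index(discrete_coord, offset, search_step):
    """Model one index pipeline: index subtracter followed by index adder."""
    shifted = discrete_coord - offset   # index subtracter: remove the preset offset
    return shifted + search_step        # index adder: apply the search step

# The two pipelines run in parallel in hardware; here they are two calls.
x_index = axis_index(12, 2, 3)   # horizontal axis direction index value
y_index = axis_index(7, 2, 0)    # vertical axis direction index value
```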
Further, the grid index module further comprises a third index adder and an index multiplier, wherein the third index adder belongs to the adder, and the index multiplier belongs to the multiplier; the first input end of the index multiplier is connected with the output end of the second index adder; the second input end of the index multiplier is used for receiving the row grid number transmitted by the interconnection bus, wherein the row grid number is the number of grids that each row of the map storage space can hold; the first input end of the third index adder is connected with the output end of the first index adder, and the second input end of the third index adder is connected with the output end of the index multiplier; the third index adder is used for adding the product of the row grid number and the sum value output by the second index adder to the index value in the horizontal axis direction, setting the resulting sum as the index value of the discrete point cloud mapped into the map storage space, and sending it to the interconnection bus. In this way, the index value of the occupation probability of the grid point to which the discrete point cloud maps is calculated in a row-by-row query mode. This index value, as the index into the map storage space, is equivalent to the storage address of the point cloud's occupation probability in the map storage space, and also to the read address used to read the occupation probability stored in the map storage space from outside, so that the occupation probability can be read from the memory module over the interconnection bus.
Alternatively, the grid index module further comprises a third index adder and an index multiplier, wherein the third index adder belongs to the adder, and the index multiplier belongs to the multiplier; the first input end of the index multiplier is connected with the output end of the first index adder; the second input end of the index multiplier is used for receiving the column grid number transmitted by the interconnection bus, wherein the column grid number is the number of grids that each column of the map storage space can hold; the first input end of the third index adder is connected with the output end of the second index adder, and the second input end of the third index adder is connected with the output end of the index multiplier; the third index adder is used for adding the product of the column grid number and the sum value output by the first index adder to the index value in the longitudinal axis direction, setting the resulting sum as the index value of the discrete point cloud mapped into the map storage space, and sending it to the interconnection bus. In this way, the index value of the occupation probability of the grid point to which the discrete point cloud maps is calculated in a column-by-column query mode. This index value, as the index into the map storage space, is equivalent to the storage address of the point cloud's occupation probability in the map storage space, and also to the read address used to read the occupation probability stored in the map storage space from outside, so that the occupation probability can be read from the memory module over the interconnection bus.
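The row-by-row and column-by-column address calculations performed by the index multiplier and the third index adder correspond to the following Python sketch. The function names and the flat list standing in for the map storage space are hypothetical, chosen only to illustrate the two layouts.

```python
def row_major_address(x_index, y_index, cells_per_row):
    # Index multiplier: y_index * row grid number; third index adder: + x_index.
    return x_index + y_index * cells_per_row

def column_major_address(x_index, y_index, cells_per_column):
    # Index multiplier: x_index * column grid number; third index adder: + y_index.
    return y_index + x_index * cells_per_column

# Hypothetical map of 3 rows by 4 columns stored as a flat, row-major list.
cells_per_row, cells_per_column = 4, 3
probabilities = [i / 100 for i in range(cells_per_row * cells_per_column)]

addr_row = row_major_address(1, 2, cells_per_row)        # 1 + 2 * 4
addr_col = column_major_address(1, 2, cells_per_column)  # 2 + 1 * 3
occupancy = probabilities[addr_row]                      # read at the computed address
```

Either layout yields a single linear read address per point; which one applies depends on how the grid map was written into the map storage space.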
As a technical scheme, the state machine module belongs to a finite state machine; the state machine module is used for scheduling the working states of the memory module, the point cloud processing module and the grid index module, so that the grid index module calculates an index value of a point cloud at a matching grid point of the grid map to be searched and determines to finish a matching operation when the point cloud processing module executes the rotation transformation and the discretization once; the index value is the index value of the probability value of the point cloud at a matching grid point of the grid map to be searched.
Further, the point cloud processing module is further configured to update the currently stored point cloud by using a result obtained by the current rotation transformation each time the matching operation is completed or the rotation transformation is performed each time, so that the point cloud processing module performs a new rotation transformation only by using the updated currently stored point cloud until the index values of the occupancy probabilities corresponding to all the currently acquired point clouds are calculated.
In the two technical schemes of the state machine module above, the state machine module uses hardware scheduling of working states and interrupt signals to control the point cloud processing module to execute one rotation and one discretization on each point cloud in sequence, so that the occupation probability matching the index value can be read from the memory module. The operation thus runs in a loop, which ensures the real-time performance of the hardware acceleration circuit when searching for the map position.
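The scheduling loop described for the state machine module can be sketched as a small software finite state machine. The state names (`LOAD`, `ROTATE`, `DISCRETIZE`, `INDEX`, `DONE`), the function `run_matching_pass`, the rounding to integers before indexing, the row-by-row address layout and all sample values are assumptions made for illustration; the text does not name the states or specify rounding.

```python
from enum import Enum, auto

class State(Enum):
    LOAD = auto()
    ROTATE = auto()
    DISCRETIZE = auto()
    INDEX = auto()
    DONE = auto()

def run_matching_pass(points, cos_t, sin_t, robot, map_max, inv_res,
                      offset, step, cells_per_row):
    """One rotation, one discretization and one index calculation per point."""
    addresses, i, state = [], 0, State.LOAD
    x = y = 0.0
    while state is not State.DONE:
        if state is State.LOAD:
            if i == len(points):
                state = State.DONE
            else:
                x, y = points[i]
                state = State.ROTATE
        elif state is State.ROTATE:
            x, y = x * cos_t - y * sin_t, x * sin_t + y * cos_t
            state = State.DISCRETIZE
        elif state is State.DISCRETIZE:
            x = (x + robot[0] - map_max[0]) * inv_res
            y = (y + robot[1] - map_max[1]) * inv_res
            state = State.INDEX
        else:  # State.INDEX
            # Rounding to an integer cell is an assumption of this sketch.
            xi = int(round(x)) - offset[0] + step[0]
            yi = int(round(y)) - offset[1] + step[1]
            addresses.append(xi + yi * cells_per_row)
            i += 1
            state = State.LOAD
    return addresses

addresses = run_matching_pass(
    points=[(1.0, 0.0), (0.0, 1.0)], cos_t=1.0, sin_t=0.0,
    robot=(2.0, 1.0), map_max=(5.0, 5.0), inv_res=2.0,
    offset=(-10, -10), step=(0, 0), cells_per_row=16)
```

The sketch runs the stages strictly in sequence; the hardware described in the text can additionally overlap stages across points because each stage is a pipeline.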
As a technical solution, a bus interface module is arranged outside the hardware acceleration circuit, and the bus interface module includes a DMA controller module and a transmission bus; the DMA controller module is used for continuously transmitting, in batches, data stored at discontinuous addresses in the physical storage space, thereby reducing the number of software interrupts triggered on the CPU; the transmission bus comprises a first bus and a second bus; the first bus exchanges signals with the memory module, the point cloud processing module, the grid index module, the state machine module, the interconnection bus and the DMA controller module respectively; the first bus is used for configuring data transmission parameters for the DMA controller module and is also used for configuring the parameter registers arranged in the hardware acceleration circuit; the second bus is connected with the DMA controller module and is used for transmitting to the memory module the grid map to be searched constructed in advance by the CPU, the preset sine function values for the corresponding rotation angles, the preset cosine function values for the corresponding rotation angles, and the point cloud currently acquired by the laser radar; wherein the transmission bus complies with the AMBA protocol. This technical scheme provides a bus interface architecture between the hardware acceleration circuit and the CPU and, according to data transmission performance, designs a first bus suited to simple, low-throughput memory-mapped communication and a second bus for high-speed data streams, which improves the real-time performance of the hardware acceleration circuit.
A chip integrates the hardware acceleration circuit. When the processor unit, the bus interface and the hardware acceleration circuit are integrated on the same chip, the chip forms a heterogeneous chip.
A robot is internally equipped with the chip and uses it for positioning in a pre-constructed grid map. Compared with the prior art, in which the robot executes the correlation scan matching algorithm purely in software, this technical solution has a clear running-speed advantage even when the clock frequency of the hardware acceleration circuit is not high, and can better meet the repositioning requirement of a mobile robot platform.
Drawings
FIG. 1 is a diagram of a hardware acceleration circuit based on a correlation scan matching algorithm according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail below with reference to the accompanying drawings in the embodiments of the present invention. To further illustrate the various embodiments, the invention provides the accompanying drawings. The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the embodiments.
One embodiment of the invention discloses a hardware acceleration circuit based on a correlation scan matching algorithm. The hardware acceleration circuit is electrically connected with the laser radar and comprises a memory module, a point cloud processing module, a grid index module, a state machine module and an interconnection bus. The memory module, the point cloud processing module, the grid index module and the state machine module exchange data through the interconnection bus. The pipeline structure arranged in the point cloud processing module and the pipeline structure arranged in the node searching module operate in parallel. The interconnection bus serves as the interconnection structure of the hardware acceleration circuit and carries data transmission and data sharing among its circuit modules, so that the point cloud processing module, the grid index module and the state machine module are all connected to the memory space of the memory module through it; the interconnection bus also communicates and exchanges data with the CPU or an external bus through a bus interface arranged in the hardware acceleration circuit. The state machine module is a finite state machine configured as the control unit of the hardware acceleration circuit, and this control unit schedules data transmission on the interconnection bus.
The point cloud processing module is also used for reading the currently stored point cloud from the memory module under the control of the state machine module and then applying a rotation transformation to the read point cloud. The currently stored point clouds are the batch of point clouds on which the hardware acceleration circuit performs the correlation scan matching operation, referred to for short as the currently processed batch of point clouds, and the number of working-state cycles scheduled by the state machine module equals the number of point clouds in the batch. Specifically, the stored point clouds are read from the box labeled Point Cloud in fig. 1. The rotation transformation is then applied to the read point cloud so that the point cloud covers the angle search range of the search window according to the preset angle search step length; the rotation transformation result then replaces the currently processed batch of point clouds, and the next rotation transformation is applied to the updated point cloud.
Under the control of the state machine module, the point cloud processing module reads the current rotation transformation result (the newly obtained batch of currently processed point clouds) and discretizes it. The discretization includes offset transformation of the horizontal and vertical coordinate values and scaling by the map resolution; the result is set as the discrete point cloud, which falls into the coordinate system of the pre-expanded grid map, so that the currently processed batch of point clouds is aligned into the coordinate system of the grid map to be searched. The discrete point cloud is then stored into the memory module through the interconnection bus. It is emphasized that each matching pass of the correlation scan matching of the point cloud against the grid map to be searched comprises one rotation transformation and one discretization and yields the discrete point cloud; that is, the point cloud is discretized onto the coordinate system of the grid map to be searched (the grid coordinate system), where the discretization comprises coordinate translation and resolution conversion. The currently stored point cloud, that is, the point cloud in the box labeled Point Cloud in fig. 1, is acquired from the laser radar. It should be noted that the value in each grid of the grid map represents the probability that the grid is occupied. The process of aligning the point cloud to the grid map to be searched is understood as follows: the point cloud processing unit preferably calls a connected computing unit to perform rotation and translation so that the point cloud coincides with the obstacles, which improves robustness.
The grid index module is used for acquiring the discrete point cloud from the memory module under the control of the state machine module, calculating an index value of the discrete point cloud mapped to a map storage space according to a preset coordinate offset value and a search step length, and setting the index value as a reading address of the occupation probability of the corresponding grid point; and the search step length is the number of the grid searches which need to be traversed for searching one pose in the same search window. One node currently searched by the search window corresponds to: and the currently read point cloud is transformed to grid points based on the pose parameters of the search window. And when the point cloud processing module executes discretization once, the grid indexing module executes accumulation operation once to obtain a probability value contained in a grid point currently searched by the search window as a score value of the grid point.
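The index and accumulation steps described above can be sketched in software as follows. This is a minimal illustration, not the circuit itself: the row-major layout of the map storage space, the function names `grid_index` and `score_candidate`, and the choice to skip out-of-range points are all assumptions that the text does not specify.

```python
from typing import Sequence, Tuple


def grid_index(ix: int, iy: int, width: int,
               offset_x: int = 0, offset_y: int = 0) -> int:
    """Map a discrete point to a read address in the map storage space.

    Assumption: the map is stored row by row (row-major), and the preset
    coordinate offset is simply added to the discrete coordinates.
    """
    return (iy + offset_y) * width + (ix + offset_x)


def score_candidate(points: Sequence[Tuple[int, int]],
                    grid: Sequence[float], width: int, height: int,
                    offset_x: int = 0, offset_y: int = 0) -> float:
    """Accumulate the occupation probabilities hit by the discrete points."""
    total = 0.0
    for ix, iy in points:
        gx, gy = ix + offset_x, iy + offset_y
        # Assumption: points that fall outside the map contribute nothing.
        if 0 <= gx < width and 0 <= gy < height:
            total += grid[grid_index(ix, iy, width, offset_x, offset_y)]
    return total
```

For a 3 x 2 map stored as a flat list, `score_candidate([(1, 0), (2, 1)], grid, 3, 2)` reads addresses 1 and 5 and sums the two probabilities, which mirrors one accumulation per discretized point.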
The memory module is provided with a map storage space and is used for storing the grid map to be searched transmitted by the bus interface; the memory module is used for storing a circle of point cloud collected by the laser radar and a preset trigonometric function value; the probability of occupation of the point cloud in the grid map to be searched for is matched with the probability of occupation of the grid points; those skilled in the art can easily understand on the basis of understanding the grid map constructed by the laser point cloud that: the occupation probability of the corresponding grid point is the occupation probability of the grid point matched with the point cloud in the grid map, and a matched index value exists, and the occupation probability of the corresponding grid point is also the probability value of the occupation probability of the matched grid point after the point cloud is converted into the grid in the grid map to be searched through the point cloud processing module.
It should be noted that the memory module in the above embodiments is essentially a storage medium, which may be, but is not limited to, a Read-Only Memory (ROM), a Random Access Memory (RAM), or another storage medium capable of storing program code. The point cloud processing module, the grid index module and the state machine module disclosed in the embodiments of the invention may be, but are not limited to, digital circuit modules written by a designer in the hardware description language Verilog HDL, or digital circuit modules drawn or written by a designer in software that provides circuit drawing or coding functions. In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units may be integrated into one module.
In this embodiment, a hardware acceleration circuit controlled by a state machine implements the correlation scan matching algorithm mentioned in the background art. Specifically, computing the index value of the occupation probability of a grid point in the grid map to be searched is divided into two steps: the read point cloud is first rotated, and the rotation result is then discretized. This realizes scan matching of the current frame of point cloud against the grid map to be searched: the discrete point cloud coordinates are converted into the index value of the probability value of the matched grid point, that is, the index in the map storage space where the probability values of the grid map to be searched are stored.
On the other hand, when the point cloud processing module is designed as a pipeline structure for coordinate system transformation and the grid index module is designed as a pipeline structure for indexing occupation probability values, the state machine can schedule, in a specific clock cycle, the reading of the latest rotation-transformed point cloud by the point cloud processing module and the discretization of the original point cloud by the point cloud processing module to execute in parallel, or schedule the horizontal and vertical coordinates of the point cloud to be transformed simultaneously so that the discretization results of the horizontal coordinate and the vertical coordinate are obtained together. This further ensures that the indexed occupation probability is shared on the interconnection bus in real time so that the CPU can read it.
As an embodiment, as shown in fig. 1, the point cloud processing module includes a point cloud rotation sub-module; the point cloud rotation submodule comprises a first register, a second register, a first pipeline structure and a second pipeline structure; the first register and the second register are connected with a first pipeline structure, and the first pipeline structure is used for reading the abscissa of the currently stored point cloud from the memory module under the control of the state machine module and then controlling the read abscissa of the point cloud to execute rotation transformation; the first register and the second register are connected with a second pipeline structure, and the second pipeline structure is used for reading the vertical coordinate of the currently stored point cloud from the memory module under the control of the state machine module and then controlling the read vertical coordinate of the point cloud to execute rotation transformation; the interconnection bus transmits the abscissa of the point cloud to the first register; the abscissa of the point cloud transmitted by the interconnection bus to the first register is derived from the abscissa of the currently stored point cloud; wherein the interconnection bus caches a vertical coordinate of the point cloud to a second register; the vertical coordinate of the point cloud transmitted to the second register by the interconnection bus is derived from the vertical coordinate of the currently stored point cloud; wherein the first pipeline structure and the second pipeline structure are parallel pipeline structures.
As an embodiment of the first pipeline structure, the first pipeline structure includes a first point cloud multiplier, a second point cloud multiplier, and a point cloud subtracter; the first point cloud multiplier and the second point cloud multiplier are both multipliers, represented as circles marked with "x" in the point cloud rotation sub-module of fig. 1; the point cloud subtracter is a subtracter, represented as a circle marked with "-" in the point cloud rotation sub-module of fig. 1.
The first input end of the first point cloud multiplier is connected with the output end of the first register; the interconnection bus transmits the abscissa of the point cloud to the input end of the first register, and this abscissa is derived from the abscissa of the currently stored point cloud. In the embodiment shown in FIG. 1, the first input end of the first point cloud multiplier (the first multiplier from top to bottom in the point cloud rotation submodule of FIG. 1) is connected with the output end of the register P_x (the box labeled P_x), where the register P_x is the first register.
The second input end of the first point cloud multiplier is used for receiving a cosine function value corresponding to a rotation angle reached by the current rotation transformation, wherein the rotation angle reached by the current rotation transformation is a sum of an angle search step length and a rotation angle reached by the last rotation transformation, and is derived from the rotation angle obtained by stepping rotation operation executed on a CPU or software level and the cosine function value corresponding to the rotation angle, so that the complexity of a hardware unit caused by trigonometric function operation is avoided, and hardware transmission delay is prevented from being caused. Preferably, the cosine function value transmitted by the interconnection bus is supported to be registered in a register inside the point cloud rotation submodule, and the second input end of the first point cloud multiplier receives the cosine function value transmitted by the interconnection bus through the connected register. In the embodiment shown in fig. 1, the second input terminal of the first point cloud multiplier is connected to the output terminal of the register COS (box labeled with COS), where the register COS is used to buffer the cosine function value transmitted from the interconnect bus, and then transmit the currently buffered cosine function value to the second input terminal of the first point cloud multiplier. And the first point cloud multiplier is used for controlling the abscissa of the point cloud received by the first input end of the first point cloud multiplier to be multiplied by the cosine function value corresponding to the rotation angle reached by the current rotation transformation received by the second input end of the first point cloud multiplier, and outputting the product value obtained by multiplication. 
The angle searching step length is stored in a searching window parameter register; the search window parameter register is a parameter register set inside the hardware acceleration circuit and used for storing the pose included by the search window and the associated stepping search information.
The first input end of the second point cloud multiplier is connected with the output end of the second register; the interconnection bus caches the vertical coordinate of the point cloud in the second register, and this vertical coordinate is derived from the vertical coordinate of the currently stored point cloud. In the embodiment shown in FIG. 1, the first input end of the second point cloud multiplier (the second multiplier from top to bottom in the point cloud rotation sub-module of FIG. 1) is connected with the output end of the register P_y (the box labeled P_y), where the register P_y is the second register.
The second input end of the second point cloud multiplier is used for receiving a sine function value corresponding to the rotation angle reached by the current rotation transformation; the rotation angle achieved by the current rotation transformation is the sum of the angle search step length and the rotation angle achieved by the last rotation transformation, and is derived from the rotation angle obtained by the stepping rotation operation executed on the CPU or software level and the corresponding sine function value, so that the increase of the complexity of a hardware unit due to trigonometric function operation is avoided, and hardware transmission delay is prevented from being caused. Preferably, the sine function value transmitted by the interconnection bus is supported to be registered in a register inside the point cloud rotation submodule, and a second input end of the second point cloud multiplier receives the sine function value transmitted by the interconnection bus through a connected register. In the embodiment shown in fig. 1, a second input terminal of the second point cloud multiplier is connected to an output terminal of a register SIN (a box labeled with SIN), where the register SIN is configured to buffer the sine function value transmitted by the interconnection bus, and then transmit the currently buffered sine function value to the second input terminal of the second point cloud multiplier. And the second point cloud multiplier is used for controlling the vertical coordinate of the point cloud received by the first input end of the second point cloud multiplier to be multiplied by a sine function value corresponding to the rotation angle reached by the current rotation transformation received by the second input end of the second point cloud multiplier, and outputting a product value obtained by multiplication.
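The software-side precomputation mentioned above, in which the CPU steps the rotation angle and supplies the matching cosine and sine values so that the circuit needs no trigonometric unit, might look like the sketch below. The function name `trig_table`, the use of radians, and the convention that the first entry corresponds to the starting angle itself are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def trig_table(start_angle: float, angle_step: float,
               num_steps: int) -> List[Tuple[float, float, float]]:
    """Precompute (angle, cos, sin) for each stepped rotation angle.

    Each angle is the previous angle plus the angle search step, as in the
    text. Assumption: angles are in radians and the table starts at
    start_angle.
    """
    table = []
    angle = start_angle
    for _ in range(num_steps):
        table.append((angle, math.cos(angle), math.sin(angle)))
        angle += angle_step  # rotation angle reached = last angle + step
    return table
```

Each row of such a table would correspond to one pair of values written to the COS and SIN registers before a rotation transformation.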
The first input end of the point cloud subtracter is connected with the output end of the first point cloud multiplier, and the second input end of the point cloud subtracter is connected with the output end of the second point cloud multiplier. The point cloud subtracter subtracts the product value output by the second point cloud multiplier from the product value output by the first point cloud multiplier, outputs the resulting difference value, and sets it as the abscissa value of the rotated point cloud obtained after the current rotation transformation. The point cloud subtracter also transmits the output difference value to the point cloud discretization submodule, and the difference value is set as the abscissa result of the current rotation transformation output by the first pipeline structure, for use in the next rotation transformation.
As an embodiment of the second pipeline structure, the second pipeline structure comprises a third point cloud multiplier, a fourth point cloud multiplier, and a point cloud adder; the third point cloud multiplier and the fourth point cloud multiplier are both multipliers, represented as circles marked with "x" in the point cloud rotation sub-module of fig. 1; the point cloud adder is an adder, represented as a circle marked with "+" in the point cloud rotation sub-module of fig. 1.
The first input end of the third point cloud multiplier is connected with the output end of the first register and is used for receiving the abscissa of the point cloud output by the first register. In the embodiment shown in fig. 1, the first input end of the third point cloud multiplier (the third multiplier from top to bottom in the point cloud rotation sub-module of fig. 1) is connected with the output end of the register P_x (the box labeled P_x).
The second input end of the third point cloud multiplier is used for receiving the sine function value corresponding to the rotation angle reached by the current rotation transformation; the rotation angle reached by the current rotation transformation is the sum of the angle search step length and the rotation angle reached by the last rotation transformation. Preferably, the sine function value transmitted by the interconnection bus is registered in a register inside the point cloud rotation submodule, and the second input end of the third point cloud multiplier receives the sine function value through the connected register. In the embodiment shown in fig. 1, the second input end of the third point cloud multiplier is connected with the output end of the register SIN (the box labeled SIN), which buffers the sine function value transmitted by the interconnection bus and then transmits the currently buffered sine function value to the second input end of the third point cloud multiplier. The third point cloud multiplier multiplies the abscissa of the point cloud received at its first input end by the sine function value corresponding to the rotation angle reached by the current rotation transformation, and outputs the resulting product value.
The first input end of the fourth point cloud multiplier is connected with the output end of the second register and is used for receiving the vertical coordinate of the point cloud output by the second register. In the embodiment shown in FIG. 1, the first input end of the fourth point cloud multiplier (the fourth multiplier from top to bottom in the point cloud rotation sub-module of FIG. 1) is connected with the output end of the register P_y (the box labeled P_y).
The second input end of the fourth point cloud multiplier is used for receiving the cosine function value corresponding to the rotation angle reached by the current rotation transformation. In the embodiment shown in fig. 1, the second input end of the fourth point cloud multiplier is connected with the output end of the register COS (the box labeled COS), which buffers the cosine function value transmitted by the interconnection bus and then transmits the currently buffered cosine function value to the second input end of the fourth point cloud multiplier. The fourth point cloud multiplier multiplies the vertical coordinate of the point cloud received at its first input end by the cosine function value received at its second input end, and outputs the resulting product value.
The first input end of the point cloud adder is connected with the output end of the third point cloud multiplier and receives the product value output by the third point cloud multiplier; the second input end of the point cloud adder is connected with the output end of the fourth point cloud multiplier and receives the product value output by the fourth point cloud multiplier. The point cloud adder adds the two product values, outputs the resulting sum value, and sets it as the vertical coordinate value of the rotated point cloud obtained after the current rotation transformation. The point cloud adder also transmits the output sum value to the point cloud discretization submodule, and the sum value is set as the vertical coordinate result of the current rotation transformation output by the second pipeline structure, for use in the next rotation transformation.
In the embodiments of the first pipeline structure and the second pipeline structure, the clock cycle delay of the output result of the first pipeline structure and the clock cycle delay of the output result of the second pipeline structure both depend on the number of stages of the pipelines, specifically, depend on the number of serially connected computing units employed in each pipeline structure. The embodiment of the first pipeline structure and the second pipeline structure takes a rotation matrix as a basic operation framework, only an adder, a subtracter and a multiplier are used for constructing a parallel pipeline structure for simultaneously rotating and transforming the abscissa of the point cloud and the ordinate of the point cloud, and the parallel pipeline structure is used for receiving and processing a trigonometric function value corresponding to a stepping rotation angle calculated in advance by software, so that a trigonometric function hardware calculation module is omitted, a point cloud rotation sub-module is controlled to iteratively execute the rotation transformation on the basis of stepping rotation, the abscissa and the ordinate after the rotation transformation are obtained, and the obtained abscissa and the ordinate are provided for the point cloud discretization sub-module to accelerate the discretization operation.
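The arithmetic performed by the two parallel pipeline structures can be written out as a short software model. This is only a sketch of the data flow described above, with one variable per multiplier; the function name `rotate_point` is hypothetical, and the model ignores clock cycles, register widths and fixed-point effects.

```python
from typing import Tuple


def rotate_point(px: float, py: float,
                 cos_v: float, sin_v: float) -> Tuple[float, float]:
    """Rotate one point using precomputed cosine and sine values.

    px, py stand for the contents of registers P_x and P_y; cos_v and sin_v
    stand for the contents of registers COS and SIN.
    """
    m1 = px * cos_v  # first point cloud multiplier
    m2 = py * sin_v  # second point cloud multiplier
    m3 = px * sin_v  # third point cloud multiplier
    m4 = py * cos_v  # fourth point cloud multiplier
    x_rot = m1 - m2  # point cloud subtracter (first pipeline structure)
    y_rot = m3 + m4  # point cloud adder (second pipeline structure)
    return x_rot, y_rot
```

In this model each coordinate passes through a multiply stage followed by one add or subtract stage, which is consistent with the statement that the output latency depends on the number of serially connected computing units.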
As an embodiment, the point cloud processing module further comprises a point cloud discrete sub-module; the point cloud discretization submodule comprises a third pipeline structure; and the third pipeline structure is used for reading the abscissa result (actually, the output result of the first pipeline structure in a specific clock cycle) converted by the current rotation output by the first pipeline structure under the control of the state machine module, controlling the abscissa result converted by the current rotation to execute discretization, setting the discretization result as the abscissa value of the discrete point cloud, and storing the abscissa value of the discrete point cloud into the memory module through the interconnection bus.
The third pipeline structure comprises a first discrete adder, a first discrete subtracter and a first discrete multiplier; the first discrete adder is an adder, represented as a circle marked with "+" in the point cloud discretization submodule of fig. 1; the first discrete subtracter is a subtracter, represented as a circle marked with "-" in the point cloud discretization sub-module of fig. 1; the first discrete multiplier is a multiplier, represented as a circle marked with "x" in the point cloud discretization sub-module of fig. 1.
The first input end of the first discrete adder is connected with the output end of the point cloud subtracter in the first pipeline structure, and the first input end of the first discrete adder is used for receiving the abscissa result converted from the current rotation.
The second input end of the first discrete adder is used for receiving the abscissa of the robot position transmitted by the interconnection bus, where the abscissa of the robot position is the horizontal axis coordinate value of the robot in the world coordinate system, calculated in advance (by software or a CPU unit). Preferably, the abscissa of the robot position transmitted by the interconnection bus is registered in a register inside the point cloud discretization sub-module, and the second input end of the first discrete adder receives the abscissa of the robot position through the connected register. As shown in FIG. 1, the second input end of the first discrete adder is connected with the register Pose_x (the box labeled Pose_x), where the register Pose_x buffers the abscissa of the robot position transmitted by the interconnection bus and then transmits the currently buffered abscissa to the second input end of the first discrete adder. The laser radar is mounted on the robot and acts as a sensor device for robot positioning. The first discrete adder adds the abscissa result of the current rotation transformation to the abscissa of the robot position and outputs the resulting sum value, which becomes the horizontal axis coordinate value of the point cloud converted to the world coordinate system.
The first input end of the first discrete subtracter is connected with the output end of the first discrete adder; the second input end of the first discrete subtracter is used for receiving the maximum abscissa value of the map transmitted by the interconnection bus. Preferably, the maximum abscissa value of the map transmitted by the interconnection bus is registered in a register inside the point cloud discretization submodule, and the second input end of the first discrete subtracter receives the maximum abscissa value of the map through the connected register. As shown in FIG. 1, the second input end of the first discrete subtracter is connected with the register MAX_x (the box labeled MAX_x), where the register MAX_x buffers the maximum abscissa value of the map transmitted by the interconnection bus; the maximum abscissa value of the map is preferably a boundary coordinate in the transverse direction of the map. The register MAX_x then transmits the currently buffered maximum abscissa value of the map to the second input end of the first discrete subtracter. The maximum abscissa value of the map is derived from a map size register and is transmitted to the interconnection bus by the map size register; the map size register is a parameter register arranged in the hardware acceleration circuit and used for storing the size range of the grid map that meets the transmission requirement of the bus bit width; the parameter register set shown in fig. 1 includes the map size register. The first discrete subtracter performs a subtraction between the maximum abscissa value of the map and the sum value output by the first discrete adder and outputs the resulting difference value, which becomes an abscissa value of the grid map that meets the transmission requirement of the bus bit width.
The first input end of the first discrete multiplier is connected with the output end of the first discrete subtracter; the second input end of the first discrete multiplier is used for receiving the reciprocal of the map resolution transmitted by the interconnection bus; preferably, the reciprocal of the map resolution transmitted by the interconnection bus is supported to be registered in a register inside the point cloud discrete submodule, and the second input end of the first discrete multiplier receives the reciprocal of the map resolution transmitted by the interconnection bus through the connected register.
In the embodiment shown in fig. 1, the second input end of the first discrete multiplier is connected with the register 1/R (the box labeled 1/R), which buffers the reciprocal of the map resolution transmitted by the interconnection bus and then transmits the currently buffered reciprocal to the second input end of the first discrete multiplier. The reciprocal of the map resolution is derived from a map resolution register and is transmitted to the interconnection bus by the map resolution register; the map resolution register is a parameter register arranged in the hardware acceleration circuit and used for storing resolution information of the first grid map to be searched, corresponding to one layer. The parameter register set shown in fig. 1 includes the map resolution register. The first discrete multiplier multiplies the reciprocal of the map resolution by the difference value output by the first discrete subtracter, outputs the resulting product value, and configures it as the abscissa value of the discrete point cloud; the first discrete multiplier also outputs this product value to the memory module through the interconnection bus. Once the product value output by the first discrete multiplier is configured as the abscissa value of the discrete point cloud, discretization of the abscissa is determined to be complete, so that the abscissa value of the currently stored point cloud is aligned into the coordinate system of the grid map to be searched; at the same time, the product value output by the first discrete multiplier is set as the discretization result output by the third pipeline structure.
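The add, subtract and multiply chain of the third pipeline structure can be modeled in a few lines. This is a sketch under stated assumptions: the function name `discretize_abscissa` is hypothetical, and the operand order of the subtraction (map maximum minus world coordinate) is an assumption, since the text only says that the two values are subtracted. Any rounding to an integer grid coordinate is left out because the text does not describe it.

```python
def discretize_abscissa(x_rot: float, pose_x: float,
                        max_x: float, inv_resolution: float) -> float:
    """Model the third pipeline structure for one rotated abscissa.

    pose_x, max_x and inv_resolution stand for the contents of registers
    Pose_x, MAX_x and 1/R.
    """
    world_x = x_rot + pose_x       # first discrete adder
    # Assumption: the subtracter computes max_x - world_x.
    diff = max_x - world_x         # first discrete subtracter
    return diff * inv_resolution   # first discrete multiplier


# Example: a resolution of 0.05 m per cell gives a reciprocal of 20.
cell_x = discretize_abscissa(1.0, 2.0, 10.0, 20.0)
```

Multiplying by the stored reciprocal rather than dividing by the resolution keeps the hardware to an adder, a subtracter and a multiplier, which matches the stated goal of avoiding more complex arithmetic units.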
As an embodiment, the point cloud discretization sub-module further comprises a fourth pipeline structure; and the fourth pipeline structure is used for reading the vertical coordinate result (actually, the output result of the first pipeline structure in a specific clock cycle) converted by the current rotation output by the first pipeline structure under the control of the state machine module, controlling the vertical coordinate result converted by the current rotation to execute discretization, setting the discretization result as the vertical coordinate value of the discrete point cloud, and storing the vertical coordinate value of the discrete point cloud into the memory module through the interconnection bus.
The fourth pipeline structure comprises a second discrete adder, a second discrete subtracter and a second discrete multiplier; the second discrete adder is an adder, shown as a circle marked with "+" in the point cloud discretization sub-module of fig. 1; the second discrete subtracter is a subtracter, shown as a circle marked with "-" in the point cloud discretization sub-module of fig. 1; the second discrete multiplier is a multiplier, shown as a circle marked with "x" in the point cloud discretization sub-module of fig. 1.
The first input end of the second discrete adder is connected with the output end of the point cloud adder of the second pipeline structure, and the first input end of the second discrete adder is used for receiving the ordinate result converted from the current rotation.
The second input end of the second discrete adder is used for receiving the vertical coordinate of the robot position transmitted by the interconnection bus, wherein the vertical coordinate of the robot position is a pre-calculated vertical coordinate value of the robot in a world coordinate system; preferably, the vertical coordinate of the robot position transmitted by the interconnection bus is registered in a register inside the point cloud discretization sub-module, and the second input end of the second discrete adder receives the vertical coordinate of the robot position transmitted by the interconnection bus through the connected register. As shown in fig. 1, the second input end of the second discrete adder is connected to the register Pose_y (the box labeled Pose_y), where the register Pose_y is used for caching the vertical coordinate of the robot position transmitted by the interconnection bus and then transmitting the currently cached vertical coordinate of the robot position to the second input end of the second discrete adder. The robot position register is used for storing the vertical coordinate of the robot position and the horizontal coordinate of the robot position, and the robot position register is a parameter register arranged in the hardware acceleration circuit; the second discrete adder is used for adding the vertical coordinate result converted by the current rotation to the vertical coordinate of the robot position and outputting the resulting sum, so that the sum becomes the vertical coordinate value of the point cloud converted to the world coordinate system.
The first input end of the second discrete subtracter is connected with the output end of the second discrete adder; the second input end of the second discrete subtracter is used for receiving the maximum ordinate value of the map transmitted by the interconnection bus; preferably, the maximum ordinate value of the map transmitted by the interconnection bus can be registered in a register inside the point cloud discretization sub-module, and the second input end of the second discrete subtracter receives the maximum ordinate value of the map transmitted by the interconnection bus through the connected register. As shown in fig. 1, the second input end of the second discrete subtracter is connected to the register MAX_y (the box marked with MAX_y), where the register MAX_y is used for caching the maximum ordinate value of the map (preferably a boundary coordinate in the longitudinal direction of the map) transmitted by the interconnection bus and then transmitting the currently cached maximum ordinate value of the map to the second input end of the second discrete subtracter. The maximum ordinate value of the map is derived from the map size register and transmitted by the map size register to the interconnection bus; the second discrete subtracter is used for performing a subtraction between the sum value output by the second discrete adder and the maximum ordinate value of the map, and then outputting the resulting difference value; the difference value output by the second discrete subtracter becomes an ordinate value of the grid map that meets the transmission requirement of the bus bit width.
The first input end of the second discrete multiplier is connected with the output end of the second discrete subtracter; the second input end of the second discrete multiplier is used for receiving the reciprocal of the map resolution transmitted by the interconnection bus; preferably, the reciprocal of the map resolution transmitted by the interconnection bus can be registered in a register inside the point cloud discretization sub-module, and the second input end of the second discrete multiplier receives it through the connected register. As shown in fig. 1, the second input end of the second discrete multiplier is connected to the register 1/R (the box labeled 1/R). The second discrete multiplier is used for multiplying the reciprocal of the map resolution by the difference value output by the second discrete subtracter and outputting the product value; the product value is configured as the ordinate value of the discrete point cloud, at which point the discretization of the ordinate is determined to be complete, the ordinate value of the currently stored point cloud has been aligned into the coordinate system of the grid map to be searched, and the product value is set as the discretization result output by the fourth pipeline structure. The second discrete multiplier is also used for outputting the product value to the memory module through the interconnection bus, the product value being determined as the matched ordinate, on the grid map to be searched, of the node currently being aligned and matched.
Based on the above embodiments, the third pipeline structure and the fourth pipeline structure operate in parallel. For each point cloud, the abscissa and the ordinate are read from the storage space in which the point cloud is stored and input to the point cloud discretization sub-module; through an adder and a subtracter, the abscissa and the ordinate are transformed from the laser coordinate system to the coordinate system of the grid map to be searched, each being offset in the same way along its own coordinate axis direction, wherein the coordinate axes of the point cloud are exchanged to realize the coordinate system transformation; the results are then multiplied by the reciprocal of the map resolution, so that the discretization of the horizontal and vertical coordinates of the point cloud is completed by the hardware pipeline.
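The adder, subtracter and multiplier chain described above can be modelled in software as a short sketch. This is only an illustration, not the circuit itself: the function name and argument names are hypothetical, the abscissa path is assumed to mirror the ordinate path (with Pose_x and MAX_x counterparts), the operand order of the subtraction (world coordinate minus map maximum) is an assumption because the text does not fix it, and the axis exchange mentioned above is not modelled.

```python
def discretize_point(px, py, pose_x, pose_y, max_x, max_y, inv_resolution):
    """Model the adder -> subtracter -> multiplier chain for one rotated point.

    All names are illustrative; the two axes are computed independently,
    as the third and fourth pipeline structures would do in parallel.
    """
    # Discrete adder: shift the rotated point by the robot position (world frame).
    world_x = px + pose_x
    world_y = py + pose_y
    # Discrete subtracter: offset against the map boundary coordinate.
    # Assumption: operand order is (world - max); the source does not state it.
    diff_x = world_x - max_x
    diff_y = world_y - max_y
    # Discrete multiplier: scale by the reciprocal of the map resolution (1/R).
    return diff_x * inv_resolution, diff_y * inv_resolution
```

Multiplying by a stored reciprocal instead of dividing by the resolution is what lets the hardware use a multiplier rather than a divider in this stage.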
As an embodiment, as shown in fig. 1, the grid index module includes a fifth pipeline structure; the fifth pipeline structure comprises a first index subtracter and a first index adder; the first index subtracter is a subtracter, shown as a circle marked with "-" in the grid index module of fig. 1; the first index adder is an adder, shown as a circle marked with "+" in the grid index module of fig. 1.
The first input end of the first index subtracter is used for receiving the abscissa of the discrete point cloud transmitted by the interconnection bus, wherein the abscissa of the discrete point cloud transmitted by the interconnection bus to the first index subtracter is derived from the memory module; preferably, the abscissa of the discrete point cloud transmitted by the interconnection bus can be registered in a register inside the grid index module, and the first input end of the first index subtracter receives the abscissa of the discrete point cloud transmitted by the interconnection bus through the connected register. In the embodiment shown in fig. 1, the first input end of the first index subtracter (the first subtracter from top to bottom in the grid index module of fig. 1) is connected to the register DP_x (the box marked with DP_x), where the register DP_x is used for buffering the abscissa of the discrete point cloud transmitted by the interconnection bus and then transmitting the currently buffered abscissa to the first input end of the first index subtracter, so that the first input end of the first index subtracter receives the abscissa of the discrete point cloud transmitted by the interconnection bus.
The second input end of the first index subtracter is used for receiving the horizontal axis coordinate offset value transmitted by the interconnection bus, wherein the preset coordinate offset value comprises the horizontal axis coordinate offset value; preferably, the horizontal axis coordinate offset value transmitted by the interconnection bus can be registered in a register inside the grid index module, and the second input end of the first index subtracter receives the horizontal axis coordinate offset value transmitted by the interconnection bus through the connected register. As shown in fig. 1, the second input end of the first index subtracter is connected to the register Offset_x (the box marked with Offset_x), where the register Offset_x is used for buffering the horizontal axis coordinate offset value transmitted by the interconnection bus and then transmitting the currently buffered horizontal axis coordinate offset value to the second input end of the first index subtracter. The horizontal axis coordinate offset value transmitted by the interconnection bus is derived from the preset coordinate offset value stored in the map offset value register; the map offset value register is a parameter register arranged in the hardware acceleration circuit and is used for storing the coordinate offset value associated with the grid map to be searched; the parameter register set shown in fig. 1 includes the map offset value register. The first index subtracter is used for performing a subtraction between the abscissa of the discrete point cloud received by the first input end and the horizontal axis coordinate offset value received by the second input end, and outputting the difference value.
The first input end of the first index adder is connected with the output end of the first index subtracter; the second input end of the first index adder is used for receiving the horizontal axis coordinate search step transmitted by the interconnection bus, wherein the search step comprises the horizontal axis coordinate search step; preferably, the horizontal axis coordinate search step transmitted by the interconnection bus can be registered in a register inside the grid index module, and the second input end of the first index adder receives the horizontal axis coordinate search step transmitted by the interconnection bus through the connected register. As shown in fig. 1, the second input end of the first index adder is connected to the register Step_x (the box labeled Step_x), where the register Step_x is used for caching the horizontal axis coordinate search step transmitted by the interconnection bus and then transmitting the currently cached horizontal axis coordinate search step to the second input end of the first index adder. The horizontal axis coordinate search step transmitted by the interconnection bus is derived from the search step stored in the search window parameter register; the search window parameter register is a parameter register arranged in the hardware acceleration circuit and is used for storing the position search information of the nodes in the search window and their associated child nodes; the parameter register set shown in fig. 1 includes the search window parameter register. The sum value output by the first index adder is configured as the horizontal axis direction index value of the discrete point cloud mapped into the map storage space and is set as the result output by the fifth pipeline structure.
As an embodiment, the grid index module includes a sixth pipeline structure; the sixth pipeline structure comprises a second index subtracter and a second index adder; the second index subtracter is a subtracter, shown as a circle marked with "-" in the grid index module of fig. 1; the second index adder is an adder, shown as a circle marked with "+" in the grid index module of fig. 1.
The first input end of the second index subtracter is used for receiving the vertical coordinate of the discrete point cloud transmitted by the interconnection bus; preferably, the vertical coordinate of the discrete point cloud transmitted by the interconnection bus can be registered in a register inside the grid index module, and the first input end of the second index subtracter receives the vertical coordinate of the discrete point cloud transmitted by the interconnection bus through the connected register. As shown in fig. 1, the first input end of the second index subtracter (the second subtracter from top to bottom in the grid index module of fig. 1) is connected to the register DP_y (the box marked with DP_y), where the register DP_y is used for caching the vertical coordinate of the discrete point cloud transmitted by the interconnection bus and then transmitting the currently cached vertical coordinate to the first input end of the second index subtracter, so that the first input end of the second index subtracter receives the vertical coordinate of the discrete point cloud transmitted by the interconnection bus; wherein the vertical coordinate of the discrete point cloud transmitted to the second index subtracter by the interconnection bus is derived from the memory module.
The second input end of the second index subtracter is used for receiving the vertical axis coordinate offset value transmitted by the interconnection bus, wherein the preset coordinate offset value also comprises the vertical axis coordinate offset value; preferably, the vertical axis coordinate offset value transmitted by the interconnection bus can be registered in a register inside the grid index module, and the second input end of the second index subtracter receives the vertical axis coordinate offset value transmitted by the interconnection bus through the connected register. As shown in fig. 1, the second input end of the second index subtracter is connected to the register Offset_y (the box marked with Offset_y), where the register Offset_y is used for buffering the vertical axis coordinate offset value transmitted by the interconnection bus and then transmitting the currently buffered vertical axis coordinate offset value to the second input end of the second index subtracter; the vertical axis coordinate offset value transmitted by the interconnection bus is also derived from the preset coordinate offset value stored in the map offset value register. The second index subtracter is used for performing a subtraction between the ordinate of the discrete point cloud received by the first input end and the vertical axis coordinate offset value received by the second input end, and outputting the difference value.
The first input end of the second index adder is connected with the output end of the second index subtracter; the second input end of the second index adder is used for receiving the vertical axis coordinate search step transmitted by the interconnection bus; preferably, the vertical axis coordinate search step transmitted by the interconnection bus can be registered in a register inside the grid index module, and the second input end of the second index adder receives the vertical axis coordinate search step transmitted by the interconnection bus through the connected register. As shown in fig. 1, the second input end of the second index adder is connected to the register Step_y (the box labeled Step_y), where the register Step_y is used for caching the vertical axis coordinate search step transmitted by the interconnection bus and then transmitting the currently cached vertical axis coordinate search step to the second input end of the second index adder. The search step also comprises the vertical axis coordinate search step, and the vertical axis coordinate search step transmitted by the interconnection bus is also derived from the search step stored in the search window parameter register; the sum value output by the second index adder is configured as the vertical axis direction index value of the discrete point cloud mapped into the map storage space and is set as the result output by the sixth pipeline structure.
The fifth pipeline structure and the sixth pipeline structure operate in parallel within the same grid index module; using a combined subtraction and addition structure, they respectively calculate the horizontal axis direction index value and the vertical axis direction index value mapped into the map storage space, so that the abscissa and the ordinate of the discrete point cloud are converted in parallel into the discrete index information of the grid point currently taking part in the search of the grid map to be searched.
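As a rough software analogue of the fifth and sixth pipeline structures, the per-axis index is the discrete coordinate minus the map offset plus the search step. The sketch below assumes integer discrete coordinates and an operand order of (coordinate minus offset); the function names are hypothetical and chosen only for illustration.

```python
def axis_index(discrete_coord, offset, step):
    """One subtracter followed by one adder, as in a single index pipeline."""
    # Index subtracter: remove the preset coordinate offset of the map.
    # Assumption: operand order is (coordinate - offset).
    diff = discrete_coord - offset
    # Index adder: apply the search step of the current pose in the search window.
    return diff + step


def grid_axis_indices(dp_x, dp_y, offset_x, offset_y, step_x, step_y):
    """Compute both axis indices; in hardware the two pipelines run in parallel."""
    return (axis_index(dp_x, offset_x, step_x),
            axis_index(dp_y, offset_y, step_y))
```

Because neither axis depends on the other, the two calls can be evaluated concurrently, which is the property the parallel pipelines exploit.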
Preferably, the grid index module further comprises a third index adder and an index multiplier; the third index adder is an adder and corresponds to the first adder from left to right in the grid index module of fig. 1; the index multiplier is a multiplier and corresponds to the only circle marked with "x" in the grid index module of fig. 1. The first input end of the index multiplier is connected with the output end of the second index adder and is used for receiving the sum value output by the second index adder; the second input end of the index multiplier is used for receiving the row grid number transmitted by the interconnection bus, wherein the row grid number is the number of grids that each row of the map storage space can hold; preferably, the row grid number transmitted by the interconnection bus can be registered in a register inside the grid index module, and the second input end of the index multiplier receives the row grid number transmitted by the interconnection bus through the connected register. In the embodiment shown in fig. 1, the second input end of the index multiplier is connected to the register NUM_x (the box marked with NUM_x), where the register NUM_x is used for caching the row grid number transmitted by the interconnection bus and then transmitting the currently cached row grid number to the second input end of the index multiplier; the row grid number is stored in a map size register and is transmitted to the interconnection bus by the map size register; the parameter register set shown in fig. 1 includes the map size register.
The map size register is a parameter register arranged in the hardware acceleration circuit and is used for storing the size range of the grid map to be searched and associated extended information; the index multiplier is used for multiplying the row grid number by the sum value output by the second index adder and then outputting the product. The first input end of the third index adder is connected with the output end of the first index adder, and the second input end of the third index adder is connected with the output end of the index multiplier; the third index adder is used for adding the product of the row grid number and the sum value output by the second index adder to the horizontal axis direction index value, setting the resulting sum as the index value of the discrete point cloud mapped into the map storage space, and sending the index value to the interconnection bus, so that the occupation probability matching the index value can be read from the memory module. In this preferred embodiment, the index multiplier multiplies the row grid number by the vertical axis direction index value, and the third index adder then adds the horizontal axis direction index value to the product output by the index multiplier; the index value at which the discrete point cloud maps to an occupation probability is thus calculated in a row scanning query manner and also serves as the index value mapped into the map storage space, which is equivalent both to the storage address of the occupation probability of the corresponding grid point in the map storage space and to the read address used externally to read the occupation probability stored in the map storage space, so that the interconnection bus can read the occupation probability from the memory module.
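The row scanning address calculation described above amounts to a row-major linear index. A minimal sketch follows, with a hypothetical function name and integer indices assumed; `num_x` stands in for the row grid number held in the register marked NUM_x.

```python
def row_scan_address(index_x, index_y, num_x):
    """Row-major address of a grid point in the map storage space."""
    # Index multiplier: vertical-axis index times the number of grids per row.
    row_base = index_y * num_x
    # Third index adder: add the horizontal-axis index to the row base.
    return index_x + row_base
```

Under this layout, consecutive horizontal indices map to consecutive addresses, so scanning along a row touches adjacent memory locations.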
Preferably, the grid index module further comprises a third index adder and an index multiplier, the third index adder belongs to the adder, and the index multiplier belongs to the multiplier; a first input end of the index multiplier is connected with an output end of the first index adder; the second input end of the index multiplier is used for receiving the number of the column grids transmitted by the interconnection bus; the grid number of the columns is the number of grids which can be occupied by each column of the map storage space, and the grid number is stored in a map size register and is transmitted to the interconnection bus by the map size register; the map size register is a parameter register set inside the hardware acceleration circuit and is used for storing the size range of the grid map to be searched and associated extended information. The first input end of the third index adder is connected with the output end of the second index adder, and the second input end of the third index adder is connected with the output end of the index multiplier; and the third index adder is used for controlling the addition of the product of the number of the column grids and the sum value output by the first index adder and the index value in the longitudinal axis direction, setting the sum value obtained by the addition as an index value of the discrete point cloud mapped into the map storage space and sending the index value to the interconnection bus. 
In this preferred embodiment, the index multiplier controls the number of columns of grids to be multiplied by the index value in the horizontal axis direction, then the third index adder controls the addition of the index value in the vertical axis direction and the product output by the index multiplier, and the index value of the occupation probability of the discrete point cloud mapped to the corresponding grid point is obtained by calculation in a column scanning query manner and is also used as the index value mapped to the map storage space, which is equivalent to the storage address of the occupation probability of the grid point in the map storage space and the read address of the occupation probability stored in the map storage space read from the outside, so that the interconnection bus reads the occupation probability from the memory module.
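For the column scanning variant, the same idea applies with the roles of the axes swapped. The sketch below also shows, under the assumption that the map is stored as a flat list in column-major order, how the computed index acts as a read address for the occupation probability. The helper names and the nested-list map representation are illustrative only.

```python
def column_scan_address(index_x, index_y, num_y):
    """Column-major address: horizontal index times grids per column, plus vertical index."""
    return index_y + index_x * num_y


def flatten_column_major(grid):
    """Flatten grid[y][x] into a list laid out column by column (illustrative storage)."""
    num_y = len(grid)
    num_x = len(grid[0])
    return [grid[y][x] for x in range(num_x) for y in range(num_y)]


def read_occupancy(flat_map, index_x, index_y, num_y):
    """Use the computed index value as the read address into the flat map."""
    return flat_map[column_scan_address(index_x, index_y, num_y)]
```

Whether row scanning or column scanning is preferable would depend on how the grid map was written into the map storage space; the two must agree for the read address to hit the intended grid point.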
On the basis of the above embodiment, the state machine module is a finite state machine; the state machine module is used for scheduling the working states of the memory module, the point cloud processing module and the grid index module, so that each time the point cloud processing module executes the rotation transformation and the discretization once, the grid index module calculates the index value of a point cloud at its matching grid point of the grid map to be searched, and one matching operation is determined to be finished; the index value is the index value of the occupation probability of the grid point matched with the point cloud in the grid map to be searched. Specifically, the point cloud processing module is further configured to update the currently stored point cloud with the result of the current rotation transformation each time a matching operation is completed or each time the rotation transformation is performed, so that the point cloud processing module performs each new rotation transformation only on the updated currently stored point cloud, until the index values of the occupation probabilities corresponding to all currently acquired point clouds have been calculated, that is, until all poses in the search window have been traversed and the occupation probabilities matched with all the poses have been obtained. In this embodiment, the state machine module controls the point cloud processing module, by means of hardware-scheduled working states and interrupt signals, to perform one rotation and one discretization on each point cloud in sequence, so that the occupation probability matching the index value is read out from the memory module; this realizes the operation of the loop state and ensures the real-time performance of the hardware acceleration circuit in searching for the map position.
In this embodiment, the hardware acceleration circuit is controlled by a finite state machine to run the correlation scan matching algorithm, so as to align point clouds to a grid map and obtain index values, where the finite state machine includes an idle state, a point cloud reading state, a trigonometric function reading state, a map reading state, a reading reset state, a point cloud rasterization state, a rasterization reset state, a search state, a loop state, and a data writing state.
The initial state is the idle state. When the working state of the state machine module switches to the point cloud reading state, the state machine module controls the memory module to read the currently stored point cloud from an external FIFO (first in first out) and store the read point cloud into the first storage space, that is, the point cloud is transmitted by the external FIFO to the storage space corresponding to the box marked with Point Cloud.
Once the currently collected full circle of point clouds has been completely read, the working state jumps to the trigonometric function reading state, in which the memory module is controlled to read the pre-calculated trigonometric function values from an external read FIFO (first in first out) module, wherein the pre-calculated trigonometric function values comprise the cosine function value and the sine function value corresponding to the rotation angle reached by the current rotation transformation; the read pre-calculated trigonometric function values are stored into a second storage space.
After all the pre-calculated trigonometric function values have been read, the working state jumps to the map reading state, in which the memory module is controlled to read the grid map to be searched from an external FIFO, and the read weight values are stored in the map storage space.
After the grid map to be searched is completely stored in the map storage space, the working state jumps to a reading reset state, the reading address of the memory module is controlled to reset, meanwhile, the writing address of the memory module is controlled to start to self-add, and a fourth storage space is opened up, so that the discrete point cloud output by the point cloud processing module has an empty storage space in the memory module.
After the write address of the memory module completes address setting by self-incrementing, the working state jumps to the point cloud rasterization state: the point cloud processing module is controlled to execute the rotation transformation and the discretization, and the discretization result (including the output result of the third pipeline structure and the output result of the fourth pipeline structure) is written into the fourth storage space through the interconnection bus, so that the result obtained by the current coordinate conversion in the correlation scan matching against the grid map to be searched is written into the fourth storage space. After the result is completely written into the fourth storage space, the working state jumps to the rasterization reset state: the read address and the write address of the first storage space are reset, the read address of the fourth storage space is reset, and the write address of the fourth storage space starts to self-increment, so that an empty address space continues to be opened up in the fourth storage space for storing the result of the next discretization. The point cloud participating in the next discretization is the result of one or more rotation transformations performed by the point cloud processing module, and the number of discretizations varies with the search poses defined by the search window; preferably, the number of rotation transformations is set according to the number of point clouds, and the present embodiment sets the number of rotation transformations equal to the number of cycles of the correlation scan matching algorithm.
After an address space is set for a result obtained by next discretization in the fourth storage space, the working state is converted into a searching state, and the grid index module is controlled to calculate an index value of the discrete point cloud mapped into the map storage space, namely an index value of a matching grid point of the grid map to be searched; and reading out corresponding occupation probability from the grid map to be searched which is stored in the map storage space according to the index value, obtaining a probability value of corresponding pose matching searched in the search window, and transmitting the probability value to the interconnection bus.
The working state then jumps to the loop state, and the value of the loop register is set according to the coordinate index value of the current search pose in the search window. If it is determined that the search of all poses in the search window has not been completed, the working state jumps back to the point cloud rasterization state, and the point cloud processing module is controlled to execute the rotation transformation and the discretization again, repeating the working states described above. While the working states are repeated, each time a matching operation is completed or a rotation transformation is executed, the currently stored point cloud is updated with the result of the current rotation transformation, so that the point cloud processing module performs each new rotation transformation only on the updated stored point cloud. This continues until the index values of the occupation probabilities corresponding to all currently acquired point clouds have been calculated, that is, until all poses in the search window have been traversed and the occupation probability matched to each pose has been obtained.
Preferably, if it is determined that the search of all poses in the search window has been completed, that is, the grid index module has traversed all poses in the search window, the working state jumps to the data writing state, in which all the occupation probabilities obtained by the operation are written into an external write FIFO through the interconnection bus. After the data writing is completed, the working state jumps back to the idle state to wait for the start of the next accelerated computation.
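The state sequence described above can be summarized with a small software model. This is a simplified sketch: the state names are paraphrases of the states in the text, the idle state is assumed to lead directly into rasterization (the address-setting steps that precede it are omitted), and `next_state` and `trace` are hypothetical helpers rather than part of the disclosed circuit.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RASTERIZE = auto()     # rotation transformation and discretization
    RASTER_RESET = auto()  # reset addresses, open space for the next result
    SEARCH = auto()        # compute index value, read occupation probability
    LOOP = auto()          # check whether all poses in the window are done
    WRITE_OUT = auto()     # write all probabilities to the external FIFO


def next_state(state, poses_done, total_poses):
    """Return the successor of `state` in the simplified model."""
    if state is State.IDLE:
        return State.RASTERIZE
    if state is State.RASTERIZE:
        return State.RASTER_RESET
    if state is State.RASTER_RESET:
        return State.SEARCH
    if state is State.SEARCH:
        return State.LOOP
    if state is State.LOOP:
        # Repeat rasterization until every pose in the search window is searched.
        return State.WRITE_OUT if poses_done >= total_poses else State.RASTERIZE
    return State.IDLE  # WRITE_OUT returns to idle


def trace(total_poses):
    """List the states visited for one accelerated computation."""
    state, poses_done, visited = State.IDLE, 0, [State.IDLE]
    while True:
        state = next_state(state, poses_done, total_poses)
        visited.append(state)
        if state is State.SEARCH:
            poses_done += 1
        if state is State.IDLE:
            return visited
```

Calling `trace(2)` walks the rasterize, reset, search and loop states twice before the write-out state and the return to idle.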
As an embodiment, a bus interface module is arranged outside the hardware acceleration circuit, and the bus interface module comprises a DMA controller module and a transmission bus; preferably, the bus interface module includes an AXI DMA module and an AXI bus.
The DMA controller module is used to transmit, continuously and in batches, data stored in physical storage space at discontinuous addresses, which reduces the number of software interrupts triggered on the CPU. Specifically, a device driver on the CPU generates a linked list that stores the data addresses, so that the discontinuous physical space is described by the linked list, and sends the start address of the linked list to the DMA controller module. After the DMA controller module has transmitted one piece of data to the hardware acceleration circuit, it reads the address held in the next linked-list entry, and so on in sequence until all the data has been transmitted.
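The linked-list transfer described above can be illustrated with a toy model. It is a sketch under stated assumptions and does not reproduce the descriptor format of any real DMA controller: `Descriptor`, `build_chain`, `dma_transfer` and the 16-byte descriptor spacing are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Descriptor:
    buffer_addr: int          # start address of one contiguous data block
    length: int               # number of bytes in that block
    next_desc: Optional[int]  # address of the next descriptor, None at the end


def build_chain(buffers: List[Tuple[int, int]], desc_base: int,
                desc_stride: int = 16) -> Dict[int, Descriptor]:
    """Driver side: describe discontinuous blocks with a linked list."""
    chain = {}
    for i, (addr, length) in enumerate(buffers):
        is_last = i == len(buffers) - 1
        nxt = None if is_last else desc_base + (i + 1) * desc_stride
        chain[desc_base + i * desc_stride] = Descriptor(addr, length, nxt)
    return chain


def dma_transfer(memory: bytes, chain: Dict[int, Descriptor],
                 head_addr: int) -> bytes:
    """Controller side: follow the list and emit one continuous stream."""
    out = bytearray()
    addr: Optional[int] = head_addr
    while addr is not None:
        desc = chain[addr]
        out += memory[desc.buffer_addr:desc.buffer_addr + desc.length]
        addr = desc.next_desc  # read the address of the next list entry
    return bytes(out)
```

The CPU only hands over the head address once; the controller walks the rest of the list on its own, which is why fewer software interrupts are needed.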
The transmission bus comprises a first bus and a second bus. Preferably the transmission bus is an AXI bus comprising two AXI buses, namely an AXI-Lite bus and an AXI-Stream bus; the former is suited to simple, low-throughput memory-mapped communication, and the latter to high-speed streaming data. The first bus, preferably the AXI-Lite bus, connects the processor unit to the hardware acceleration circuit and to the DMA controller module; through it, control signals can be sent to the memory module, the point cloud processing module, the grid index module, the state machine module and the interconnection bus, and the working state of the state machine module can be monitored and read. The first bus has a signal transceiving connection with each of the memory module, the point cloud processing module, the grid index module, the state machine module, the interconnection bus and the DMA controller module. It is used to configure data transmission parameters for the DMA controller module and to configure the parameters stored in the map size register, the parameters stored in the map resolution register, the extension parameters required for the maximum detection radius reached by the currently stored point cloud, the extension parameters required for discretization (including the preset coordinate offset value), the extension parameters required for the rotation transformation, and the search step length, thereby realizing memory-mapped communication with the hardware acceleration circuit. The second bus, preferably the AXI-Stream bus, is connected to the DMA controller module and is used to transmit to the memory module the trigonometric function values preset by the CPU, the grid map to be searched, and the point cloud currently acquired by the laser radar. The transmission bus complies with the AMBA protocol.
This embodiment provides a bus interface module between the hardware acceleration circuit and an external data source (controller) and, according to the required data transmission performance, designs two buses: a first bus suited to simple, low-throughput memory-mapped communication (transmitting the extension parameters, the map characteristic parameters and the associated basic control signals to registers inside the hardware acceleration circuit), and a second bus for high-speed data streams (transmitting the preconfigured trigonometric function values, the grid map to be searched, and the point cloud currently collected by the laser radar). This improves the real-time performance of the hardware acceleration circuit.
The embodiment of the invention also discloses a chip that integrates the hardware acceleration circuit. When the processor unit and the bus interface are outside the chip and are not integrated on the same chip as the hardware acceleration circuit, the processor unit does not control the parallel acceleration operations executed by the hardware acceleration circuit; it only provides operation parameters, a start signal, an interrupt signal, a reset signal and a flag-bit clearing signal to the hardware acceleration circuit through the bus interface. When the processor unit, the bus interface and the hardware acceleration circuit are integrated on the same chip, the hardware acceleration circuit is applied in a heterogeneous chip. The heterogeneous chip strikes a balance between precision and cost; because the hardware acceleration circuit is suited to data-parallel algorithms, the acceleration it gives the heterogeneous chip facilitates real-time processing of point cloud maps at a high frame rate.
The invention further discloses a robot in which the chip is installed and used for positioning in the pre-constructed grid map. Compared with prior-art robots that execute the correlation scan matching algorithm purely in software, this technical scheme has a clear advantage in running speed even when the clock frequency of the hardware acceleration circuit is not high, and can better meet the relocalization requirements of a mobile robot platform.
The above embodiments are merely illustrative of the technical ideas and features of the present invention, and are intended to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the scope of the present invention. All equivalent changes or modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.

Claims (17)

1. A hardware accelerating circuit based on a correlation scanning matching algorithm is characterized in that the hardware circuit is electrically connected with a laser radar; the hardware circuit comprises a memory module, a point cloud processing module, a grid index module, a state machine module and an interconnection bus; the memory module, the point cloud processing module, the grid index module and the state machine module establish a data transmission relation through the interconnection bus;
the point cloud processing module is used for reading the currently stored point cloud from the memory module under the control of the state machine module and then controlling the read point cloud to execute rotation transformation;
the point cloud processing module is also used for reading the result of the current rotation transformation under the control of the state machine module, controlling the result of the current rotation transformation to perform discretization, setting the discretization result as the discrete point cloud, and storing the discrete point cloud into the memory module through the interconnection bus;
the grid index module is used for acquiring the discrete point cloud from the memory module under the control of the state machine module, calculating an index value of the discrete point cloud mapped to a map storage space according to a preset coordinate offset value and a search step length, and setting the index value as a reading address of the occupation probability of the corresponding grid point;
the memory module is provided with a map storage space and is used for storing the grid map to be searched transmitted by the bus interface;
the memory module is used for storing a circle of point cloud collected by the laser radar and a preset trigonometric function value; the currently stored point cloud is a circle of point cloud collected by a laser radar;
the occupation probability of the corresponding grid point is the probability value of the grid point matched in the grid map to be searched after the point cloud is processed by the point cloud processing module; and in the grid map to be searched, the occupation probability of each grid point has a matching index value.
2. The hardware acceleration circuit of claim 1, wherein the point cloud processing module comprises a point cloud rotation sub-module;
the point cloud rotation submodule comprises a first register, a second register, a first pipeline structure and a second pipeline structure;
the first register and the second register are connected with a first pipeline structure, and the first pipeline structure is used for reading the abscissa of the currently stored point cloud from the memory module under the control of the state machine module and then controlling the read abscissa of the point cloud to execute rotation transformation;
the first register and the second register are connected with a second pipeline structure, and the second pipeline structure is used for reading the vertical coordinate of the currently stored point cloud from the memory module under the control of the state machine module and then controlling the read vertical coordinate of the point cloud to execute rotation transformation;
the interconnection bus transmits the abscissa of the point cloud to the first register; the abscissa of the point cloud transmitted by the interconnection bus to the first register is derived from the abscissa of the currently stored point cloud;
wherein the interconnection bus caches a vertical coordinate of the point cloud to a second register; the vertical coordinate of the point cloud transmitted to the second register by the interconnection bus is derived from the vertical coordinate of the currently stored point cloud;
wherein the first and second pipeline structures are parallel pipeline structures.
3. The hardware acceleration circuit of claim 2, wherein the first pipeline structure comprises a first point cloud multiplier, a second point cloud multiplier, and a point cloud subtractor; the first point cloud multiplier and the second point cloud multiplier belong to multipliers, and the point cloud subtracter belongs to a subtracter;
the first input end of the first point cloud multiplier is connected with the output end of the first register;
the second input end of the first point cloud multiplier is used for receiving a cosine function value corresponding to a rotation angle reached by the current rotation transformation, wherein the rotation angle reached by the current rotation transformation is the sum of an angle search step length and the rotation angle reached by the last rotation transformation;
the first input end of the second point cloud multiplier is connected with the output end of the second register;
the second input end of the second point cloud multiplier is used for receiving a sine function value corresponding to the rotation angle reached by the current rotation transformation;
the first input end of the point cloud subtracter is connected with the output end of the first point cloud multiplier, and the second input end of the point cloud subtracter is connected with the output end of the second point cloud multiplier;
the point cloud subtracter is used for transmitting the output difference value to the point cloud discretization submodule and setting the difference value output by the point cloud subtracter as the abscissa result of the current rotation transformation output by the first pipeline structure.
4. The hardware acceleration circuit of claim 2 wherein the second pipeline structure comprises a third point cloud multiplier, a fourth point cloud multiplier, and a point cloud adder; the third point cloud multiplier and the fourth point cloud multiplier belong to multipliers, and the point cloud adder belongs to an adder;
a first input end of the third point cloud multiplier is connected with an output end of the first register;
a second input end of the third point cloud multiplier, configured to receive a sine function value corresponding to a rotation angle reached by the current rotation transformation; the rotation angle reached by the current rotation transformation is the sum of the angle search step length and the rotation angle reached by the last rotation transformation;
the first input end of the fourth point cloud multiplier is connected with the output end of the second register;
a second input end of the fourth point cloud multiplier, configured to receive a cosine function value corresponding to the rotation angle reached by the current rotation transformation;
the first input end of the point cloud adder is connected with the output end of the third point cloud multiplier, and the second input end of the point cloud adder is connected with the output end of the fourth point cloud multiplier;
and the point cloud adder is used for transmitting the output sum value to the point cloud discretization submodule and setting the sum value output by the point cloud adder as the vertical coordinate result of the current rotation transformation output by the second pipeline structure.
5. The hardware acceleration circuit of claim 3, wherein the point cloud processing module further comprises a point cloud discretization sub-module; the point cloud discretization submodule comprises a third pipeline structure; and the third pipeline structure is used for reading the abscissa result output by the first pipeline structure and converted from the current rotation under the control of the state machine module, controlling the abscissa result converted from the current rotation to execute discretization, setting the discretization result as the abscissa value of the discrete point cloud, and storing the abscissa value of the discrete point cloud into the memory module through the interconnection bus.
6. The hardware acceleration circuit of claim 5, wherein the third pipeline structure comprises a first discrete adder, a first discrete subtractor, and a first discrete multiplier; the first discrete adder belongs to the adder, the first discrete subtracter belongs to the subtracter, and the first discrete multiplier belongs to the multiplier;
the first input end of the first discrete adder is connected with the output end of the point cloud subtracter in the first pipeline structure, and the first input end of the first discrete adder is used for receiving the abscissa result converted from the current rotation;
the second input end of the first discrete adder is used for receiving the abscissa of the robot position transmitted by the interconnection bus, wherein the abscissa of the robot position is a pre-calculated abscissa coordinate value of the robot in a world coordinate system;
the first discrete adder is used for controlling the addition of the abscissa result converted by the current rotation and the abscissa of the robot position, and outputting a sum value obtained by the addition so that the sum value becomes a horizontal axis coordinate value converted from point cloud to a world coordinate system;
the first input end of the first discrete subtracter is connected with the output end of the first discrete adder; the second input end of the first discrete subtracter is used for receiving the maximum abscissa value of the map transmitted by the interconnection bus;
the first input end of the first discrete multiplier is connected with the output end of the first discrete subtracter; the second input end of the first discrete multiplier is used for receiving the reciprocal of the map resolution transmitted by the interconnection bus;
a first discrete multiplier for outputting the product value obtained by the multiplication to the memory module through the interconnection bus; and the product value output by the first discrete multiplier is configured as an abscissa value of the discrete point cloud, discretization of the abscissa is determined to be completed, the aim that the abscissa value of the currently stored point cloud is aligned into the coordinate system of the grid map to be searched is achieved, and meanwhile, the product value output by the first discrete multiplier is set as a discretization result output by the third pipeline structure.
7. The hardware acceleration circuit of claim 4, wherein the point cloud discretization sub-module further comprises a fourth pipeline structure; and the fourth pipeline structure is used for reading the vertical coordinate result output by the second pipeline structure and converted from the current rotation under the control of the state machine module, controlling the vertical coordinate result converted from the current rotation to execute discretization, setting the discretization result as the vertical coordinate value of the discrete point cloud, and storing the vertical coordinate value of the discrete point cloud into the memory module through the interconnection bus.
8. The hardware acceleration circuit of claim 7, wherein the fourth pipeline structure comprises a second discrete subtractor, a second discrete adder, and a second discrete multiplier; the second discrete adder belongs to the adder, the second discrete subtracter belongs to the subtracter, and the second discrete multiplier belongs to the multiplier;
the first input end of the second discrete adder is connected with the output end of the point cloud adder of the second pipeline structure, and the first input end of the second discrete adder is used for receiving the ordinate result converted from the current rotation;
the second input end of the second discrete adder is used for receiving the vertical coordinate of the robot position transmitted by the interconnection bus, wherein the vertical coordinate of the robot position is a pre-calculated vertical coordinate value of the robot in a world coordinate system;
the second discrete adder is used for controlling the addition of the ordinate result converted by the current rotation and the ordinate of the robot position, and outputting the sum obtained by the addition so that the sum becomes the ordinate value of the point cloud converted to the world coordinate system;
the first input end of the second discrete subtracter is connected with the output end of the second discrete adder; the second input end of the second discrete subtracter is used for receiving the maximum longitudinal coordinate value of the map transmitted by the interconnection bus;
the first input end of the second discrete multiplier is connected with the output end of the second discrete subtracter; the second input end of the second discrete multiplier is used for receiving the reciprocal of the map resolution transmitted by the interconnection bus; the product value output by the second discrete multiplier is configured to be a longitudinal coordinate value of the discrete point cloud, the discretization of the longitudinal coordinate is determined to be completed, the longitudinal coordinate value of the currently stored point cloud is aligned to the coordinate system of the grid map to be searched, and meanwhile, the product value output by the second discrete multiplier is set to be a discretization result output by the fourth pipeline structure;
a second discrete multiplier further configured to output the multiplied product value to the memory module via the interconnect bus.
9. The hardware acceleration circuit of claim 5, wherein the grid index module comprises a fifth pipeline structure; the fifth pipeline structure comprises a first index subtracter and a first index adder, wherein the first index subtracter belongs to the subtracter, and the first index adder belongs to the adder;
the first input end of the first index subtracter is used for receiving the abscissa of the discrete point cloud transmitted by the interconnection bus;
the second input end of the first index subtracter is used for receiving the horizontal axis coordinate offset value transmitted by the interconnection bus; wherein the preset coordinate offset value comprises the horizontal axis coordinate offset value;
the first input end of the first index adder is connected with the output end of the first index subtracter; the second input end of the first index adder is used for receiving the horizontal axis coordinate searching step length transmitted by the interconnection bus; wherein the search step comprises a horizontal axis coordinate search step; the sum value output by the first index adder is configured as the index value of the direction of the horizontal axis of the discrete point cloud mapped to the map storage space and is set as the result output by the fifth pipeline structure.
10. The hardware acceleration circuit of claim 9, wherein the grid index module comprises a sixth pipeline structure; the sixth pipeline structure comprises a second index subtracter and a second index adder; the second index subtracter belongs to the subtracter, and the second index adder belongs to the adder;
the first input end of the second index subtracter is used for receiving the vertical coordinate of the discrete point cloud transmitted by the interconnection bus, wherein the vertical coordinate of the discrete point cloud transmitted by the interconnection bus to the second index subtracter is from the memory module;
the second input end of the second index subtracter is used for receiving the vertical axis coordinate offset value transmitted by the interconnection bus; wherein the preset coordinate offset value comprises a vertical axis coordinate offset value;
the first input end of the second index adder is connected with the output end of the second index subtracter; the second input end of the second index adder is used for receiving the longitudinal axis coordinate searching step length transmitted by the interconnection bus; wherein the search step comprises a longitudinal axis coordinate search step; the sum value output by the second index adder is configured as the longitudinal axis direction index value of the discrete point cloud mapped to the map storage space and is set as the result output by the sixth pipeline structure.
11. The hardware acceleration circuit of claim 10, wherein the grid index module further comprises a third index adder and an index multiplier, the third index adder belongs to the adder, and the index multiplier belongs to the multiplier;
the first input end of the index multiplier is connected with the output end of the second index adder; a second input end of the index multiplier is used for receiving the row grid number transmitted by the interconnection bus, wherein the row grid number is the number of grids which can be occupied by each row of the map storage space;
the first input end of the third index adder is connected with the output end of the first index adder, the second input end of the third index adder is connected with the output end of the index multiplier, the third index adder is used for controlling the product of the number of the line grids and the sum value output by the second index adder to be added with the index value in the horizontal axis direction, and then the added sum value is set as an index value of the discrete point cloud mapped into the map storage space and is sent to the interconnection bus.
12. The hardware acceleration circuit of claim 10, wherein the grid index module further comprises a third index adder and an index multiplier, the third index adder belongs to the adder, and the index multiplier belongs to the multiplier;
the first input end of the index multiplier is connected with the output end of the first index adder; the second input end of the index multiplier is used for receiving the number of the column grids transmitted by the interconnection bus; wherein the number of column grids is the number of grids that each column of the map storage space can occupy;
the first input end of the third index adder is connected with the output end of the second index adder, and the second input end of the third index adder is connected with the output end of the index multiplier; and the third index adder is used for controlling the addition of the product of the number of the column grids and the sum value output by the first index adder and the index value in the longitudinal axis direction, setting the sum value obtained by the addition as an index value of the discrete point cloud mapped into the map storage space and sending the index value to the interconnection bus.
13. The hardware acceleration circuit of any one of claims 1 to 12, wherein the state machine module is a finite state machine; the state machine module is used for scheduling the working states of the memory module, the point cloud processing module and the grid index module, so that the grid index module calculates an index value of a point cloud at a matching grid point of the grid map to be searched and determines to finish a matching operation when the point cloud processing module executes the rotation transformation and the discretization once; wherein the index value is an index value of an occupancy probability of the corresponding grid point.
14. The hardware acceleration circuit of claim 13, wherein the point cloud processing module is further configured to update the currently stored point cloud with a result obtained from the current rotation transformation every time a matching operation is completed or every time the rotation transformation is performed, so that the point cloud processing module performs a new rotation transformation only with the updated currently stored point cloud until an index value of an occupancy probability corresponding to all currently collected point clouds is calculated.
15. The hardware acceleration circuit of claim 14, wherein a bus interface module is disposed outside the hardware acceleration circuit, and the bus interface module includes a DMA controller module and a transmission bus;
the DMA controller module is used for continuously transmitting data stored in a physical storage space of a discontinuous address in batches, and reducing the triggering times of software interrupt of a CPU;
the transmission bus comprises a first bus and a second bus, the first bus is respectively in signal transceiving connection with the memory module, the point cloud processing module, the grid index module, the state machine module, the interconnection bus and the DMA controller module, the first bus is used for configuring data transmission parameters for the DMA controller module, and the first bus is also used for configuring a parameter register arranged in the hardware acceleration circuit;
the second bus is connected with the DMA controller module and is used for transmitting a grid map to be searched, a preset sine function value under a corresponding rotation angle, a preset cosine function value under a corresponding rotation angle and a point cloud currently collected by the laser radar, wherein the grid map to be searched is preset by the CPU, and the point cloud is preset by the memory module;
wherein the transmission bus is in accordance with an AMBA protocol.
16. A chip incorporating the hardware acceleration circuit of any one of claims 1 to 15.
17. A robot, characterized in that it is equipped internally with a chip according to claim 16 for positioning in a pre-constructed grid map.
CN202111202200.1A 2021-10-15 2021-10-15 Hardware acceleration circuit, chip and robot based on correlation scanning matching algorithm Pending CN115984346A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111202200.1A CN115984346A (en) 2021-10-15 2021-10-15 Hardware acceleration circuit, chip and robot based on correlation scanning matching algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111202200.1A CN115984346A (en) 2021-10-15 2021-10-15 Hardware acceleration circuit, chip and robot based on correlation scanning matching algorithm

Publications (1)

Publication Number Publication Date
CN115984346A true CN115984346A (en) 2023-04-18

Family

ID=85956754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111202200.1A Pending CN115984346A (en) 2021-10-15 2021-10-15 Hardware acceleration circuit, chip and robot based on correlation scanning matching algorithm

Country Status (1)

Country Link
CN (1) CN115984346A (en)

Similar Documents

Publication Publication Date Title
JP7431273B2 (en) Motion planning and reconfigurable motion planning processor for autonomous vehicles
CN112082565A (en) Method, device and storage medium for location and navigation without support
CN110969649A (en) Matching evaluation method, medium, terminal and device of laser point cloud and map
CN115098412B (en) Peripheral access controller, data access device and corresponding method, medium and chip
CN111679286B (en) Laser positioning system and chip based on hardware acceleration
WO2018211122A1 (en) Methods, systems and apparatus to reduce memory latency when fetching pixel kernels
WO2022002885A1 (en) An edge computing based path planning system for agv with intelligent deviation correction algorithm
CN115984346A (en) Hardware acceleration circuit, chip and robot based on correlation scanning matching algorithm
CN112667562B (en) Random walk heterogeneous computing system on large-scale graph based on CPU-FPGA
CN115576332B (en) Task-level multi-robot collaborative motion planning system and method
CN112362061A (en) Cluster unmanned aerial vehicle path assignment method, control system, storage medium and unmanned aerial vehicle
CN110887490A (en) Key frame selection method, medium, terminal and device for laser positioning navigation
CN108508426B (en) SAR echo signal generation method based on multi-core DSP and echo simulator
CN115984345A (en) Hardware acceleration circuit, chip and robot for point cloud transformation
CN105608046A (en) Multi-core processor architecture based on MapReduce programming model
Firmansyah et al. Fpga-based implementation of the stereo matching algorithm using high-level synthesis
CN115984344A (en) Correlation scanning matching hardware circuit of TSDF (time series distribution function) map and heterogeneous computing framework
CN115580572A (en) Routing method, routing node, routing device and computer readable storage medium
CN111782562B (en) Data transmission method, DMA controller, NPU chip and computer equipment
CN114253511A (en) SLAM hardware accelerator based on laser radar and implementation method thereof
CN117687042B (en) Multi-radar data fusion method, system and equipment
Kim et al. Optimization of multi-core accelerator performance based on accurate performance estimation
CN112395916B (en) Method and device for determining motion state information of target and electronic equipment
CN116028541B (en) Data vectorization aggregation method, device, equipment and storage medium
CN117687943B (en) Acceleration equipment, heterogeneous computing architecture-based system and data processing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination