CN117008162A

CN117008162A - Satellite navigation system chip shape correlator and method based on GPU parallel computation

Info

Publication number: CN117008162A
Application number: CN202210473223.4A
Authority: CN
Inventors: 崔晓伟; 王传瑞; 刘刚; 陆明泉
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2022-04-29
Filing date: 2022-04-29
Publication date: 2023-11-07
Also published as: WO2023207632A1

Abstract

The application discloses a real-time chip shape correlator based on GPU parallel computing and a method thereof. The chip shape correlator includes: a mask generation unit configured to generate a corresponding mask signal from the chip edges of the instantaneous (prompt) local pseudorandom noise code; and a signal compression unit configured to: compress the product of the mask signal and an input signal using the instantaneous local pseudorandom noise code to generate a compressed signal, and process the compressed signal to generate data for measuring the chip shape of the input signal; and/or compress the input signal according to the instantaneous local pseudorandom noise code, an early pseudorandom noise code, and a late pseudorandom noise code to generate a compressed signal, and process the compressed signal to generate data for measuring the correlation peak of the cross-correlation between the input signal and the local pseudorandom noise code.

Description

Satellite navigation system chip shape correlator and method based on GPU parallel computation

Technical Field

The application relates to the field of radio navigation software receivers, and in particular to a real-time chip shape correlator for a satellite navigation system and a corresponding method, both based on GPU parallel computing.

Background

In a software-defined radio (SDR) receiver for a Global Navigation Satellite System (GNSS), the main signal processing is implemented by software modules. Compared with a hardware receiver, an SDR receiver is easier to debug, upgrade, and modify and is more configurable, so it plays an important role in many GNSS application fields. However, the correlation operation involves a large number of multiply-accumulate operations, and this heavy computational burden makes it difficult for an SDR receiver running on a central processing unit (CPU) to operate in real time. Parallel computing on a graphics processing unit (GPU) can greatly accelerate the correlation computation in an SDR receiver and thereby meet the real-time requirement.

With increasing demands on GNSS accuracy and integrity, the increasingly complex signal processing algorithms in SDR receivers place higher demands on the correlator output. A standard correlator typically outputs only 3 to 5 correlation values for signal tracking, but some multipath mitigation and signal quality monitoring algorithms need many more correlation values in order to reconstruct the correlation peak. A standard correlator performs one complete correlation operation for each correlation value it outputs, so it is difficult for the standard correlator of an SDR receiver to compute many correlation values in real time. A chip shape correlator can exploit the computation that is repeated across these correlation values to reduce the total amount of computation. A chip shape correlator can also measure the chip shape itself, for use by multipath mitigation and signal quality monitoring algorithms that rely on chip-domain observables. However, a chip shape correlator in an SDR receiver places higher demands on memory allocation and access.

Therefore, to measure correlation peaks and chip shapes in real time, a better-designed real-time chip shape correlator for a satellite navigation system, and a corresponding method based on GPU parallel computing, are needed.

Disclosure of Invention

To solve the above-described problems, or other problems in the art, the present application provides a GNSS SDR receiver real-time chip shape correlator system and method based on GPU parallel computing.

According to one aspect of the present application, there is provided a real-time chip shape correlator based on GPU parallel computing, comprising: a mask generation unit configured to generate a corresponding mask signal from the chip edges of the instantaneous local pseudorandom noise code; and a signal compression unit configured to: compress the product of the mask signal and the input signal using the instantaneous local pseudorandom noise code to generate a compressed signal, and process the compressed signal to generate data for measuring the chip shape of the input signal; and/or compress the input signal according to the instantaneous local pseudorandom noise code, an early pseudorandom noise code, and a late pseudorandom noise code to generate a compressed signal, and process the compressed signal to generate data for measuring the correlation peak of the cross-correlation between the input signal and the local pseudorandom noise code; wherein the early pseudorandom noise code and the late pseudorandom noise code respectively lead and lag the instantaneous local pseudorandom noise code by a predetermined number of chips.

In one embodiment, the predetermined number of chips may include 1 chip.

In one embodiment, the compressed signal is one chip in size and is divided into a plurality of chip lattices.

In one embodiment, the signal compression unit is further configured to: determine the interval between adjacent sampling points of the compressed signal that belong to the same chip lattice, find all sampling points belonging to the same chip lattice according to the determined interval, and map all sampling points of the compressed signal that belong to the same chip lattice to one Compute Unified Device Architecture (CUDA) thread for parallel computation.

In one embodiment, the mask generation unit generates one mask signal for each of four chip-edge cases of the instantaneous local pseudorandom noise code: (1) a rising edge, (2) a falling edge, (3) hold +1, and (4) hold -1. In each mask signal, wherever the edge between two adjacent chips matches the corresponding case, the two half chips adjacent to that edge take the values of the instantaneous local pseudorandom noise code, and the mask signal is 0 elsewhere.

In one embodiment, the real-time chip shape correlator may further comprise: a chip shape measurement unit configured to determine, from the data generated by the signal compression unit for measuring the chip shape of the input signal, the real-time chip shape of the input signal, including the rising edge and the falling edge of a chip; and an accumulation unit configured to perform sliding accumulation on the data generated by the signal compression unit for measuring the correlation peak of the cross-correlation between the input signal and the local pseudorandom noise code, to obtain the correlation peak.

According to another aspect of the present application, there is provided a GPU-based parallel computing method, including: generating a corresponding mask signal from the chip edges of the instantaneous local pseudorandom noise code; compressing the product of the mask signal and the input signal using the instantaneous local pseudorandom noise code to generate a compressed signal, and processing the compressed signal to generate data for measuring the chip shape of the input signal; and/or compressing the input signal according to the instantaneous local pseudorandom noise code, an early pseudorandom noise code, and a late pseudorandom noise code to generate a compressed signal, and processing the compressed signal to generate data for measuring the correlation peak of the cross-correlation between the input signal and the local pseudorandom noise code; wherein the early pseudorandom noise code and the late pseudorandom noise code respectively lead and lag the instantaneous local pseudorandom noise code by a predetermined number of chips.

The step of generating the compressed signal comprises: determining the interval between adjacent sampling points of the compressed signal that belong to the same chip lattice, finding all sampling points belonging to the same chip lattice according to the determined interval, and mapping all sampling points of the compressed signal that belong to the same chip lattice to one Compute Unified Device Architecture (CUDA) thread for parallel computation.

The method further comprises: determining, from the data generated for measuring the chip shape of the input signal, the real-time chip shape of the input signal, including the rising edge and the falling edge of a chip; and performing sliding accumulation on the data generated for measuring the correlation peak of the cross-correlation between the input signal and the local pseudorandom noise code, to obtain the correlation peak.

According to another aspect of the present application, there is provided an apparatus for GPU-based parallel computing, comprising: a memory storing computer-executable instructions; and a processor executing instructions to implement the method as described above.

According to another aspect of the application, there is provided a storage medium comprising computer-executable instructions which, when executed, implement the method as described above.

According to one embodiment of the application, all sampling points of one tracking channel are processed in parallel in the signal compression unit. Each sampling point is mapped to the thread that corresponds to the chip lattice it belongs to, and each thread efficiently finds all the sampling points belonging to its chip lattice by determining the interval between adjacent sampling points in that lattice, and then accumulates them. This parallel computing scheme greatly reduces the number of registers required by each thread, improves computational efficiency, and raises the upper limit of the resolution of the chip shape and correlation peak measurements.

Drawings

Fig. 1 shows a schematic diagram of a real-time chip shape correlator for a satellite navigation system based on GPU parallel computing in accordance with an embodiment of the present application.

Fig. 2 shows a schematic diagram of a process for implementing correlation peak measurement according to one embodiment of the application.

Fig. 3 shows a schematic diagram of four mask signals generated by a local pseudorandom noise code and a process for measuring chip shape by signal compression in accordance with one embodiment of the application.

Fig. 4 shows the result of a chip shape correlator measuring the 1 s smoothed correlation peak of satellite number 1 of the GPS L1 C/A signal according to an embodiment of the present application.

Fig. 5 shows the result of a chip shape correlator measuring the 1 s smoothed correlation peak of satellite number 1 of the GPS L1 C/A signal with an added in-phase multipath signal of amplitude 0.5 and delay 0.2 chips, according to an embodiment of the present application.

Fig. 6 shows the result of a chip shape correlator measuring the correlation peak of satellite number 1 of the GPS L1 C/A signal with an added anti-phase multipath signal of amplitude 0.5 and delay 0.2 chips, according to an embodiment of the present application.

Fig. 7 shows the result of a chip shape correlator measuring the 1 s smoothed chip rising edge of satellite number 1 of the GPS L1 C/A signal, in accordance with an embodiment of the present application.

Fig. 8 shows the result of a chip shape correlator measuring the 1 s smoothed chip rising edge of satellite number 1 of the GPS L1 C/A signal with an added in-phase multipath signal of amplitude 0.5 and delay 0.2 chips, according to an embodiment of the present application.

Fig. 9 shows the result of a chip shape correlator measuring the 1 s smoothed chip rising edge of satellite number 1 of the GPS L1 C/A signal with an added anti-phase multipath signal of amplitude 0.5 and delay 0.2 chips, according to an embodiment of the present application.

Detailed Description

The following description more fully describes embodiments of the present disclosure and various features and details thereof with reference to the accompanying drawings. Descriptions of well-known components and processing techniques are omitted so as to not unnecessarily obscure the embodiments of the disclosure. In addition, the various embodiments described in this disclosure are not necessarily mutually exclusive, as some embodiments may be combined with one or more other embodiments to form new embodiments. The term "or" as used in this disclosure refers to a non-exclusive or, unless otherwise indicated. The examples used in this disclosure are intended merely to facilitate an understanding of ways in which the embodiments of the disclosure may be practiced and to further enable those of skill in the art to practice the embodiments of the disclosure. Accordingly, the examples should not be construed as limiting the scope of the embodiments of the disclosure.

It will be further understood that terms such as "comprises," "comprising," "includes," "including," "having," and "containing" are open-ended rather than closed-ended: they specify the presence of the stated features, elements, and/or components, but do not preclude the presence or addition of one or more other features, elements, components, and/or groups thereof. Furthermore, when an expression such as "at least one of ..." follows a list of features, it modifies the entire list rather than an individual element of the list. When describing embodiments of the application, the use of "may" means "one or more embodiments of the application." Also, the term "exemplary" is intended to refer to an example or illustration.

Unless otherwise defined, all terms (including engineering and technical terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the present application pertains. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. In addition, unless explicitly defined or contradicted by context, the particular steps included in the methods described herein need not be limited to the order described, but may be performed in any order or in parallel. The application will be described in detail below with reference to the drawings in connection with embodiments.

Fig. 1 shows a schematic block diagram of a real-time chip shape correlator 10 for a satellite navigation system based on GPU parallel computing in accordance with an embodiment of the present application. The real-time chip shape correlator 10 may comprise a mask generation unit 101 and a signal compression unit 102. The mask generation unit 101 generates a corresponding mask signal from the chip edges of the instantaneous local pseudorandom noise code. The signal compression unit 102 may compress the product of the mask signal and the carrier-stripped input signal using the instantaneous local pseudorandom noise code to generate a compressed signal, and process the compressed signal to generate data for measuring the chip shape of the carrier-stripped input signal.

In one example, the signal compression unit 102 may also compress the carrier-stripped input signal using the early, instantaneous, and late pseudorandom noise codes to generate a compressed signal, and process the compressed signal to generate data for measuring the correlation peak of the cross-correlation between the carrier-stripped input signal and the local pseudorandom noise code.

Preferably, the early pseudorandom noise code and the late pseudorandom noise code may be respectively advanced and delayed by 1 chip relative to the instantaneous local pseudorandom noise code.

The mask generation unit 101 generates one mask signal for each of four chip-edge cases of the instantaneous local pseudorandom noise code: (1) a rising edge, (2) a falling edge, (3) hold +1, and (4) hold -1. In each mask signal, wherever the edge between two adjacent chips matches the corresponding case, the two half chips adjacent to that edge take the values of the instantaneous local pseudorandom noise code, and the mask signal is 0 elsewhere.

The local pseudorandom noise code of the navigation signal may be expressed in the form:

$$m(t) = \sum_{k} \epsilon_k \, c(t - kT_c)$$

That is, the local pseudorandom noise code $m(t)$ may be expressed as the chip function $c(t)$ weighted by $\epsilon_k = \pm 1$, shifted by an integer number of chip periods $T_c$, and summed. In the signal compression unit 102, the input signal $r(t)$ is weighted by the $\epsilon_k$ of the local pseudorandom noise code and accumulated, i.e.

$$\tilde{r}(t) = \sum_{k} \epsilon_k \, r(t + kT_c), \quad 0 \le t < T_c$$

where $\tilde{r}(t)$ denotes the compressed signal.

The compressed signal is only one chip long yet retains all the information of the signal. In addition, because signal compression coherently accumulates all the chips of the signal over a period of time, it yields a gain in carrier-to-noise ratio, so that chips originally buried in noise can be observed directly in the compressed signal. In the discrete case used in practice, sampling points in different chips generally do not fall at exactly the same position within the chip. Discrete signal compression therefore divides the chip uniformly into M lattices and, during compression, accumulates the sampling points that belong to the same lattice.
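The discrete compression described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the function name `compress`, the assumption that the code phase starts at zero, and the use of exact rational arithmetic are all choices made here for clarity.

```python
from fractions import Fraction

def compress(samples, code, fs, fc, M):
    """Fold `samples` into M lattices spanning one chip (hypothetical helper).

    samples : carrier-stripped input samples r[n]
    code    : local PRN chips, each +1 or -1, repeated periodically
    fs, fc  : sampling rate and code rate as integers in the same unit
    M       : number of chip lattices per chip
    Assumes the code phase is zero at sample 0.
    """
    step = Fraction(fc, fs)             # chips advanced per sample (exact)
    lattices = [0.0] * M
    for n, r in enumerate(samples):
        phase = n * step                # code phase of sample n, in chips
        chip = int(phase) % len(code)   # index of the chip the sample falls in
        frac = phase - int(phase)       # position inside that chip, in [0, 1)
        lattices[int(frac * M)] += code[chip] * r   # weight by epsilon_k
    return lattices

# Noise-free toy signal: 5 chips, 8 samples per chip, 4 lattices per chip.
code = [1, -1, 1, 1, -1]
samples = [float(code[n // 8]) for n in range(40)]
lattices = compress(samples, code, fs=8, fc=1, M=4)
```

In this toy case every lattice receives two samples from each of the five chips, and the weighting by the code sign makes all contributions add coherently.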

Fig. 2 is a schematic diagram 20 showing how the accumulation unit 104 performs sliding accumulation on the compressed signal obtained from the signal compression unit 102 to calculate multiple correlation values and thereby measure the correlation peak, according to an embodiment of the present application. Discrete correlation values can be calculated from the compressed signal as follows:

$$R_{rm}(nT_b) = \sum_{i=n}^{n+M-1} \tilde{r}(iT_b)$$

where $\tilde{r}(iT_b)$ is the value of the compressed signal in the $i$-th chip lattice and $T_b = T_c/M$ is the chip lattice period.

In correlation peak measurement based on chip shape, the received signal is compressed using the instantaneous pseudorandom code together with an early pseudorandom code and a late pseudorandom code that are respectively advanced and delayed by 1 chip relative to the instantaneous code. The result is a compressed signal spanning three chips (3M chip lattices in total), with the lattices numbered $-M$ to $2M-1$. The sum of lattices $n$ to $n+M-1$ ($-M \le n \le M$) is $R_{rm}(nT_b)$, i.e., the correlation value at a code phase of $n/M$ chips. Each correlation value therefore needs only $M$ additions, so many correlation values can be obtained with little computation. Let the total number of sampling points in the tracking data be $N_s$, and let the correlator output $L$ correlation values with $L > 3$. A conventional multi-correlator requires $O(L \cdot N_s)$ multiplications and $O(L \cdot N_s)$ additions, whereas the chip shape correlator 10 requires only $O(N_s)$ multiplications and $O(N_s + M \cdot L)$ additions. Since $M \ll N_s$, the chip shape correlator 10 requires fewer multiplications and additions than a conventional multi-correlator.
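The sliding accumulation over the three-chip compressed signal can be sketched as below. The function name `correlation_values` and the list layout (list index j holds lattice number j - M) are assumptions made for this illustration, and the idealized input ignores noise and band-limiting.

```python
def correlation_values(compressed, M):
    """Sum M consecutive lattices for every code phase n/M, -M <= n <= M.

    compressed : list of 3*M lattice values; index j holds lattice j - M
    Returns a dict mapping n to the correlation value R(n * T_b).
    """
    if len(compressed) != 3 * M:
        raise ValueError("expected a three-chip compressed signal")
    values = {}
    for n in range(-M, M + 1):
        start = n + M                        # list index of lattice n
        values[n] = sum(compressed[start:start + M])   # M additions
    return values

# Idealized compressed signal: the chip energy sits in lattices 0..M-1.
M = 4
compressed = [0.0] * M + [1.0] * M + [0.0] * M
peak = correlation_values(compressed, M)
```

For this idealized input the 2M + 1 outputs trace the familiar triangular correlation peak, with its maximum at n = 0.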

Fig. 3 shows a schematic diagram 30 of generating four mask signals from the local pseudorandom noise code, one for each of four chip-edge cases (rising edge, falling edge, hold +1, and hold -1), and of measuring the chip shape by signal compression.

In the signal compression unit 102, if the local pseudorandom noise code were used directly for signal compression, chips with different edge cases would be superimposed, and the resulting chip edge would be meaningless. The present application therefore generates the four mask signals $y(t)$ shown in Fig. 3, one for each of the 4 chip-edge cases of the local pseudorandom noise code. Where the edge between two adjacent chips of the local pseudorandom noise code matches the required case, the mask signal over the two half chips adjacent to that edge takes the values of the local pseudorandom noise code; everywhere else it is 0. The product $y(t) \cdot r(t)$ of the mask signal and the input signal is then compressed by the signal compression unit 102 using the local pseudorandom noise code:

$$\tilde{r}_y(t) = \sum_{k} \epsilon_k \, y(t + kT_c) \, r(t + kT_c), \quad 0 \le t < T_c$$

Only the chips whose edges match the required case are accumulated during signal compression, which ensures that the chip edge is measured accurately. The chip shape measurement unit 103 then swaps the first half chip and the second half chip of the one-chip compressed signal, which yields accurate measurements of the chip shape for each of the four cases: rising edge, falling edge, hold +1, and hold -1.
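The mask generation rule can be illustrated at half-chip resolution. The sketch below is an assumption-laden simplification: the names `EDGE_CASES` and `mask_half_chips` are hypothetical, the code is treated as non-cyclic (the edge between the last and first chip is ignored), and each chip is represented by just two values, one per half chip.

```python
# (value before the edge, value after the edge) for each of the four cases
EDGE_CASES = {
    "rising": (-1, 1),
    "falling": (1, -1),
    "hold+1": (1, 1),
    "hold-1": (-1, -1),
}

def mask_half_chips(code, case):
    """Return the mask for one edge case, two entries per chip.

    Entry 2*k is the first half of chip k, entry 2*k + 1 its second half.
    Around every matching edge the two adjacent half chips copy the code;
    all other half chips are 0.
    """
    before, after = EDGE_CASES[case]
    mask = [0] * (2 * len(code))
    for k in range(len(code) - 1):
        if (code[k], code[k + 1]) == (before, after):
            mask[2 * k + 1] = code[k]        # second half of chip k
            mask[2 * k + 2] = code[k + 1]    # first half of chip k + 1
    return mask

code = [1, -1, -1, 1, 1]
masks = {case: mask_half_chips(code, case) for case in EDGE_CASES}
```

Because every edge matches exactly one of the four cases, the four masks are disjoint, and together they reproduce the code on every half chip that touches an interior edge.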

In the case where the compressed signal is one chip in size and is divided into a plurality of chip lattices, the signal compression unit 102 is further configured to: determine the interval between adjacent sampling points of the compressed signal that belong to the same chip lattice, find all sampling points belonging to the same chip lattice according to the determined interval, and map all sampling points of the compressed signal that belong to the same chip lattice to one Compute Unified Device Architecture (CUDA) thread for parallel computation.

The real-time chip shape correlator 10 may further comprise a chip shape measurement unit 103 configured to determine a real-time chip shape of the input signal comprising a rising edge and a falling edge of a chip, from data generated by the signal compression unit 102 for measuring the chip shape of the input signal.

The real-time chip shape correlator 10 may further comprise an accumulation unit 104 configured to perform sliding accumulation on the data generated by the signal compression unit 102 for measuring the correlation peak of the cross-correlation between the input signal and the local pseudorandom noise code, to obtain the correlation peak.

The GPU-based parallel algorithm in the signal compression unit 102 can be implemented in CUDA and can also be ported to other heterogeneous computing frameworks. Each tracking channel is mapped to one CUDA thread block, and the threads within the block perform the accumulation in parallel. The result of signal compression is a compressed signal of length M, and in order to calculate correlation values the signal compression unit 102 must compress the in-phase and quadrature components for each of the early (E), instantaneous (P), and late (L) codes. If the sampling points were mapped to threads and each thread stored the whole compression result, each thread would need 6M registers. Registers are a scarce resource on the GPU: a CUDA thread can use at most 255 registers. To avoid the loss of efficiency caused by register spilling, M could then not exceed 42 (6 × 42 = 252 ≤ 255), which would rule out high-resolution measurement of the correlation peak and chip shape. To avoid this problem, in the implementation of the signal compression unit 102 of the present application each thread is responsible only for the multiply-accumulate computation of the sampling points that fall into one chip lattice, so it needs only 6 registers to hold the 6 accumulated results for that lattice. This avoids register spilling and improves computational efficiency. The upper limit of M is thereby raised to 1024, the maximum number of threads in a CUDA thread block, which also raises the upper limit of the resolution of the chip shape correlator 10.
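The register budget argument reduces to a small calculation, sketched here. The constant and function names are hypothetical, and the count deliberately ignores the registers a real kernel would also need for indices and temporaries, so the numbers are upper bounds rather than measured values.

```python
MAX_REGISTERS_PER_THREAD = 255    # CUDA per-thread limit quoted in the text
MAX_THREADS_PER_BLOCK = 1024      # CUDA per-block limit quoted in the text
ACCUMULATORS_PER_LATTICE = 3 * 2  # E, P, L codes x (in-phase, quadrature)

def registers_per_thread(M, one_lattice_per_thread):
    """Accumulator registers each thread needs under the two mappings."""
    if one_lattice_per_thread:
        return ACCUMULATORS_PER_LATTICE      # one lattice, 6 running sums
    return ACCUMULATORS_PER_LATTICE * M      # whole compressed result

def max_lattices(one_lattice_per_thread):
    """Largest M that fits the relevant hardware limit."""
    if one_lattice_per_thread:
        return MAX_THREADS_PER_BLOCK         # one thread per lattice
    return MAX_REGISTERS_PER_THREAD // ACCUMULATORS_PER_LATTICE

limit_whole_result = max_lattices(one_lattice_per_thread=False)
limit_per_lattice = max_lattices(one_lattice_per_thread=True)
```

Under these assumptions the limit on M moves from the register file (42) to the thread block size (1024).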

Under this mapping from sampling points to threads in the signal compression unit 102, each thread uses an efficient algorithm to determine the distance to the next sampling point belonging to the same chip lattice, and in this way finds all the sampling points that fall into the chip lattice it is responsible for. For a given sampling rate $f_s$, code rate $f_c$, and number of chip lattices $M$, the interval between adjacent sampling points in the same chip lattice can take only a few fixed values $P_i$ ($P_1 < P_2 < \dots < P_n$), and the value that applies at the current sampling point can be determined by a simple criterion. Moving $P_i$ sampling points from the current sampling point shifts the chip phase by an integer number of chips plus a fractional chip phase offset:

$$\Delta_i = \frac{P_i f_c}{f_s} - \operatorname{round}\!\left(\frac{P_i f_c}{f_s}\right)$$

Since the chip is divided into $M$ chip lattices, the corresponding fractional phase offset in units of chip lattices can be defined as:

$$A_i = M \Delta_i$$

Assume that the current sampling point already has a fractional phase offset $F$ ($0 \le F < 1$) within its chip lattice, produced by the previous moves. After moving $P_i$ sampling points, the sampling point is still in the same chip lattice only if the new fractional phase offset satisfies:

$$0 \le F + A_i < 1 \qquad (8)$$

The $A_i$ are therefore checked in ascending order of $P_i$, and the $P_i$ corresponding to the first $A_i$ that satisfies the condition is the distance from the current sampling point to the next sampling point in the same chip lattice. Since the maximum interval between adjacent sampling points belonging to the same chip lattice is $\max(P_i)$, any $\max(P_i)$ consecutive sampling points contain at least one sampling point from every chip lattice. That sampling point is taken as the first sampling point of its chip lattice, and the criterion above is then applied repeatedly to find the next sampling point in the lattice, until all sampling points belonging to the lattice have been found.
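The per-thread search can be sketched serially as below. This is only an illustration of the selection criterion, with hypothetical function names; it assumes the code phase is zero at sample 0, uses exact rational arithmetic instead of the fixed- or floating-point bookkeeping a GPU kernel would use, and takes the spacings 23, 24, and 47 from the GPS L1 C/A example in this description.

```python
from fractions import Fraction

def lattice_offsets(P, fs, fc, M):
    """Fractional lattice offset A_i produced by moving P_i samples."""
    step = Fraction(M * fc, fs)             # lattices advanced per sample
    return [p * step - M * round(p * step / M) for p in P]

def samples_in_lattice(first, F, P, A, n_samples):
    """Indices of all samples in one lattice, starting from sample `first`.

    F is the fractional offset of `first` inside its lattice, 0 <= F < 1.
    P must be sorted in ascending order, with A the matching offsets.
    """
    found, n = [], first
    while n < n_samples:
        found.append(n)
        for p, a in zip(P, A):              # ascending order of P_i
            if 0 <= F + a < 1:              # still inside the same lattice
                n, F = n + p, F + a
                break
        else:
            raise ValueError("spacings P do not cover offset F")
    return found

fs, fc, M = 24_000_000, 1_023_000, 40       # GPS L1 C/A example values
P = [23, 24, 47]
A = lattice_offsets(P, fs, fc, M)
found = samples_in_lattice(0, Fraction(0), P, A, 2000)
```

A brute-force pass that computes the lattice of every sample gives the same index list, which is a convenient way to check the spacing table for other parameter sets.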

For a given sampling rate $f_s$, code rate $f_c$, and number of chip lattices $M$, the values $P_i$ are calculated as follows. First, the possible values $Q_i$ of the interval between any two sampling points belonging to the same chip lattice are calculated. Since $P_i$ is the interval between adjacent sampling points in the same chip lattice, whereas $Q_i$ is the interval between any two sampling points in the same chip lattice, adjacent or not, the $Q_i$ include all the $P_i$. Let $n_1$ and $n_2$ ($n_2 > n_1$) be the indices of two sampling points belonging to the same chip lattice. Their phase difference, measured in chip lattices, must then differ from an integer number $k$ of chips by less than one chip lattice:

$$-1 < (n_2 - n_1)\frac{M f_c}{f_s} - kM < 1$$

that is,

$$\frac{(kM - 1) f_s}{M f_c} < n_2 - n_1 < \frac{(kM + 1) f_s}{M f_c} \qquad (13)$$

Substituting $k = 1, 2, \dots$ into the above inequality gives all possible index differences between two sampling points belonging to the same chip lattice, i.e., the $Q_i$.

Next, the $P_i$ are selected from the $Q_i$. As noted above, only some of the $Q_i$ can be the index spacing $P_i$ between adjacent sampling points in the same chip lattice, so the $P_i$ must be screened from the $Q_i$. Moving $Q_i$ sampling points changes the fractional chip lattice offset by:

$$A_i = \frac{M Q_i f_c}{f_s} - M \operatorname{round}\!\left(\frac{Q_i f_c}{f_s}\right)$$

If the sampling point still belongs to the same chip lattice after moving $Q_i$ sampling points, the new fractional offset is still in the range $[0, 1)$, i.e.:

$$0 \le F_i + A_i < 1$$

$$F_i \in [\max(0, -A_i), \min(1, 1 - A_i)) \qquad (15)$$

The intervals $F_i$ corresponding to the $Q_i$ are therefore calculated in ascending order. If the union of the first $k$ intervals $F_i$ is $[0, 1)$, then for any $0 \le F < 1$ there is a $Q_i$ among $Q_1, \dots, Q_k$ such that $0 \le F + A_i < 1$. In other words, $Q_1, \dots, Q_k$ cover all possible intervals between adjacent sampling points belonging to the same chip lattice, and they are the required $P_i$.
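The two-stage calculation of the spacings can be sketched as follows. The function names, the `k_max` search bound, and the interval-sweep helper are assumptions introduced for this illustration; the candidate bound and the admissible interval follow the inequalities as they are understood here from the worked GPS L1 C/A example.

```python
from fractions import Fraction
from math import floor

def candidate_spacings(fs, fc, M, k_max):
    """Q_i: index differences d with |d * M*fc/fs - k*M| < 1, k = 1..k_max."""
    step = Fraction(M * fc, fs)                 # lattices advanced per sample
    Q = set()
    for k in range(1, k_max + 1):
        d = floor((k * M - 1) / step) + 1       # first integer above the bound
        while d < (k * M + 1) / step:
            Q.add(d)
            d += 1
    return sorted(Q)

def offset(q, fs, fc, M):
    """Fractional lattice offset A produced by moving q samples."""
    x = q * Fraction(M * fc, fs)
    return x - M * round(x / M)

def covers_unit_interval(intervals):
    """True if the union of half-open intervals [lo, hi) contains [0, 1)."""
    reach = Fraction(0)
    for lo, hi in sorted(intervals):
        if lo > reach:
            return False
        reach = max(reach, hi)
    return reach >= 1

def select_spacings(Q, fs, fc, M):
    """P_i: the smallest Q_i whose admissible F intervals cover [0, 1)."""
    P, intervals = [], []
    for q in Q:
        a = offset(q, fs, fc, M)
        P.append(q)
        intervals.append((max(Fraction(0), -a), min(Fraction(1), 1 - a)))
        if covers_unit_interval(intervals):
            return P
    raise ValueError("increase k_max: intervals do not yet cover [0, 1)")

fs, fc, M = 24_000_000, 1_023_000, 40           # GPS L1 C/A example values
Q = candidate_spacings(fs, fc, M, k_max=2)
P = select_spacings(Q, fs, fc, M)
```

Exact fractions are used so that boundary cases of the strict inequalities are not decided by floating-point rounding.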

An evaluation embodiment of a satellite navigation system real-time chip shape correlator and method based on GPU parallel computing according to the present application is given below.

For the GPS L1 C/A signal, the code rate is $f_c = 1.023$ MHz. The signal is sampled at $f_s = 24$ MHz, and the number of lattices per chip in the signal compression unit 102 is $M = 40$. Following the signal compression method based on GPU parallel computing, the values of $P_i$ are determined first. Substituting the parameters into inequality (13) gives:

$$\frac{40k - 1}{1.705} < n_2 - n_1 < \frac{40k + 1}{1.705}$$

For $k = 1$ the inequality becomes:

$$22.87 < n_2 - n_1 < 24.05$$

$$n_2 - n_1 = 23, 24$$

For $k = 2$ the inequality becomes:

$$46.33 < n_2 - n_1 < 47.51$$

$$n_2 - n_1 = 47$$

So $Q_1 = 23$, $Q_2 = 24$, and $Q_3 = 47$ are the 3 smallest possible values of the interval between sampling points belonging to the same chip lattice.

From $Q_1 = 23$, $Q_2 = 24$, and $Q_3 = 47$, the corresponding fractional lattice offsets are $A_1 = -0.785$, $A_2 = 0.92$, and $A_3 = 0.135$. For the sampling point to remain in the same chip lattice after each offset, the initial fractional lattice offset must lie in the corresponding interval:

$$F_1 \in [0.785, 1), \quad F_2 \in [0, 0.08), \quad F_3 \in [0, 0.865)$$

Since

$$[0.785, 1) \cup [0, 0.08) \cup [0, 0.865) = [0, 1)$$

every initial fractional lattice offset is covered, and the interval between adjacent sampling points belonging to the same chip lattice can take only three values: $P_1 = 23$, $P_2 = 24$, and $P_3 = 47$.

With the parameters $P_1$, $P_2$, and $P_3$, the signal compression algorithm based on GPU parallel computing can compress the L1 C/A signal in parallel, achieving real-time measurement of correlation peaks and chip shapes.

According to another aspect of the present application, a method for GPU-based parallel computing is provided. The method comprises: generating a corresponding mask signal from the chip edges of the instantaneous local pseudorandom noise code; compressing the product of the mask signal and the input signal using the instantaneous local pseudorandom noise code to generate a compressed signal, and processing the compressed signal to generate data for measuring the chip shape of the input signal; and/or compressing the input signal according to the instantaneous local pseudorandom noise code, an early pseudorandom noise code, and a late pseudorandom noise code to generate a compressed signal, and processing the compressed signal to generate data for measuring the correlation peak of the cross-correlation between the input signal and the local pseudorandom noise code. The early pseudorandom noise code and the late pseudorandom noise code respectively lead and lag the instantaneous local pseudorandom noise code by a predetermined number of chips.

Illustratively, the predetermined number of chips may be 1 chip.

Illustratively, the compressed signal may be one chip in size and may be divided into a plurality of chip lattices.

Illustratively, the step of generating the compressed signal may comprise: determining an interval between adjacent sampling points of the compressed signal belonging to the same chip lattice, finding all sampling points belonging to the same chip lattice according to the determined interval, and mapping all sampling points of the compressed signal belonging to the same chip lattice to one thread of the unified computing device architecture CUDA to perform parallel computation.

Illustratively, generating the corresponding mask signal from the chip edges of the instantaneous local pseudorandom noise code may comprise: in response to the edges between two adjacent chips of the instantaneous local pseudorandom noise code respectively conforming to (1) a rising edge, (2) a falling edge, (3) hold +1, (4) hold-1, generating a mask signal from the chip edges of the instantaneous local pseudorandom noise code, wherein the values of two adjacent half chips of the mask signal are consistent with the instantaneous local pseudorandom noise code and the values of the remaining chips are 0.

Illustratively, the above method may further comprise: determining a real-time chip shape of the input signal, including a chip rising edge and a chip falling edge, from the data generated by the signal compression unit 102 for measuring the chip shape of the input signal; and performing sliding accumulation on the data generated by the signal compression unit 102 for measuring the correlation peak of the cross-correlation between the input signal and the local pseudorandom noise code, to obtain the correlation peak.

The application processes all sampling points of a tracking channel in parallel in the signal compression unit. Each sampling point is mapped to a thread according to the chip lattice to which it belongs, and each thread efficiently finds all sampling points belonging to its chip lattice by determining the interval between adjacent sampling points of that lattice, and then accumulates them. This parallel computing scheme greatly reduces the number of registers required by each thread, improves computational efficiency, and raises the upper limit on the resolution of chip shape and correlation peak measurements.
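The per-thread work described above can be emulated sequentially as follows. The application does not spell out how the interval is computed, so this is only one possible approach: the hypothetical function `lattice_sum` plays the role of the thread that owns lattice `k`, revisits that lattice once per chip (an interval of fs/fc samples), and computes directly which sample indices fall inside it.

```python
import math

def lattice_sum(k, samples, code, fs, fc, phase0, n_bins):
    """Work of one hypothetical thread: accumulate every sample in lattice k."""
    step = fc / fs                          # code phase advance per sample, in chips
    last_phase = phase0 + (len(samples) - 1) * step
    total = 0.0
    chip = math.floor(phase0)
    while chip <= last_phase:
        lo = chip + k / n_bins              # lattice k of this chip covers [lo, hi)
        hi = chip + (k + 1) / n_bins
        first = max(0, math.ceil((lo - phase0) / step))
        stop = min(len(samples), math.ceil((hi - phase0) / step))
        for n in range(first, stop):        # samples whose phase lies in [lo, hi)
            total += samples[n] * code[chip % len(code)]
        chip += 1                           # next visit is one chip later
    return total
```

Because each lattice is owned by exactly one thread in this arrangement, a thread only needs a single running sum, and no two threads write to the same accumulator.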

According to another aspect of the present application, there is provided an apparatus for GPU-based parallel computing, comprising: a memory storing computer-executable instructions; and a processor executing instructions to implement the method described above.

According to another aspect of the present application, there is provided a storage medium comprising computer-executable instructions which, when executed, implement the method described above.

Fig. 4 shows the result 40 of the chip shape correlator monitoring the 1 s smoothed correlation peak of satellite No. 1 of the GPS L1 C/A signal, in accordance with an embodiment of the present application. The signal is generated by a Spirent GSS9000 simulator, compressed by the signal compression unit 102, and then subjected to sliding accumulation by the accumulation unit 104 to obtain multiple correlation values, thereby realizing real-time measurement of the correlation peak. With 40 lattices per chip, a maximum of 81 correlation values can be generated; in the present embodiment, 39 correlation values are generated at intervals of 0.05 chip. The figure shows that the measured correlation peak is very close to the standard correlation peak, which indicates that the currently received signal is normal.
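The sliding accumulation and the count of 81 values can be sketched as follows. The layout is an assumption made for illustration: the compressed signals obtained with the early, instantaneous, and late codes are taken to be three lists of per-lattice sums (here named `early`, `prompt`, and `late`), concatenated in that order, and a one-chip window is slid across them.

```python
def sliding_correlation(early, prompt, late):
    """Slide a one-chip window over the [early | prompt | late] lattice sums."""
    n = len(prompt)
    window = early + prompt + late          # three chips of lattice sums
    prefix = [0.0]
    for v in window:
        prefix.append(prefix[-1] + v)       # running sum for O(1) window sums
    # An offset of m lattices (m = -n .. n) is a code delay of m / n chips.
    return [prefix[2 * n + m] - prefix[n + m] for m in range(-n, n + 1)]
```

With n lattices per chip the window can take 2n + 1 positions, which gives 81 values for n = 40. Keeping every second position between -0.95 and +0.95 chip would give 39 values at 0.05 chip spacing, which is one selection consistent with the numbers quoted above.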

Figs. 5 and 6 show the results 50 and 60, respectively, of the chip shape correlator monitoring the 1 s smoothed correlation peak of satellite No. 1 when in-phase and anti-phase multipath, with an amplitude of 0.5 and a delay of 0.2 chip, is added to the GPS L1 C/A signal. As can be seen from the figures, the slope of the correlation function changes significantly at the fourth sampling point of the correlation peak, from which it can be determined that the multipath delay is 0.2 chip; the phase and amplitude of the multipath can be calculated from the specific slope. This shows that the chip shape correlator according to embodiments of the present application can monitor distortions of the received signal, such as multipath, through real-time measurement of the correlation peak.
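The reasoning about the slope change can be checked with a small idealized model. The sketch assumes an infinite-bandwidth signal, so that the correlation function is a perfect triangle, and a single multipath ray; the function names are hypothetical, and a measured peak would be rounded by the front-end filter.

```python
def triangle(tau):
    """Ideal PRN autocorrelation for an infinite-bandwidth signal (tau in chips)."""
    return max(0.0, 1.0 - abs(tau))

def composite(tau, amplitude, delay):
    """Direct path plus one multipath ray; a negative amplitude models anti-phase."""
    return triangle(tau) + amplitude * triangle(tau - delay)

def slope_breaks(amplitude, delay, spacing=0.05, span=19):
    """Return the offsets (in chips) at which the sampled slope changes."""
    taus = [i * spacing for i in range(-span, span + 1)]
    vals = [composite(t, amplitude, delay) for t in taus]
    slopes = [(b - a) / spacing for a, b in zip(vals, vals[1:])]
    return [round(taus[i + 1], 2)
            for i in range(len(slopes) - 1)
            if abs(slopes[i + 1] - slopes[i]) > 1e-6]
```

In this model, to the right of the peak the slope is -1 + a up to the multipath delay and -1 - a beyond it, where a is the signed multipath amplitude. With 0.05 chip spacing, the break for a 0.2 chip delay therefore falls on the fourth sample after the peak, and the size and sign of the slope change carry the amplitude and phase information.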

Fig. 7 shows the result 70 of the chip shape correlator monitoring the 1 s smoothed chip rising edge of satellite No. 1 of the GPS L1 C/A signal, in accordance with an embodiment of the present application. Within one chip, with a chip phase from -0.5 to 0.5, there are 40 lattices, so the code phase interval is 0.025 chip and the shape of the chip rising edge can be clearly represented.

Figs. 8 and 9 show the results 80 and 90, respectively, of the chip shape correlator monitoring the 1 s smoothed chip rising edge of satellite No. 1 when in-phase and anti-phase multipath, with an amplitude of 0.5 and a delay of 0.2 chip, is added to the GPS L1 C/A signal, according to an embodiment of the present application. As can be seen from the figures, multipath causes a large change in the chip shape. The chip rising edge information obtained can be used in subsequent processing such as multipath mitigation.

Table 1 below shows the average time required by the chip shape correlator method according to an embodiment of the present application and by a conventional multi-channel correlator method to process 1 ms of data in the tracking stage. The GPU used in the test was an NVIDIA GeForce RTX 3080. The test simultaneously processed the L1 C/A and L5 signals of GPS, the B1C, B2a, and B1I signals of BDS, and the E1 OS and E5a signals of Galileo, that is, 7 signals with 12 channels each, for a total of 84 channels. The conventional multi-channel correlator method used for comparison was also computed in parallel on the GPU and output 5 correlation values. The chip shape correlator according to the embodiment of the present application was tested in two cases: outputting only 39 correlation values, and outputting the chip rising edge shape together with the 39 correlation values. Because signal compression reduces the amount of computation, the chip shape correlator takes less time to compute 39 correlation values than the conventional multi-channel correlator takes to compute 5. Additionally computing the chip rising edge increases the time consumed, but an average of 0.58 ms to process 1 ms of data still guarantees real-time operation of the receiver.

Table 1. Average time required by the chip shape correlator method and the conventional multi-channel correlator method to process 1 ms of data

Method	Average time
Conventional multi-channel correlator outputting 5 correlation values	0.458 ms
Chip shape correlator outputting 39 correlation values	0.406 ms
Chip shape correlator outputting 39 correlation values and the chip rising edge	0.580 ms
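The real-time margin implied by Table 1 can be worked out with a few lines of arithmetic. The timings and the channel count are copied from the text above; the name `realtime_load` and the amortized per-channel figure are illustrative additions, not measurements reported in the application.

```python
# Average processing time per 1 ms of data, copied from Table 1 (milliseconds).
TIMINGS_MS = {
    "multi-channel correlator, 5 values": 0.458,
    "chip shape correlator, 39 values": 0.406,
    "chip shape correlator, 39 values + rising edge": 0.580,
}
CHANNELS = 84    # 7 signals x 12 channels, as stated in the test setup
DATA_MS = 1.0    # length of the data block being processed

def realtime_load(processing_ms, data_ms=DATA_MS):
    """Fraction of the available time used; below 1.0 keeps up with real time."""
    return processing_ms / data_ms

for name, t in TIMINGS_MS.items():
    per_channel_us = 1000 * t / CHANNELS    # amortized over all channels
    print(f"{name}: load {realtime_load(t):.1%}, {per_channel_us:.2f} us per channel")
```

A load below 100% means that the correlator finishes each 1 ms block before the next one arrives, which is the sense in which the text speaks of real-time operation.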

The above embodiments are merely illustrative of the present application and are not intended to be limiting. Those skilled in the relevant art may make various changes and modifications to the disclosed embodiments and examples without departing from the spirit and scope of the application; all such equivalent technical solutions are therefore intended to fall within the scope of the application, as defined by the following claims.

Claims

1. A real-time chip shape correlator based on GPU parallel computing, comprising:

a mask generation unit configured to generate a corresponding mask signal from chip edges of the instantaneous local pseudorandom noise code; and

a signal compression unit configured to:

compress a product of the mask signal and an input signal using the instantaneous local pseudorandom noise code to generate a compressed signal, and compute on the compressed signal to generate data for measuring a chip shape of the input signal; and/or

compress the input signal according to the instantaneous local pseudorandom noise code, an early pseudorandom noise code, and a late pseudorandom noise code to generate a compressed signal, and compute on the compressed signal to generate data for measuring a correlation peak of the cross-correlation between the input signal and the local pseudorandom noise code; wherein the early pseudorandom noise code and the late pseudorandom noise code respectively lead and lag the instantaneous local pseudorandom noise code by a predetermined number of chips.

2. The real-time chip shape correlator of claim 1 wherein the predetermined number of chips comprises 1 chip.

3. The real-time chip shape correlator according to any one of claims 1-2, wherein the compressed signal is one chip in size and divided into a plurality of chip lattices;

wherein the signal compression unit is further configured to:

determine the interval between adjacent sampling points of the compressed signal belonging to the same chip lattice,

find all sampling points belonging to the same chip lattice according to the determined interval, and

map all sampling points of the compressed signal belonging to the same chip lattice to one thread of the Compute Unified Device Architecture (CUDA) to perform parallel computation.

4. The real-time chip shape correlator of claim 1, wherein, when the edge between two adjacent chips of the instantaneous local pseudorandom noise code is, respectively, (1) a rising edge, (2) a falling edge, (3) a hold at +1, or (4) a hold at -1, the values of the two adjacent half chips of the mask signal generated from the chip edges of the instantaneous local pseudorandom noise code are consistent with the instantaneous local pseudorandom noise code, and the values of the remaining chips are 0.

5. The real-time chip shape correlator according to claim 1, further comprising:

a chip shape measurement unit configured to determine a real-time chip shape of the input signal including a chip rising edge and a chip falling edge from data for measuring the chip shape of the input signal generated by the signal compression unit; and

an accumulation unit configured to perform sliding accumulation on the data generated by the signal compression unit for measuring the correlation peak of the cross-correlation between the input signal and the local pseudorandom noise code, to obtain the correlation peak.

6. A method of GPU-based parallel computing, comprising:

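generating a corresponding mask signal according to the chip edges of the instantaneous local pseudorandom noise code;

compressing the product of the mask signal and an input signal using the instantaneous local pseudorandom noise code to generate a compressed signal, and computing on the compressed signal to generate data for measuring the chip shape of the input signal; and/or

compressing the input signal according to the instantaneous local pseudorandom noise code, an early pseudorandom noise code, and a late pseudorandom noise code to generate a compressed signal, and computing on the compressed signal to generate data for measuring a correlation peak of the cross-correlation between the input signal and the local pseudorandom noise code; wherein the early pseudorandom noise code and the late pseudorandom noise code respectively lead and lag the instantaneous local pseudorandom noise code by a predetermined number of chips.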

7. The method of claim 6, wherein the predetermined number of chips comprises 1 chip.

8. The method of any of claims 6-7, wherein the compressed signal is one chip in size and divided into a plurality of chip lattices;

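wherein generating the compressed signal comprises:

determining the interval between adjacent sampling points of the compressed signal belonging to the same chip lattice,

finding all sampling points belonging to the same chip lattice according to the determined interval, and

mapping all sampling points of the compressed signal belonging to the same chip lattice to one thread of the Compute Unified Device Architecture (CUDA) to perform parallel computation.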

9. The method of claim 6, wherein generating the corresponding mask signal from the chip edges of the instantaneous local pseudorandom noise code comprises:

in response to the edge between two adjacent chips of the instantaneous local pseudorandom noise code being, respectively, (1) a rising edge, (2) a falling edge, (3) a hold at +1, or (4) a hold at -1, generating the mask signal according to the chip edges of the instantaneous local pseudorandom noise code, wherein the values of the two adjacent half chips of the mask signal are consistent with the instantaneous local pseudorandom noise code and the values of the remaining chips are 0.

10. The method of claim 6, further comprising:

determining a real-time chip shape of the input signal, including a chip rising edge and a chip falling edge, from the data generated by the signal compression unit for measuring the chip shape of the input signal; and

performing sliding accumulation on the data generated by the signal compression unit for measuring the correlation peak of the cross-correlation between the input signal and the local pseudorandom noise code, to obtain the correlation peak.

11. An apparatus for GPU-based parallel computing, comprising:

a memory storing computer-executable instructions; and

a processor executing the instructions to implement the method of any one of claims 6-10.

12. A storage medium comprising computer-executable instructions that, when executed, implement the method of any of claims 6-10.