CN114896186B - Pre-training-based FPGA and external bus data interaction method - Google Patents

Publication number
CN114896186B
Authority
CN
China
Prior art keywords
delay
lookup table
bus
data
fpga
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210564575.0A
Other languages
Chinese (zh)
Other versions
CN114896186A (en
Inventor
李瑶
吕志武
樊周华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Computer Technology and Applications
Original Assignee
Beijing Institute of Computer Technology and Applications
Application filed by Beijing Institute of Computer Technology and Applications
Priority to CN202210564575.0A
Publication of CN114896186A
Application granted
Publication of CN114896186B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/382 Information transfer, e.g. on bus using universal interface adapter
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/14 Protection against unauthorised use of memory or access to memory
    • G06F12/1458 Protection against unauthorised use of memory or access to memory by checking the subject access rights
    • G06F12/1483 Protection against unauthorised use of memory or access to memory by checking the subject access rights using an access-table, e.g. matrix or list
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a pre-training-based method for data interaction between an FPGA and an external bus, and belongs to the technical field of information. The method takes into account how operating temperature, placement and routing, and other environmental conditions affect the propagation delay of the chip circuitry, and searches in parallel for the optimal delay parameter of the current bus by iteratively using a coarse lookup table and a fine lookup table. Compared with conventional FPGA and external bus data interaction methods, the pre-training-based mechanism improves the reliability of communication between the FPGA and the external bus and reduces the bit error rate. The method is also highly adaptable: it applies to buses with different interface rates, and by adaptively selecting between the coarse and fine lookup tables it finds the optimal delay value of the current bus quickly and accurately.

Description

Pre-training-based FPGA and external bus data interaction method
Technical Field
The invention belongs to the technical field of information, and particularly relates to a pre-training-based FPGA and external bus data interaction method.
Background
With the rapid development of high-performance embedded systems, the speed of interaction between FPGAs and buses keeps increasing. High-speed bus technology offers higher transmission rates and better signal integrity, and lends itself to miniaturized designs. It is therefore widely used in signal acquisition, high-speed transmission, video surveillance, and similar applications.
In the ideal case, when the FPGA exchanges data with an external bus, the data is sampled on a clock edge that falls exactly in the middle of the data. In practice, however, when an FPGA interacts with an external data bus at high speed, factors such as the placement and routing of the FPGA and its operating temperature change the delay of the chip circuitry to varying degrees. This reduces the timing margin of the FPGA and often introduces skew between the clock and the data; once the skew exceeds a certain range, data is sampled at the wrong position and interaction errors occur.
Current research on the synchronization problems that arise when an FPGA interacts with an external bus focuses mainly on three aspects:
Hardware aspect: in engineering practice, the most common way to improve the quality of data interaction between the FPGA and the external bus is length matching during PCB (printed circuit board) routing, which ensures that different signals experience the same delay before entering the FPGA. However, high-speed data transmission places high demands on hardware, and both the chip interface and the PCB routing can affect transmission quality. The approach has poor portability, flexibility, and adaptability: each different set of hardware must be laid out and routed again, which consumes manpower and material resources and raises production cost.
FPGA timing aspect: pages 63 to 64 of "Embedded High-Speed Serial Bus Technology: FPGA-Based Implementation and Application" by Zhang Feng, published in 2017, describe using OFFSET constraints and MAXSKEW constraints in the FPGA for timing optimization. These constraints fix the relative phase between the sampling clock and the data inside the FPGA so that the sampling clock falls exactly in the middle of the data. However, the constraint values cannot adjust themselves adaptively and must be computed with care. Unreasonable constraint values may lengthen the time the FPGA tools spend on PAR (Place and Route) and may even cause PAR to fail.
FPGA logic aspect: the article "FPGA Primitive-Based Low-Delay High-Speed Interface Implementation Method" by Arngzhi Xin, Wang Jiangwei, et al., published in 2021 in Microcomputer Application, Volume 37, Issue 5, pages 99 to 101, proposes building a PHY interface with dynamically adjustable output delay from FPGA primitives, using IODELAY to correct the timing problems caused by differing hardware routing delays. The method effectively reduces interface delay, works well for applications that require low latency and have tight timing, and is adaptive to a degree. However, its adaptive range is limited: it can only adjust the delay dynamically within a small range, so it cannot handle the synchronization problems of buses running at different rates.
In summary, none of the existing methods fully solves the synchronization problem faced by buses of different rates when communicating with the FPGA.
Disclosure of Invention
(I) Technical problem to be solved
The technical problem to be solved by the invention is how to provide a pre-training-based method for data interaction between an FPGA and an external bus that solves the synchronization problem faced by buses of different rates when communicating with the FPGA.
(II) Technical scheme
To solve the above technical problem, the invention provides a pre-training-based FPGA and external bus data interaction method, which comprises the following steps:
step 1, two lookup tables of delay parameters are pre-stored in the FPGA. One is a coarse lookup table, used for a preliminary coarse search on a low-speed bus. The other is a fine lookup table, used either to search for the optimal delay on a high-speed bus or to refine the result of the coarse search on a low-speed bus. The search step is Δ, and the coarse lookup table and the fine lookup table use different search steps;
step 2, in the initial condition with no delay applied, i.e. with delay parameter n = 0, the FPGA and the external bus first exchange a known training sequence and the result of the interaction is observed; if communication is initially in the error state, go to step 3; if communication is initially in the normal state, go to step 4;
step 3, when communication between the FPGA and the external bus is in the error state, search strategy 1 is adopted: the search range of the lookup table is set to [-clk, clk], where clk is the interface operating frequency. Starting from the middle of the lookup table, i.e. delay parameter n = 0, the search splits into two paths that proceed in parallel toward the front and the back of the lookup table, i.e. the delay parameters are searched in the order n = Δ, n = -Δ, n = 2Δ, n = -2Δ, and so on. Each time a delay parameter is looked up, the corresponding delay is applied to the data or the clock;
when one path detects that data is exchanged normally, that path stops searching and the delay parameter at the current position in the lookup table is recorded as D1; the other path then searches from the end of the lookup table that lies in the direction of D1, stops once it finds a delay at which interaction is normal, and records the delay parameter at that position as D2; D1 and D2 are the distances between the front and rear boundary positions of the data and the current sampling clock edge;
if the current communication bus is a high-speed bus, search strategy 1 uses the fine lookup table and the optimal delay parameter is determined directly; if the current communication bus is a low-speed bus, search strategy 1 uses the coarse lookup table, the boundary positions of the data must then be confirmed more precisely with the fine lookup table, and the method goes to step 5;
step 4, when communication between the FPGA and the external bus is in the normal state, search strategy 2 is adopted: the search range of the lookup table is set to [-clk, clk]. Starting from the middle position, i.e. delay parameter n = 0, the search splits into two paths that proceed in parallel toward the front and the back of the lookup table, i.e. the delay parameters are searched in the order n = Δ, n = -Δ, n = 2Δ, n = -2Δ, and so on.
When one path detects an interaction error between the FPGA and the external bus, that path stops searching and the delay parameter at the current position in the lookup table is recorded as D1; the other path keeps searching, stops once it also finds a delay at which the interaction fails, and records the delay parameter at that position as D2;
if the current communication bus is a high-speed bus, search strategy 2 uses the fine lookup table and the optimal delay value is determined directly; if the current communication bus is a low-speed bus, search strategy 2 uses the coarse lookup table, the boundary positions of the data must then be confirmed more precisely with the fine lookup table, and the method goes to step 5;
step 5, once the coarse lookup table has identified D1 as a data boundary position, search strategy 3 is adopted: the search range of the fine lookup table is set to [D1-5ns, D1+5ns]. Starting from D1, the search splits into two paths that proceed in parallel toward the front and the back of the lookup table, i.e. the delay parameters are searched in the order n = D1+Δ, n = D1-Δ, n = D1+2Δ, n = D1-2Δ, and so on. Each time a delay parameter is looked up, the corresponding delay is applied to the data or the clock;
when a change in the data interaction condition is detected, both paths stop searching and the delay parameter at the current position in the lookup table is recorded as d1;
once the coarse lookup table has identified D2 as the other boundary position of the data, search strategy 3 is adopted again: the search range of the fine lookup table is set to [D2-5ns, D2+5ns]. Starting from D2, the search splits into two paths that proceed in parallel toward the front and the back of the lookup table, i.e. the delay parameters are searched in the order n = D2+Δ, n = D2-Δ, n = D2+2Δ, n = D2-2Δ, and so on. Each time a delay parameter is looked up, the corresponding delay is applied to the data or the clock;
when a change in the data interaction condition is detected, both paths stop searching and the delay parameter at the current position in the lookup table is recorded as d2;
the optimal delay value of the bus is calculated from d1 and d2;
step 6, after the optimal delay value has been determined, the FPGA applies the corresponding delay to the data or the clock according to the optimal delay value, and valid data is transmitted from then on.
Further, the low-speed bus is a bus with the interface working frequency not more than 100MHz, and the high-speed bus is a bus with the interface working frequency more than 100 MHz.
Further, the coarse lookup table has a search step of Δ=5ns and the fine lookup table has a search step of Δ=78ps.
Further, when performing a refinement lookup in the low speed bus, the lookup range of the fine lookup table is [ D-5ns, D+5ns ], where D is the coarse lookup result of the coarse lookup table; when the best delay search is performed in the high-speed bus, the search range of the fine search table is [ -clk, clk ].
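The table selection and search ranges described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes delays are expressed in picoseconds and that the bound written as clk in the range [-clk, clk] is one clock period expressed as a time. All function names are hypothetical.

```python
COARSE_STEP_PS = 5_000      # coarse lookup table step: 5 ns
FINE_STEP_PS = 78           # fine lookup table step: 78 ps
LOW_SPEED_LIMIT_MHZ = 100   # buses at or below this frequency are low-speed


def is_low_speed(freq_mhz):
    """Low-speed bus: interface operating frequency not more than 100 MHz."""
    return freq_mhz <= LOW_SPEED_LIMIT_MHZ


def build_table(lo_ps, hi_ps, step_ps, center_ps=0):
    """Ascending delay entries center + k*step that fall inside [lo, hi]."""
    k_min = -((center_ps - lo_ps) // step_ps)
    k_max = (hi_ps - center_ps) // step_ps
    return [center_ps + k * step_ps for k in range(k_min, k_max + 1)]


def first_pass_table(freq_mhz):
    """Table used by the first search: coarse for low-speed, fine for high-speed."""
    period_ps = round(1_000_000 / freq_mhz)   # assumption: clk bound = one period
    step = COARSE_STEP_PS if is_low_speed(freq_mhz) else FINE_STEP_PS
    return build_table(-period_ps, period_ps, step)


def refinement_table(coarse_d_ps):
    """Fine table over [D - 5 ns, D + 5 ns] around a coarse result D."""
    return build_table(coarse_d_ps - 5_000, coarse_d_ps + 5_000,
                       FINE_STEP_PS, center_ps=coarse_d_ps)
```

For a 100 MHz bus under these assumptions, `first_pass_table(100)` holds only five coarse entries, while a 200 MHz bus is searched directly over 129 fine entries.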
Further, the FPGA carries out corresponding delay on the data and the clock according to the currently searched delay parameter n, wherein n is a real number, and the data interaction condition is observed; the specific delay operation is as follows: when the delay parameter is positive number n > 0, then the clock is delayed backward by n ns or ps; when the delay parameter is negative number n < 0, the data is delayed backward by |n|ns or ps.
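The sign convention above can be illustrated with a short sketch. It assumes that a physical delay element can only add delay, so a signed parameter n is realized by delaying either the clock or the data, never both. The function name `split_delay` is hypothetical.

```python
def split_delay(n):
    """Map a signed delay parameter n to (clock_delay, data_delay).

    Both returned values are non-negative and use the same unit as n
    (ns or ps). Positive n delays the clock, negative n delays the data.
    """
    if n > 0:
        return (n, 0.0)
    if n < 0:
        return (0.0, -n)
    return (0.0, 0.0)
```

In this sketch the difference `clock_delay - data_delay` always equals n, so the signed parameter can be read as the shift of the sampling edge relative to the data.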
Further, in the step S2, if the external bus is a parallel bus, the received data is compared directly with the known training sequence; if more than 4 consecutive bytes match the training sequence, communication between the FPGA and the external parallel bus is considered to be in the normal state, and otherwise in the error state. If the external bus is a serial bus, serial-to-parallel conversion is performed as soon as the data is received, and the converted data is compared with the known training sequence; if more than 4 consecutive bytes match the training sequence, communication between the FPGA and the external serial bus is considered to be in the normal state, and otherwise in the error state.
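The training-sequence check can be sketched as below. The text speaks of "more than 4 consecutive bytes", and the embodiment glosses the threshold as 32 bits; the sketch therefore exposes the threshold as a parameter `min_run` with a default of 4 bytes, which is an interpretation rather than something the text fixes. The bit order of the serial-to-parallel conversion (MSB first) is likewise an assumption, and all names are hypothetical.

```python
def longest_matching_run(received, expected):
    """Length of the longest run of consecutive positions where bytes match."""
    best = run = 0
    for r, e in zip(received, expected):
        run = run + 1 if r == e else 0
        best = max(best, run)
    return best


def link_ok(received, expected, min_run=4):
    """Normal state if at least min_run consecutive bytes match (assumed threshold)."""
    return longest_matching_run(received, expected) >= min_run


def bits_to_bytes(bits):
    """Serial-to-parallel conversion, MSB first; a trailing partial byte is dropped."""
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        value = 0
        for b in bits[i:i + 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)
```

For a serial bus, the received bit stream would first pass through `bits_to_bytes` and the result would then be handed to `link_ok`, mirroring the two cases in the text.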
Further, in the step S3, if the current communication bus is a high-speed bus, search strategy 1 uses the fine lookup table; after the data boundary values D1 and D2 have been found, the delay value delay_h between the optimal sampling position, located at the middle of the data, and the current sampling clock edge is calculated and recorded as the optimal delay value;
|D2|>|D1|>0,|delay_h|>0
if the current communication bus is a low-speed bus, search strategy 1 uses the coarse lookup table; after the data boundary values D1 and D2 have been found, the boundary positions of the data must be confirmed more precisely with the fine lookup table, and the method goes to step 5.
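Search strategy 1 can be sketched in software as follows. The sketch makes several assumptions that the text does not state: the link is modeled by a hypothetical predicate `ok(n)` that returns True when the training sequence is received correctly at delay n, the two parallel hardware paths are interleaved sequentially, the second path is read as walking inward from the table end on the side of D1, and the middle of the data is taken as the arithmetic mean of D1 and D2.

```python
def search_strategy_1(ok, step, limit):
    """Initial state is 'error'. Return (D1, D2), or None if no delay works."""
    k_max = int(limit // step)
    d1 = None
    for k in range(1, k_max + 1):
        for n in (k * step, -k * step):      # order: +Δ, -Δ, +2Δ, -2Δ, ...
            if ok(n):
                d1 = n
                break
        if d1 is not None:
            break
    if d1 is None:
        return None
    sign = 1 if d1 > 0 else -1
    # Second path: start at the table end on D1's side and walk inward.
    # It always terminates because D1 itself is a working delay.
    d2 = next(sign * k * step for k in range(k_max, 0, -1) if ok(sign * k * step))
    return d1, d2


def midpoint(a, b):
    """Assumed formula for the optimal delay: the mean of the two boundaries."""
    return (a + b) / 2
```

With a fine step of 78 ps, a range of 5000 ps, and a toy window in which delays from 1000 ps to 3000 ps work, the search finds D1 = 1014 and D2 = 2964, which satisfies |D2| > |D1| > 0 and gives a midpoint of 1989 ps.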
Further, in the step S4,
if the current communication bus is a high-speed bus, search strategy 2 uses the fine lookup table; after the data boundary values D1 and D2 have been found, the delay value delay_h between the optimal sampling position, located at the middle of the data, and the current sampling point is calculated and recorded as the optimal delay value;
|D2|≥|D1|>0,delay_h∈R
if the current communication bus is a low-speed bus, search strategy 2 uses the coarse lookup table; after the data boundary values D1 and D2 have been found, the boundary positions of the data must be confirmed more precisely with the fine lookup table, and the method goes to step 5.
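Search strategy 2 differs from strategy 1 in that both paths start inside the working window and each walks outward until its first error. A minimal sketch under stated assumptions: `ok(n)` is a hypothetical predicate for "training sequence received correctly at delay n", the parallel paths are interleaved sequentially, and the optimal delay is assumed to be the arithmetic mean of the two recorded values.

```python
def search_strategy_2(ok, step, limit):
    """Initial state is 'normal'. Return (D1, D2), the first failing delay
    on each side in the order found, or None if a side never fails in range."""
    k_max = int(limit // step)
    found = {}    # direction sign -> first failing delay on that side
    order = []    # direction signs in the order their paths stopped
    for k in range(1, k_max + 1):
        for sign in (1, -1):                 # order: +Δ, -Δ, +2Δ, -2Δ, ...
            if sign in found:
                continue                     # this path has already stopped
            n = sign * k * step
            if not ok(n):
                found[sign] = n
                order.append(sign)
        if len(order) == 2:
            break
    if len(order) < 2:
        return None
    return found[order[0]], found[order[1]]


def midpoint(a, b):
    """Assumed formula for the optimal delay: the mean of the two boundaries."""
    return (a + b) / 2
```

For a toy window from -500 ps to 1500 ps and a 78 ps step, the negative path fails first, so D1 = -546 and D2 = 1560, and the assumed midpoint is 507 ps, within one step of the true center at 500 ps. A window that is symmetric about zero yields a midpoint of exactly 0, which is consistent with delay_h being allowed to take any real value in this case.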
Further, in the step S5, a change in the interaction condition means a change from normal to error or from error to normal; the optimal delay value of the bus is calculated from d1 and d2 as follows:
the delay value delay_l between the optimal sampling position, located at the middle of the data, and the current sampling point is calculated and recorded as the optimal delay value of the bus;
|d2|≥|d1|>0,delay_l∈R。
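The fine refinement of search strategy 3 can be sketched as below. Assumptions: delays are in picoseconds, `ok(n)` is a hypothetical predicate for "training sequence received correctly at delay n", the two parallel paths are interleaved sequentially, and delay_l is taken to be the arithmetic mean of d1 and d2, which is one plausible reading of "middle of the data".

```python
def refine_boundary(ok, coarse_d, fine_step=78, half_range=5000):
    """Search outward from a coarse boundary in fine steps and return the
    first delay whose interaction state differs from the state at coarse_d.
    If no change occurs within the range, the coarse value is returned."""
    reference = ok(coarse_d)
    for k in range(1, half_range // fine_step + 1):
        for n in (coarse_d + k * fine_step, coarse_d - k * fine_step):
            if ok(n) != reference:
                return n
    return coarse_d


def optimal_delay(d1, d2):
    """Assumed formula: the optimal delay is the mean of the refined boundaries."""
    return (d1 + d2) / 2
```

As a usage example, take a toy low-speed link whose working window is 3200 ps to 11800 ps and suppose the coarse search returned D1 = 5000 and D2 = 10000. Refinement gives d1 = 3128 and d2 = 11872, each within one fine step of the true edge, and the assumed optimal delay is 7500 ps, the center of the window.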
further, the delay operation in the step S6 is as follows:
when the current optimal delay value is positive, namely delay is more than 0, the clock is delayed backwards by delay ns or ps;
when the current optimal delay value is negative, i.e. delay < 0, the data is delayed backward by |delay|ns or ps.
(III) Beneficial effects
The invention provides a pre-training-based method for data interaction between an FPGA and an external bus. The method takes into account how operating temperature, placement and routing, and other environmental conditions affect the propagation delay of the chip circuitry, and searches in parallel for the optimal delay parameter of the current bus by iteratively using a coarse lookup table and a fine lookup table. Compared with conventional FPGA and external bus data interaction methods, the pre-training-based mechanism improves the reliability of communication between the FPGA and the external bus and reduces the bit error rate. The method is also highly adaptable: it applies to buses with different interface rates, and by adaptively selecting between the coarse and fine lookup tables it finds the optimal delay value of the current bus quickly and accurately.
Drawings
FIG. 1 is a schematic diagram of the pre-training-based data interaction flow between the FPGA and an external bus;
FIG. 2 is a schematic diagram illustrating the delay adjustment of the high-speed bus in the case of an initial communication error according to the present invention;
FIG. 3 is a schematic diagram of the delay adjustment of the high-speed bus under the normal condition of initial communication according to the present invention;
FIG. 4 is a schematic diagram illustrating the delay adjustment of the low-speed bus in the case of an initial communication error according to the present invention;
FIG. 5 is a diagram illustrating a fine look-up table search of the low speed bus in the case of an initial communication error according to the present invention;
FIG. 6 is a schematic diagram of the delay adjustment of the low-speed bus under the normal condition of initial communication according to the present invention;
FIG. 7 is a diagram illustrating a fine look-up table search of the low speed bus under normal initial communication conditions.
Detailed Description
To make the objects, contents and advantages of the present invention more apparent, the following detailed description of the present invention will be given with reference to the accompanying drawings and examples.
In view of this, the invention proposes a pre-training-based mechanism for data interaction between an FPGA and an external bus. Building on the conventional method, it adds pre-training based on a known sequence: before valid data is exchanged between the FPGA and the external bus, a known training sequence is transmitted to obtain the optimal delay value, i.e. the value at which data is exchanged stably and the timing margin is largest. This reduces the bit error rate and improves the reliability of communication.
Example 1:
With reference to FIG. 1, the specific implementation steps of the proposed interaction mechanism are explained below:
Step 1, two lookup tables of delay parameters are pre-stored in the FPGA.
One lookup table is the coarse lookup table, used for a preliminary coarse search on a bus whose interface operating frequency is not more than 100 MHz (hereinafter referred to as a "low-speed bus"). The search step of the coarse lookup table is 5 ns (Δ = 5 ns) and its search range is [-clk, clk] (clk is the interface operating frequency);
The other lookup table is the fine lookup table, used either to refine the result of the coarse search on a low-speed bus or to search directly for the optimal delay on a bus whose interface operating frequency is greater than 100 MHz (hereinafter referred to as a "high-speed bus"). The search step of the fine lookup table is 78 ps (Δ = 78 ps), and its search range depends on the type of the current interface bus: when refining on a low-speed bus, the range is [D-5ns, D+5ns], where D is the result of the coarse lookup table; when searching for the optimal delay on a high-speed bus, the range is [-clk, clk] (clk is the interface operating frequency).
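The benefit of combining the two tables on a low-speed bus can be seen from a rough count of worst-case search points. The sketch below is an illustration only: it assumes the bound clk in [-clk, clk] is one clock period expressed in picoseconds, counts every table entry as one trial, and uses hypothetical function names.

```python
COARSE_STEP_PS = 5_000   # 5 ns
FINE_STEP_PS = 78        # 78 ps


def entries(half_range_ps, step_ps):
    """Number of entries k*step inside [-half_range, +half_range], including 0."""
    return 2 * (half_range_ps // step_ps) + 1


def worst_case_points(period_ps):
    """Return (fine_only, two_stage) worst-case trial counts for one bus.

    fine_only: the whole range [-period, period] scanned with the fine table.
    two_stage: coarse scan of [-period, period] plus one fine scan of
               ±5 ns around each of the two coarse boundaries.
    """
    fine_only = entries(period_ps, FINE_STEP_PS)
    two_stage = entries(period_ps, COARSE_STEP_PS) + 2 * entries(5_000, FINE_STEP_PS)
    return fine_only, two_stage
```

Under these assumptions a 50 MHz bus (20 ns period) needs at most 267 trials with the two-stage search against 513 with the fine table alone, and a 10 MHz bus (100 ns period) needs at most 299 against 2565.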
The FPGA applies the corresponding delay to the data or the clock according to the delay parameter n currently looked up (n is a real number) and observes the data interaction condition. The specific delay operation is as follows:
when the delay parameter is positive (n > 0), the clock is delayed by n ns or ps;
when the delay parameter is negative (n < 0), the data is delayed by |n| ns or ps.
Step 2, in the initial condition with no delay (delay parameter n = 0), the FPGA and the external bus first exchange a known training sequence, and the result of the interaction is observed.
If the external bus is a parallel bus, the received data is compared directly with the known training sequence. If more than 4 consecutive bytes match the training sequence, communication between the FPGA and the external parallel bus is considered to be in the normal state; otherwise it is in the error state;
If the external bus is a serial bus, serial-to-parallel conversion is performed as soon as the data is received, and the converted data is compared with the known training sequence. If more than 4 consecutive bytes (32 bits) match the training sequence, communication between the FPGA and the external serial bus is considered to be in the normal state; otherwise it is in the error state.
If communication is initially in the error state (high bit error rate), go to step 3; if communication is initially in the normal state (low bit error rate), go to step 4;
Step 3, when communication between the FPGA and the external bus is in the error state, search strategy 1 is adopted: the search range of the lookup table is set to [-clk, clk] (clk is the interface operating frequency). Starting from the middle of the lookup table (delay parameter n = 0), the search splits into two paths that proceed in parallel toward the front and the back of the lookup table, i.e. the delay parameters are searched in the order n = Δ, n = -Δ, n = 2Δ, n = -2Δ, and so on. Each time a delay parameter is looked up, the data or the clock is delayed accordingly using the method defined in step 1.
When one path detects that data is exchanged normally, that path stops searching and the delay parameter at the current position in the lookup table is recorded as D1. The other path then searches from the end of the lookup table that lies in the direction of D1, stops once it finds a delay at which interaction is normal, and records the delay parameter at that position as D2. D1 and D2 are the distances between the front and rear boundary positions of the data and the current sampling clock edge.
The optimal delay parameter is then determined according to the interface operating frequency of the current communication bus.
High-speed bus
If the current communication bus is a high-speed bus, i.e. the working frequency of the interface is greater than 100MHz, the searching strategy 1 adopts a fine searching table, and after the data boundary values D1 and D2 obtained by searching are used, the delay value delay_h between the best sampling position at the data intermediate position and the current sampling clock edge is calculated and recorded as the best delay value. The overall delay adjustment is schematically shown in fig. 2.
|D2|>|D1|>0,|delay_h|>0
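The text does not spell out the formula for delay_h at this point. A minimal sketch, assuming that the "middle position of the data" is the arithmetic mean of the two signed boundary delays, and that in the error-state case both boundaries lie on the same side of the current clock edge (which is what |D2| > |D1| > 0 and |delay_h| > 0 suggest). The function name is hypothetical.

```python
def best_delay_error_state(d1: float, d2: float) -> float:
    """Midpoint of the valid window found by search strategy 1 (assumed formula)."""
    # In the error-state case the window does not contain n = 0, so both
    # boundaries must share a sign and D2 must be the farther one.
    if d1 == 0 or d1 * d2 <= 0 or abs(d2) <= abs(d1):
        raise ValueError("expected |D2| > |D1| > 0 with D1 and D2 on the same side")
    return (d1 + d2) / 2.0
```

With D1 = 300 ps and D2 = 700 ps this gives delay_h = 500 ps; the result is never zero, consistent with |delay_h| > 0.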
Low-speed bus
If the current communication bus is a low-speed bus, i.e. the interface operating frequency is less than or equal to 100 MHz, search strategy 1 uses the coarse lookup table. After the data boundary values D1 and D2 have been found, the fine lookup table must additionally be used to confirm the boundary positions of the data precisely; go to step 5.
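The choice between the two lookup tables can be summarised in a few lines. The sketch below uses the 100 MHz threshold from the description and the step sizes given in claim 3 (5 ns coarse, 78 ps fine); the function and constant names are hypothetical.

```python
from typing import List

HIGH_SPEED_THRESHOLD_MHZ = 100.0  # buses above this frequency count as high-speed
COARSE_STEP_PS = 5_000            # coarse lookup table step, 5 ns (claim 3)
FINE_STEP_PS = 78                 # fine lookup table step, 78 ps (claim 3)


def plan_search_steps(interface_freq_mhz: float) -> List[int]:
    """Return the lookup-table steps, in picoseconds, in the order they are used."""
    if interface_freq_mhz > HIGH_SPEED_THRESHOLD_MHZ:
        # High-speed bus: a single pass with the fine lookup table.
        return [FINE_STEP_PS]
    # Low-speed bus: coarse pass first, then fine refinement of each boundary (step 5).
    return [COARSE_STEP_PS, FINE_STEP_PS]
```

A 100 MHz bus is treated as low-speed because the threshold is strict ("greater than 100 MHz").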
Step 4: when the communication between the FPGA and the external bus is in a normal state, search strategy 2 is adopted. The search range of the lookup table is set to [-clk, clk] (clk is the interface operating frequency). Starting from the middle position (delay parameter n=0), the search splits into two paths that run in parallel toward the front and the rear of the lookup table, that is, the delay parameters are searched in the order n=Δ, n=-Δ, n=2Δ, n=-2Δ, and so on. Each time a delay parameter is looked up, the data or the clock is delayed accordingly.
When one path detects an interaction error between the FPGA and the external bus, that path stops searching, and the delay parameter at the current position in the lookup table is recorded as D1. The other path continues searching, stops once it also finds a delay at which the interaction fails, and records the delay parameter at the current position in the lookup table as D2.
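Search strategy 2 can be modelled in the same style. The sketch below is an assumption-laden illustration: `link_ok` is a hypothetical callback that applies a delay in picoseconds and reports whether the training sequence is received correctly, the two parallel paths are interleaved in a single loop, and the function returns None if either path reaches the end of the range without seeing an error (a case the text does not discuss).

```python
from typing import Callable, Optional, Tuple


def search_from_normal_state(
    link_ok: Callable[[int], bool],   # hypothetical: True if training data matches at this delay
    half_range_ps: int,               # half-width of the search range, in picoseconds
    step_ps: int,                     # lookup-table step (the patent's delta), in picoseconds
) -> Optional[Tuple[int, int]]:
    """Return (D1, D2): the first failing delay on each side of n = 0."""
    max_k = half_range_ps // step_ps
    found = []        # boundaries in order of discovery: D1 first, then D2
    active = [1, -1]  # directions that are still searching
    for k in range(1, max_k + 1):
        for sign in list(active):     # iterate over a copy so removal is safe
            n = sign * k * step_ps
            if not link_ok(n):
                found.append(n)       # this path hit the data boundary
                active.remove(sign)   # and stops searching
        if not active:
            break
    if len(found) < 2:
        return None
    return found[0], found[1]
```

Because the path that fails first is recorded as D1, the result always satisfies |D2| ≥ |D1| > 0. For a link that works for delays strictly between -250 ps and 450 ps, with a 100 ps step, the sketch returns D1 = -300 and D2 = 500.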
The optimal delay parameter is then determined according to the interface operating frequency of the current communication bus.
High-speed bus
If the current communication bus is a high-speed bus, i.e. the interface operating frequency is greater than 100 MHz, search strategy 2 uses the fine lookup table. Once the data boundary values D1 and D2 have been found, the delay value delay_h between the optimal sampling position, at the middle of the data, and the current sampling point is calculated and recorded as the optimal delay value. The overall delay adjustment is shown schematically in Fig. 3.
|D2|≥|D1|>0,delay_h∈R
Low-speed bus
If the current communication bus is a low-speed bus, i.e. the interface operating frequency is less than or equal to 100 MHz, search strategy 2 uses the coarse lookup table. After the data boundary values D1 and D2 have been found, the fine lookup table must additionally be used to confirm the boundary positions of the data precisely; go to step 5.
Step 5: in the case of a low-speed bus, the coarse lookup table has already located the front and rear boundary positions of the data near D1 and D2, respectively. The fine lookup table is then used to pinpoint the data boundaries.
Once the coarse lookup table has identified D1 as a data boundary position, search strategy 3 is adopted. The search range of the fine lookup table is set to [D1-5ns, D1+5ns]. Starting from D1, the search splits into two paths that run in parallel toward the front and the rear of the lookup table, that is, the delay parameters are searched in the order n=D1+Δ, n=D1-Δ, n=D1+2Δ, n=D1-2Δ, and so on. Each time a delay parameter is looked up, the data or the clock is delayed accordingly.
When a change in the data interaction condition is detected (the interaction state changes from normal to error or from error to normal), both paths stop searching, and the delay parameter at the current position in the lookup table is recorded as d1.
Similarly, once the coarse lookup table has identified D2 as the other boundary position of the data, search strategy 3 is adopted again. The search range of the fine lookup table is set to [D2-5ns, D2+5ns]. Starting from D2, the search splits into two paths that run in parallel toward the front and the rear of the lookup table, that is, the delay parameters are searched in the order n=D2+Δ, n=D2-Δ, n=D2+2Δ, n=D2-2Δ, and so on. Each time a delay parameter is looked up, the data or the clock is delayed accordingly.
When a change in the data interaction condition is detected (the interaction state changes from normal to error or from error to normal), both paths stop searching, and the delay parameter at the current position in the lookup table is recorded as d2.
The delay value delay_l between the optimal sampling position, at the middle of the data, and the current sampling point is then calculated and recorded as the optimal delay value of the bus. The overall delay adjustment is shown schematically in Figs. 4 to 7.
|d2|≥|d1|>0,delay_l∈R
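The fine refinement of step 5 can be sketched as below. Several points are assumptions rather than statements from the text: `link_ok` is a hypothetical callback that applies a delay in picoseconds and reports whether the training sequence is received correctly, a "change in the interaction condition" is taken to mean a result that differs from the result at the coarse boundary itself, and delay_l is taken to be the arithmetic mean of d1 and d2. The 5 ns window and 78 ps step are the values given in the description and in claim 3.

```python
from typing import Callable, Optional

FINE_STEP_PS = 78        # fine lookup table step (claim 3)
FINE_WINDOW_PS = 5_000   # search range is [D - 5 ns, D + 5 ns]


def refine_boundary(link_ok: Callable[[int], bool], coarse_ps: int) -> Optional[int]:
    """Return the first fine delay near coarse_ps where the link state flips."""
    baseline = link_ok(coarse_ps)  # state at the coarse boundary estimate
    max_k = FINE_WINDOW_PS // FINE_STEP_PS
    for k in range(1, max_k + 1):
        # Alternate D + k*step, D - k*step, as in search strategy 3.
        for n in (coarse_ps + k * FINE_STEP_PS, coarse_ps - k * FINE_STEP_PS):
            if link_ok(n) != baseline:
                return n           # both paths stop at the first change
    return None                    # no change of state inside the window


def best_delay_low_speed(d1: int, d2: int) -> float:
    """Assumed formula: delay_l is the midpoint of the refined boundaries."""
    return (d1 + d2) / 2.0
```

As a usage example, suppose the link works for delays from -3100 ps to 6900 ps and the coarse pass reported D1 = -5000 ps and D2 = 10000 ps. Refinement then yields d1 = -3050 ps and d2 = 6880 ps, and the midpoint is 1915 ps.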
Step 6: after the optimal delay value has been determined, the FPGA delays the data or the clock according to that value and then transmits the valid data. The specific delay operation is as follows:
when the current optimal delay value is positive (delay > 0), the clock is delayed by delay ns or ps;
when the current optimal delay value is negative (delay < 0), the data is delayed by |delay| ns or ps.
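The sign convention of step 6 maps directly onto a small helper. This is a sketch with hypothetical names; the text does not say what happens when the optimal delay is exactly zero, so the "no adjustment" branch is an assumption.

```python
from typing import Tuple


def delay_action(best_delay: float) -> Tuple[str, float]:
    """Map a signed optimal delay to (signal to delay, amount of delay)."""
    if best_delay > 0:
        return ("clock", best_delay)   # positive value: delay the clock
    if best_delay < 0:
        return ("data", -best_delay)   # negative value: delay the data by |delay|
    return ("none", 0.0)               # assumption: zero means no adjustment
```

The unit of the returned amount (ns or ps) is whatever unit the lookup table uses.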
The invention takes into account how the FPGA's operating temperature, placement and routing, and other environmental conditions affect the transmission delay of the chip circuitry, and searches in parallel for the optimal delay parameter of the current bus by using the coarse lookup table and the fine lookup table iteratively. Compared with conventional methods for data interaction between an FPGA and an external bus, the pre-training-based mechanism improves the reliability of communication between the FPGA and the external bus and reduces its error rate. The method is also adaptable: it applies to buses with different interface rates, and by adaptively selecting between the coarse and fine lookup tables it finds the optimal delay value of the current bus quickly and accurately.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims (10)

1. A pre-training-based method for data interaction between an FPGA and an external bus, characterized by comprising the following steps:
step 1, pre-storing two lookup tables of delay parameters in the FPGA, wherein one is a coarse lookup table, used for a preliminary coarse search on a low-speed bus, and the other is a fine lookup table, used for the optimal delay search on a high-speed bus or for a further fine search on the basis of the coarse search on a low-speed bus; the search step is Δ, and the coarse lookup table and the fine lookup table differ in search step;
step 2, in the initial condition with no delay applied, i.e. with the delay parameter n equal to 0, first transmitting a known training sequence between the FPGA and the external bus and observing the interaction; if the communication is initially in an error state, going to step 3; if the communication is initially in a normal state, going to step 4;
step 3, when the communication between the FPGA and the external bus is in an error state, adopting search strategy 1: setting the search range of the lookup table to [-clk, clk], where clk is the interface operating frequency, and, starting from the middle position of the lookup table, i.e. delay parameter n=0, splitting into two paths that search in parallel toward the front and the rear of the lookup table, i.e. searching the delay parameters in the order n=Δ, n=-Δ, n=2Δ, n=-2Δ, and so on; each time a delay parameter is looked up, delaying the data or the clock accordingly;
when one path detects that data is exchanged normally, that path stops searching, and the delay parameter at the current position in the lookup table is recorded as D1; the other path searches from the end of the lookup table toward D1, stops once it finds a delay at which data can be exchanged normally, and records the delay parameter at the current position in the lookup table as D2; D1 and D2 are the distances between the front and rear boundary positions of the data and the current sampling clock edge;
if the current communication bus is a high-speed bus, search strategy 1 uses the fine lookup table and the optimal delay parameter is determined directly; if the current communication bus is a low-speed bus, search strategy 1 uses the coarse lookup table, the boundary positions of the data must be further confirmed precisely using the fine lookup table, and the method goes to step 5;
step 4, when the communication between the FPGA and the external bus is in a normal state, adopting search strategy 2: setting the search range of the lookup table to [-clk, clk] and, starting from the middle position, i.e. delay parameter n=0, splitting into two paths that search in parallel toward the front and the rear of the lookup table, i.e. searching the delay parameters in the order n=Δ, n=-Δ, n=2Δ, n=-2Δ, and so on;
when one path detects an interaction error between the FPGA and the external bus, that path stops searching, and the delay parameter at the current position in the lookup table is recorded as D1; the other path continues searching, stops once it finds a delay at which the interaction fails, and records the delay parameter at the current position in the lookup table as D2;
if the current communication bus is a high-speed bus, search strategy 2 uses the fine lookup table and the optimal delay value is determined directly; if the current communication bus is a low-speed bus, search strategy 2 uses the coarse lookup table, the boundary positions of the data must be further confirmed precisely using the fine lookup table, and the method goes to step 5;
step 5, when the coarse lookup table has identified D1 as a data boundary position, adopting search strategy 3: setting the search range of the fine lookup table to [D1-5ns, D1+5ns] and, starting from D1, splitting into two paths that search in parallel toward the front and the rear of the lookup table, i.e. searching the delay parameters in the order n=D1+Δ, n=D1-Δ, n=D1+2Δ, n=D1-2Δ, and so on; each time a delay parameter is looked up, delaying the data or the clock accordingly;
when a change in the data interaction condition is detected, both paths stop searching, and the delay parameter at the current position in the lookup table is recorded as d1;
when the coarse lookup table has identified D2 as the other boundary position of the data, likewise adopting search strategy 3: setting the search range of the fine lookup table to [D2-5ns, D2+5ns] and, starting from D2, splitting into two paths that search in parallel toward the front and the rear of the lookup table, i.e. searching the delay parameters in the order n=D2+Δ, n=D2-Δ, n=D2+2Δ, n=D2-2Δ, and so on; each time a delay parameter is looked up, delaying the data or the clock accordingly;
when a change in the data interaction condition is detected, both paths stop searching, and the delay parameter at the current position in the lookup table is recorded as d2;
calculating the optimal delay value of the bus from d1 and d2;
step 6, after the optimal delay value has been determined, the FPGA delays the data or the clock according to the optimal delay value and then transmits the valid data.
2. The method for data interaction between an FPGA and an external bus based on pre-training according to claim 1, wherein the low-speed bus is a bus with an interface working frequency not greater than 100MHz, and the high-speed bus is a bus with an interface working frequency greater than 100 MHz.
3. The pretrained FPGA and external bus data interaction method of claim 1, wherein the coarse look-up table has a search step of Δ=5ns and the fine look-up table has a search step of Δ=78ps.
4. The pretrained FPGA and external bus data interaction method of claim 1, wherein when performing a fine lookup in the low speed bus, the fine lookup table has a lookup range of [ D-5ns, d+5ns ], where D is the coarse lookup result of the coarse lookup table; when the best delay search is performed in the high-speed bus, the search range of the fine search table is [ -clk, clk ].
5. The pre-training-based method for data interaction between an FPGA and an external bus according to any one of claims 1 to 4, wherein the FPGA delays the data or the clock according to the currently looked-up delay parameter n, where n is a real number, and the data interaction condition is observed; the specific delay operation is as follows: when the delay parameter is positive, n > 0, the clock is delayed by n ns or ps; when the delay parameter is negative, n < 0, the data is delayed by |n| ns or ps.
6. The pre-training-based method for data interaction between an FPGA and an external bus according to claim 5, wherein in step 2, if the external bus is a parallel bus, the received data is compared directly with the known training sequence, and if more than 4 consecutive bytes are consistent with the training sequence, the communication between the FPGA and the external parallel bus is considered to be in a normal state, otherwise it is in an error state; if the external bus is a serial bus, serial-to-parallel conversion is performed immediately after the data is received, the converted data is compared with the known training sequence, and if more than 4 consecutive bytes are consistent with the training sequence, the communication between the FPGA and the external serial bus is considered to be in a normal state, otherwise it is in an error state.
7. The pre-training-based method for data interaction between an FPGA and an external bus according to claim 6, wherein in step 3, if the current communication bus is a high-speed bus, search strategy 1 uses the fine lookup table, and once the data boundary values D1 and D2 have been found, the delay value delay_h between the optimal sampling position, at the middle of the data, and the current sampling clock edge is calculated and recorded as the optimal delay value;
if the current communication bus is a low-speed bus, search strategy 1 uses the coarse lookup table, and after the data boundary values D1 and D2 have been found, the boundary positions of the data must be further confirmed precisely using the fine lookup table, and the method goes to step 5.
8. The pre-training-based method for data interaction between an FPGA and an external bus according to claim 6, wherein in step 4,
if the current communication bus is a high-speed bus, search strategy 2 uses the fine lookup table, and once the data boundary values D1 and D2 have been found, the delay value delay_h between the optimal sampling position, at the middle of the data, and the current sampling point is calculated and recorded as the optimal delay value;
if the current communication bus is a low-speed bus, search strategy 2 uses the coarse lookup table, and after the data boundary values D1 and D2 have been found, the boundary positions of the data must be further confirmed precisely using the fine lookup table, and the method goes to step 5.
9. The pre-training-based method for data interaction between an FPGA and an external bus according to claim 7 or 8, wherein in step 5, the change in the data interaction condition means that the interaction state changes from normal to error or from error to normal; and the optimal delay value of the bus is calculated from d1 and d2 as follows:
calculating the delay value delay_l between the optimal sampling position, at the middle of the data, and the current sampling point, and recording delay_l as the optimal delay value of the bus.
10. The pre-training-based method for data interaction between an FPGA and an external bus according to claim 9, wherein the delay operation in step 6 is as follows:
when the current optimal delay value is positive, i.e. delay > 0, the clock is delayed by delay ns or ps;
when the current optimal delay value is negative, i.e. delay < 0, the data is delayed by |delay| ns or ps.
CN202210564575.0A 2022-05-23 2022-05-23 Pre-training-based FPGA and external bus data interaction method Active CN114896186B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210564575.0A CN114896186B (en) 2022-05-23 2022-05-23 Pre-training-based FPGA and external bus data interaction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210564575.0A CN114896186B (en) 2022-05-23 2022-05-23 Pre-training-based FPGA and external bus data interaction method

Publications (2)

Publication Number Publication Date
CN114896186A CN114896186A (en) 2022-08-12
CN114896186B true CN114896186B (en) 2023-09-26

Family

ID=82724049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210564575.0A Active CN114896186B (en) 2022-05-23 2022-05-23 Pre-training-based FPGA and external bus data interaction method

Country Status (1)

Country Link
CN (1) CN114896186B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101873196A (en) * 2010-05-27 2010-10-27 北京经纬恒润科技有限公司 Method, system and interface card for transmitting data at high speed
CN104571264A (en) * 2014-12-29 2015-04-29 大唐移动通信设备有限公司 Delay adjusting method and delay adjusting device
CN106788951A (en) * 2016-11-30 2017-05-31 中国科学院长春光学精密机械与物理研究所 A kind of high speed source synchronization LVDS interface intialization phase alignment schemes
CN107491407A (en) * 2017-07-03 2017-12-19 西安空间无线电技术研究所 Self-adapting high-speed Transmission system based on SERDES in FPGA
CN108646984A (en) * 2018-05-16 2018-10-12 华为技术有限公司 A kind of DQS location regulation methods and device
US10659215B1 (en) * 2018-09-19 2020-05-19 Xilinx, Inc. Training and tracking of DDR memory interface strobe timing
CN111506527A (en) * 2020-04-13 2020-08-07 天津飞腾信息技术有限公司 Digital high-speed parallel bus adaptive interval correction method, device and storage medium
CN112511163A (en) * 2020-11-16 2021-03-16 西安电子工程研究所 AD input FPGA source synchronization parameter automatic calculation method based on correct data boundary
CN113078909A (en) * 2021-03-23 2021-07-06 汕头市超声检测科技有限公司 Multichannel high-speed serial LVDS data sorting method and circuit based on FPGA
CN113630296A (en) * 2021-08-31 2021-11-09 中国电子科技集团公司第三十八研究所 Automatic LVDS transmission delay window testing method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10148416B2 (en) * 2016-09-02 2018-12-04 Intel Corporation Signal phase optimization in memory interface training


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
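Design of an FPGA-Based Adaptive Matching Method for SDRAM Timing Differences (基于FPGA的SDRAM时序差异自适应适配方法设计); Lü Yanchuan (吕岩川); China Master's Theses Full-text Database (Electronic Journal); I137-110 *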

Also Published As

Publication number Publication date
CN114896186A (en) 2022-08-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Li Yao

Inventor after: Lv Zhiwu

Inventor after: Fan Zhouhua

Inventor before: Li Yao

Inventor before: Lv Zhiwu

GR01 Patent grant