CN117833918A - Self-adaptive high-speed parallel data sampling device - Google Patents

Publication number
CN117833918A
Authority
CN
China
Prior art keywords
delay
parallel data
speed parallel
data
control module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311616859.0A
Other languages
Chinese (zh)
Inventor
邓强
王松明
徐波
陆正宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 10 Research Institute
Original Assignee
CETC 10 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 10 Research Institute
Priority to CN202311616859.0A
Publication of CN117833918A
Legal status: Pending

Abstract

The application discloses a self-adaptive high-speed parallel data sampling device. In a training mode, a delay control module receives a first control instruction, executes an automatic delay adjustment algorithm, and sends a second control instruction to a delay unit. The delay unit selects the corresponding tap according to the second control instruction and delays the received high-speed parallel data by the required amount. The device thereby adjusts the delay between data and clock, determines the sampling point, ensures that the high-speed parallel data is sampled correctly, and improves the transmission rate.

Description

Self-adaptive high-speed parallel data sampling device
Technical Field
The application relates to the technical field of high-speed parallel data sampling, in particular to a self-adaptive high-speed parallel data sampling device.
Background
As the data rates of chip-level interconnect interfaces continue to rise, digital circuit interfaces become prone to metastability, and unequal data path delays can also cause data misalignment.
Multi-bit data deviates from the reference clock because of incomplete device timing constraints, mismatched board-level routing, inconsistent delays between chip pins, and similar problems. The data therefore moves away from the optimal sampling point, and transmission errors can occur when large volumes of data are transferred. Fig. 1 shows a schematic diagram of a timing deviation in the prior art: because of these factors, the data bits change from an ideal, mutually aligned state to a misaligned state. Fig. 2 shows the phase relationship of clock and data in the source synchronous transmission mode in the prior art. Because of process corner variation and data/clock jitter, the phase between data and clock is not ideal, and the offset (skew) between them can be normalized. By setting the respective timing constraints, each offset can be bounded so that the skew between data and clock is kept within one clock cycle.
Therefore, how to solve the problems of inconsistent data delay, complex timing constraints, difficult debugging, and low transmission rate in the high-speed interface of a chip has become one of the directions studied by those skilled in the art.
Disclosure of Invention
To overcome the shortcomings of the prior art, the present application provides a self-adaptive high-speed parallel data sampling device. A delay unit is added to the high-speed interface of the chip, and the delay of each data bit relative to the channel-associated clock is actively adjusted by selecting different taps of the delay unit. This ensures that each data bit is sampled by the channel-associated clock at the center point between its transitions, where the timing margin is largest, so that high-speed data is transmitted accurately and the reliability of the chip function is improved.
The purpose of the present application is achieved through the following technical solution:
In a first aspect, the present application proposes an adaptive high-speed parallel data sampling device comprising a delay unit and a delay control module connected to the delay unit, the delay unit comprising a plurality of taps;
in a training mode, the delay control module receives a first control instruction, executes an automatic delay adjustment algorithm, and sends a second control instruction to the delay unit;
the delay unit selects the corresponding tap according to the second control instruction and, once the tap is selected, delays the received high-speed parallel data by the required amount.
In a possible implementation, the device further comprises a multiplexer connected to the delay unit; the multiplexer receives a third control signal and thereby isolates the delay unit.
In a possible implementation, the device further comprises a register connected to the multiplexer; the register is clocked by the channel-associated clock, samples the isolation result of the multiplexer, and sends the first control instruction to the delay control module.
In a possible implementation, the device further comprises a FIFO module connected to the register; the FIFO module operates under the system clock sysclk and isolates the channel-associated clock of the register from the chip clock.
In a possible implementation, the device further comprises an output selection multiplexer connected to the FIFO module; under the control of the third control instruction, the output selection multiplexer receives what the FIFO module sends and selects od_bus1 or od_bus2 as the output channel.
In a possible implementation, the dclt_lce interface of the delay control module switches the delay control module to a manual mode, in which external logic adjusts the delay of the high-speed interface and the delay value is input through the dctl_ldi interface.
In a possible implementation, the delay control module holds the tap of the delay unit fixed in the normal operating mode.
In a possible implementation, in an automatic mode the device controls the delay amount through the delay control module, reads the sampled values, identifies the leading and trailing edges of the data, and calculates the optimal delay control amount.
In a possible implementation, in the manual mode the dclt_lce interface of the delay control module is asserted and the desired delay value is input through the dctl_ldi interface, so that the delay control module changes the delay value of the delay unit and reception of the delay control amount is completed.
The main solution and the further options of the present application may be freely combined to form a plurality of solutions, all of which may be adopted and claimed by the present application; non-conflicting options may likewise be freely combined with one another. Many such combinations will be apparent to those skilled in the art upon review of the present application. They are not listed exhaustively here, and this is not to be construed as limiting the scope of the invention.
In the self-adaptive high-speed parallel data sampling device disclosed in the present application, the delay control module receives a first control instruction in the training mode, executes the automatic delay adjustment algorithm, and sends a second control instruction to the delay unit. The delay unit selects the corresponding tap according to the second control instruction and delays the received high-speed parallel data by the required amount. The delay between data and clock is thereby adjusted and the sampling point determined, which ensures that the high-speed parallel data is sampled correctly and improves the transmission rate.
Drawings
Fig. 1 shows a schematic diagram of a timing deviation in the prior art.
Fig. 2 shows the phase relationship of clock and data in the source synchronous transmission mode in the prior art.
Fig. 3 shows a schematic structural diagram of an adaptive high-speed parallel data sampling device according to an embodiment of the present application.
Fig. 4 is a schematic diagram of transmission timing in an ideal case as proposed in the present application.
Fig. 5 shows a schematic diagram of a normal operation mode of the delay control module according to an embodiment of the present application.
Fig. 6 shows a schematic diagram of an automatic delay algorithm according to an embodiment of the present application.
Fig. 7 shows a schematic diagram of an automatic mode of an adaptive device according to an embodiment of the present application.
Fig. 8 shows a schematic diagram of a manual mode of the adaptive device according to an embodiment of the present application.
Fig. 9 shows a state transition diagram of the delay control module in the embodiment of the present application.
Detailed Description
Other advantages and effects of the present application will become apparent to those skilled in the art from the present disclosure, when the following description of the embodiments is taken in conjunction with the accompanying drawings. The present application may be embodied or carried out in other specific embodiments, and the details of the present application may be modified or changed from various points of view and applications without departing from the spirit of the present application. It should be noted that the following embodiments and features in the embodiments may be combined with each other without conflict.
All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
In the prior art, incomplete device timing constraints, mismatched board-level routing, inconsistent delays between chip pins, and similar problems cause each data bit to deviate from the reference clock. The data therefore moves away from the optimal sampling point, and transmission errors can occur when large volumes of data are transferred. Fig. 1 shows a schematic diagram of a timing deviation in the prior art: because of these factors, the data bits change from an ideal, mutually aligned state to a misaligned state. Fig. 2 shows the phase relationship of clock and data in the source synchronous transmission mode in the prior art. Because of process corner variation and data/clock jitter, the phase between data and clock is not ideal, and the offset (skew) between them can be normalized. By setting the respective timing constraints, each offset can be bounded so that the skew between data and clock is kept within one clock cycle.
Therefore, to solve the problems of inconsistent data delay, complex timing constraints, difficult debugging, and low transmission rate in the high-speed interface of a chip, the embodiment of the present application provides a self-adaptive high-speed parallel data sampling device. The device automatically adjusts the delay between data and clock, determines the optimal sampling point, ensures that the high-speed parallel data is sampled correctly and accurately, and improves the transmission rate. It can be packaged as an IP core and reused across ASIC chips or FPGAs, and the related techniques and algorithms are readily portable. The device is described in detail below.
Referring to fig. 3, which shows a schematic structural diagram of a self-adaptive high-speed parallel data sampling device according to an embodiment of the present application, the device can be applied to an ASIC chip interface, and the related techniques and algorithms can be extended to FPGA high-speed parallel interface applications. The device comprises a delay unit and a delay control module connected to the delay unit, and the delay unit comprises a plurality of taps.
In a training mode, the delay control module receives a first control instruction, executes an automatic delay adjustment algorithm, and sends a second control instruction to the delay unit.
The delay unit selects the corresponding tap according to the second control instruction and, once the tap is selected, delays the received high-speed parallel data by the required amount.
The chip exchanges data with the outside through an input parallel interface. The input consists of the channel-associated clock dclk_in and the 16-bit high-speed parallel data signal data_in[15:0]; the external data and clock are received over the bus and passed into the module for processing. Fig. 4 is a schematic diagram of the ideal transmission timing proposed in the present application. In SDR source synchronous mode the clock is center-aligned with the data: the sampling edge is the rising clock edge, and the data changes on the falling clock edge. The channel-associated clock then samples the data at the optimal sampling point, and signal reliability is highest.
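The center-aligned timing described above can be illustrated with a small margin calculation. This is a minimal sketch under stated assumptions, not part of the application: the function name, the sign convention for skew, and the example numbers are hypothetical, and register setup/hold requirements are ignored.

```python
def sampling_margins_ps(period_ps: int, skew_ps: int) -> tuple[int, int]:
    """Return (setup_margin, hold_margin) in ps for a center-aligned SDR link.

    Assumption: data changes on the falling clock edge and is sampled on the
    rising edge, so both margins equal half a period when the skew is zero.
    A positive skew_ps means the data arrives late relative to the clock.
    """
    half = period_ps // 2
    return half - skew_ps, half + skew_ps


# Hypothetical 400 MHz interface (2500 ps period) with 300 ps of late data.
setup, hold = sampling_margins_ps(2500, 300)
worst_case = min(setup, hold)
```

With zero skew both margins are equal, which is the sense in which the rising edge is the optimal sampling point; any skew shrinks one margin by the amount it grows the other.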
The delay unit is built from delay cells provided by the process library and consists of cascaded fixed-delay standard cells. It has n taps in total (n ≤ 64), each tap adds a delay of X ps (X ≤ 100), and the maximum delay spans the entire clock period. The interface data bit width is N ≥ 1; the delay unit contains multiple input and output channels, so that each bit of the transmitted data can be delayed.
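A tapped delay line of this kind can be modeled in a few lines. The sketch below is illustrative only: the application gives just the bounds n ≤ 64 and X ≤ 100 ps, so the concrete values, the function names, and the convention that tap 0 adds no delay are all assumptions.

```python
N_TAPS = 64        # assumed; the source only states n <= 64
TAP_STEP_PS = 50   # assumed; the source only states X <= 100 ps


def tap_delay_ps(tap: int, step_ps: int = TAP_STEP_PS, n_taps: int = N_TAPS) -> int:
    """Delay contributed by the selected tap of a cascade of fixed-delay cells."""
    if not 0 <= tap < n_taps:
        raise ValueError(f"tap must be in 0..{n_taps - 1}")
    return tap * step_ps


def covers_clock_period(period_ps: int, step_ps: int = TAP_STEP_PS,
                        n_taps: int = N_TAPS) -> bool:
    """True if the largest selectable delay spans at least one clock period."""
    return tap_delay_ps(n_taps - 1, step_ps, n_taps) >= period_ps
```

Under these assumed values the line reaches 3150 ps, enough to traverse a 2500 ps clock period but not a 5000 ps one, which is the kind of check needed when choosing n and X for a given interface rate.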
The delay control module has two operating modes: a training mode and a normal operating mode. The training mode runs first, followed by the normal operating mode. In the training mode the delay control module samples a set of training data, runs the automatic delay adjustment algorithm on it, and issues the second control instruction delay_ctl. This signal selects the corresponding tap in the delay unit to produce the required delay, so that the data lies at the optimal sampling point after passing through the delay unit.
Fig. 5 shows a schematic diagram of the normal operating mode of the delay control module according to an embodiment of the present application. In this mode the delay control module is in a standby state and holds the tap of the delay unit fixed. After delay adjustment, dctl_sel selects whether the original data or the delay-adjusted data is received; a register samples the selected data, the FIFO unit buffers it, and the corresponding output channel is then selected to send the data into the system.
In addition, the delay control module selects different tap outputs through the control signal delay_ctl, producing data delays of different sizes. Because the data is 16 bits wide, 16 data delay lanes are required. To place the data at the optimal sampling point, the delay control module determines the skew between clock and data and adjusts the output delay of the delay unit, changing the phase between data and clock until the clock samples the data at the optimal point.
Referring to fig. 6, which shows a schematic diagram of the automatic delay algorithm according to an embodiment of the present application, the required delay is determined by the automatic delay algorithm. A set of training data data_dlyn, with the same frequency as the clock and opposite phase (the source synchronous clock is clk), is applied to the data_in port. At the start of training, the dctl_sel control signal selects the original training data, which the delay control module registers and samples to obtain the current data value. The delay control signal delay_ctl is then increased step by step, so that the data delay of the delay unit increases gradually.
The dctl_sel signal is then switched to the delayed training data channel, and the delay control module registers and samples the data. When the sampled data toggles, the current value of the delay control signal is recorded as dly1. The data delay continues to increase, and when the sampled data toggles again, the current value of the delay control signal is recorded as dly3. The delay control value dly2 required for the optimal sampling point of the current signal is calculated from dly1 and dly3, and the skew between data and clock in the current chip operating environment can be roughly estimated from the step delay of the delay unit.
After delay training is complete and the optimal sampling point of the data has been determined, the delay control signal is set to dly2 and the self-adaptive high-speed parallel data sampling device enters the normal operating mode. The tap of the delay unit is held fixed and the delayed data lies at the optimal sampling point, achieving high-speed, reliable data transmission.
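The training sweep can be sketched as a short behavioral simulation. This is a hedged illustration rather than the applicant's implementation: the text only says that dly2 is calculated from dly1 and dly3, so taking the midpoint is an assumption, and the waveform model, function names, and numeric values are hypothetical.

```python
def sample(tap: int, skew_ps: int, period_ps: int, step_ps: int) -> int:
    """Value captured at a rising clock edge for antiphase training data.

    Model assumption: the undelayed training data is the inverse of the
    clock (low in the first half period, high in the second half).
    """
    delay = skew_ps + tap * step_ps
    phase = (-delay) % period_ps
    return 1 if 2 * phase >= period_ps else 0


def train(sample_at, n_taps: int):
    """Sweep delay_ctl upward and return (dly1, dly2, dly3), or None on failure."""
    edges = []
    previous = sample_at(0)
    for tap in range(1, n_taps):
        current = sample_at(tap)
        if current != previous:
            edges.append(tap)          # first toggle -> dly1, second -> dly3
            if len(edges) == 2:
                break
        previous = current
    if len(edges) < 2:
        return None                    # no two edges within the tap range
    dly1, dly3 = edges
    dly2 = (dly1 + dly3) // 2          # assumed midpoint rule
    return dly1, dly2, dly3


# Hypothetical link: 2500 ps period, 50 ps taps, 300 ps initial skew.
result = train(lambda tap: sample(tap, 300, 2500, 50), 64)
```

For these assumed numbers the two toggles are 25 taps apart, and 25 × 50 ps = 1250 ps is half the clock period, which shows how the step delay gives a rough estimate of the timing relationship.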
To reduce clock-to-data skew and to counter the effects of clock jitter and inconsistent board-level routing, the most effective approach is to add a delay unit to the high-speed interface of the chip. Selecting different taps of the delay unit actively adjusts the delay of each data bit relative to the channel-associated clock, so that each data bit is sampled at the center point between its transitions, where the timing margin is largest. High-speed data is thus transmitted accurately, and the reliability of the chip function is improved.
The dclt_lce interface of the delay control module switches the delay control module to a manual mode. External logic then adjusts the delay of the high-speed interface, with the delay value input through the dctl_ldi interface, so that the delay can be controlled by manual input.
In addition, the adaptive high-speed parallel data sampling device provided by the embodiment of the present application further comprises a multiplexer connected to the delay unit; the multiplexer receives a third control signal and thereby isolates the delay unit.
The device also comprises a register connected to the multiplexer. The register is clocked by the channel-associated clock, samples the isolation result of the multiplexer, and sends the first control instruction to the delay control module.
The multiplexer is controlled by the dctl_sel signal to isolate the delay unit, and the result is sampled by the register.
The device also comprises a FIFO module connected to the register. The FIFO module operates under the system clock sysclk and isolates the channel-associated clock of the register from the chip clock.
The delay control module operates under the channel-associated clock dclk_in, while the internal circuit operates under the system clock sysclk, so data may be registered across clock domains. The FIFO module isolates the two clock domains to avoid metastability.
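The application does not say how the FIFO crosses from dclk_in to sysclk. A common technique in asynchronous FIFOs, shown here purely as an assumption, is to pass the read and write pointers between domains in Gray code so that only one bit changes per increment. The function names and the 6-bit pointer width are hypothetical.

```python
def bin_to_gray(value: int) -> int:
    """Convert a binary pointer value to its Gray-code representation."""
    return value ^ (value >> 1)


def gray_to_bin(gray: int) -> int:
    """Recover the binary pointer value from a Gray-code word."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


# Assumed 6-bit pointer: 5 address bits for 32 entries plus one wrap bit.
POINTER_VALUES = 64
single_bit_steps = all(
    bin(bin_to_gray(i) ^ bin_to_gray((i + 1) % POINTER_VALUES)).count("1") == 1
    for i in range(POINTER_VALUES)
)
```

Because consecutive pointer values differ in exactly one bit, a pointer sampled in the other clock domain is either the old or the new value, never an unrelated one, which is why this encoding is often paired with a two-stage synchronizer.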
The device also comprises an output selection multiplexer connected to the FIFO module. Under the control of the third control instruction, the output selection multiplexer receives what the FIFO module sends and selects od_bus1 or od_bus2 as the output channel.
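The receive path from the input multiplexer through the FIFO to the output buses can be pictured with a small behavioral model. This is a rough sketch under assumptions: the class and method names are hypothetical, the polarity of dctl_sel (1 selects the delayed data) is assumed, and the separate `bus_sel` argument is an illustration device, since the source attributes the output selection to the third control instruction.

```python
from collections import deque


class ReceivePath:
    """Behavioral model: input mux -> register/FIFO -> output select."""

    def __init__(self, depth: int = 32):
        self.depth = depth
        self.fifo = deque()
        self.od_bus1 = []
        self.od_bus2 = []

    def capture(self, raw: int, delayed: int, dctl_sel: int) -> None:
        """dclk_in side: pick raw or delayed data and buffer one 16-bit word."""
        word = delayed if dctl_sel else raw
        if len(self.fifo) >= self.depth:
            raise OverflowError("FIFO full")
        self.fifo.append(word & 0xFFFF)

    def drain(self, bus_sel: int):
        """sysclk side: pop one word and route it to od_bus1 or od_bus2."""
        if not self.fifo:
            return None
        word = self.fifo.popleft()
        (self.od_bus2 if bus_sel else self.od_bus1).append(word)
        return word


path = ReceivePath()
path.capture(raw=0x1234, delayed=0xABCD, dctl_sel=1)
path.capture(raw=0x1234, delayed=0xABCD, dctl_sel=0)
first = path.drain(bus_sel=0)
second = path.drain(bus_sel=1)
```

The model ignores timing entirely; it only shows the order of the selection steps and that the FIFO decouples the capture side from the drain side.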
In a possible implementation, in an automatic mode the adaptive high-speed parallel data sampling device provided by the embodiment of the present application controls the delay amount through the delay control module, reads the sampled values, identifies the leading and trailing edges of the data, and calculates the optimal delay control amount.
When the optimal sampling point is trained through the internal delay unit, two working modes are available, an automatic mode and a manual mode, which are switched by the corresponding control signals. Referring to fig. 7, which shows a schematic diagram of the automatic mode of the adaptive device according to the embodiment of the present application, the FIFO and the output channel selection module are not used in this mode. The delay control module automatically controls the delay amount, reads the sampled values, identifies the leading and trailing edges of the data, and calculates the optimal delay control amount.
In another possible implementation, fig. 8 shows a schematic diagram of the manual mode of the adaptive device according to the embodiment of the present application. In the manual mode the signals used by the automatic mode are masked, the dclt_lce interface of the delay control module is asserted, and the desired delay value is input through the dctl_ldi interface, so that the delay control module changes the delay value of the delay unit and reception of the delay control amount is completed.
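The manual override can be summarized as a simple selection rule. The sketch below is an assumption-laden illustration: the function name, the active-high reading of dclt_lce, and the 64-tap range check are hypothetical and are not specified in the application.

```python
def next_delay_ctl(current: int, dclt_lce: int, dctl_ldi: int,
                   n_taps: int = 64) -> int:
    """Return the delay_ctl value for the next cycle.

    Assumption: when dclt_lce is asserted (manual mode), the externally
    supplied dctl_ldi replaces the trained value; otherwise the current
    value is held.
    """
    if dclt_lce:
        if not 0 <= dctl_ldi < n_taps:
            raise ValueError(f"dctl_ldi must be in 0..{n_taps - 1}")
        return dctl_ldi
    return current


trained = 32                                   # hypothetical training result
held = next_delay_ctl(trained, dclt_lce=0, dctl_ldi=10)
overridden = next_delay_ctl(trained, dclt_lce=1, dctl_ldi=10)
```

A rule of this shape lets external logic probe the sampling window tap by tap during debugging without disturbing the trained value when the override is released.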
In addition, to achieve a delay-adaptive, trainable IP core for the high-speed interface: first, the delay-adaptive and trainable IP core circuit is implemented in Verilog, so that the delay-adaptive IP adjusts the interface data delay according to the actual chip operating environment; second, the FIFO is an asynchronous or synchronous FIFO with configurable bit width and depth, and the data path of the delay unit can be selected through parameter configuration; third, data is routed to an output channel inside the chip according to actual needs; fourth, the delay control module implements the automatic delay adjustment algorithm and multi-mode switching as a state machine; finally, the simulation is compiled with VCS, waveforms are inspected with Verdi, and synthesis is completed with Design Compiler.
For the hardware implementation of the high-speed interface delay-adaptive and trainable IP core, the circuit structure of fig. 3 is implemented in the Verilog language according to the function definition and the algorithm. The FIFO is a configurable asynchronous FIFO with a default size of 32 × 16 bit. Its data bit width, storage depth, and similar properties are highly parameterized in the Verilog source, so the FIFO can be recalculated and reconfigured according to the clock frequencies and read/write rates on the two sides of the actual interface. The delay unit has a plurality of data channels, and the corresponding parameters can be set according to the interface width of the actual system.
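The application does not give the formula used to recalculate the FIFO depth. A common rule of thumb, offered here only as an assumption, sizes the FIFO for the words that accumulate while a burst is written faster than it is read. The function name, the `margin` parameter, and the example rates are hypothetical.

```python
def min_fifo_depth(burst_words: int, write_mhz: int, read_mhz: int,
                   margin: int = 0) -> int:
    """Estimate the FIFO depth needed to absorb one write burst.

    Assumption: one word is written per write clock during the burst and
    one word is read per read clock for the whole duration. `margin` can
    cover synchronizer latency, which this estimate otherwise ignores.
    """
    if read_mhz >= write_mhz:
        return margin                      # reader keeps up; no backlog builds
    words_read = burst_words * read_mhz // write_mhz
    return burst_words - words_read + margin


# Hypothetical rates: 128-word burst, 400 MHz dclk_in, 300 MHz sysclk.
depth = min_fifo_depth(128, 400, 300)
```

For these assumed rates the estimate is 32 words, which would match the default configuration if 32 × 16 bit is read as depth × width; other interface rates would call for a different parameter setting.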
The delay control module uses a finite state machine to carry out the automatic delay adjustment algorithm; fig. 9 shows the state transition diagram of the delay control module in the embodiment of the present application. The normal operating mode corresponds to the IDLE state, in which the delay control signal delay_ctl holds the result of the last training or the manually entered value. When the training mode valid signal start is pulled high, the state machine jumps to the training mode initialization state INIT_TAP and clears delay_ctl. It then enters the POS_ONE state, in which it increments delay_ctl while a high level is detected and jumps to the POS_ZERO state when a low level is detected. In POS_ZERO it continues until a high level is detected again, then enters EDGE_IDEN, calculates the position of the optimal sampling point, and sets delay_ctl. ONE_ERR and ZERO_ERR are two error states: if delay_ctl reaches its maximum value in the POS_ONE or POS_ZERO state and no transition edge has been detected, the state machine enters the corresponding error state, resets delay_ctl, and returns to the IDLE state.
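The state transitions described above can be sketched as a cycle-by-cycle model. This is a minimal illustration under assumptions, not the Verilog of the application: the class and helper names are hypothetical, the midpoint rule in EDGE_IDEN is assumed (the text only says the optimal point is calculated), and one state transition per call to `step` is a modeling choice.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    INIT_TAP = auto()
    POS_ONE = auto()
    POS_ZERO = auto()
    EDGE_IDEN = auto()
    ONE_ERR = auto()
    ZERO_ERR = auto()


class DelayControlFsm:
    def __init__(self, max_tap: int = 63):
        self.state = State.IDLE
        self.delay_ctl = 0
        self.max_tap = max_tap
        self.dly1 = 0
        self.dly3 = 0

    def step(self, start: bool, sample: int) -> None:
        """Advance one clock cycle given the start flag and the sampled bit."""
        if self.state is State.IDLE:
            if start:
                self.state = State.INIT_TAP
        elif self.state is State.INIT_TAP:
            self.delay_ctl = 0
            self.state = State.POS_ONE
        elif self.state is State.POS_ONE:
            if sample == 0:                       # first edge found
                self.dly1 = self.delay_ctl
                self.state = State.POS_ZERO
            elif self.delay_ctl == self.max_tap:  # ran out of taps
                self.state = State.ONE_ERR
            else:
                self.delay_ctl += 1
        elif self.state is State.POS_ZERO:
            if sample == 1:                       # second edge found
                self.dly3 = self.delay_ctl
                self.state = State.EDGE_IDEN
            elif self.delay_ctl == self.max_tap:
                self.state = State.ZERO_ERR
            else:
                self.delay_ctl += 1
        elif self.state is State.EDGE_IDEN:
            self.delay_ctl = (self.dly1 + self.dly3) // 2  # assumed midpoint
            self.state = State.IDLE
        else:                                     # ONE_ERR or ZERO_ERR
            self.delay_ctl = 0
            self.state = State.IDLE


def run_training(fsm: DelayControlFsm, sample_at) -> list:
    """Pulse start, then clock the FSM until it returns to IDLE."""
    visited = []
    fsm.step(True, sample_at(fsm.delay_ctl))
    while fsm.state is not State.IDLE:
        visited.append(fsm.state)
        fsm.step(False, sample_at(fsm.delay_ctl))
    return visited
```

As a usage example, driving the model with a hypothetical sampled pattern that is high below tap 20 and from tap 45 upward leaves delay_ctl at 32, while a pattern that never toggles passes through ONE_ERR and returns to IDLE with delay_ctl cleared.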
Compared with the prior art, the embodiment of the application has the following beneficial effects:
First, because the delay control module operates under the channel-associated clock dclk_in while the internal circuit operates under the system clock sysclk, data may be registered across clock domains. To avoid metastability, the FIFO module isolates the input clock from the internal working clock of the chip, which improves the stability of the system.
Second, because the high-speed interface delay-adaptive circuit is functionally complex, the data-clock phase relationships it may encounter are also complex, and comprehensive simulation verification is needed to ensure that the IP core works normally under all combinations. A UVM verification platform is built to supply clock and data conditions for the various cases, measure the functional coverage level, and confirm that the simulation verification reaches the expected target.
The foregoing describes only preferred embodiments of the present application and is not intended to limit it; all modifications, equivalents, and alternatives falling within the spirit and principles of the present application are intended to be covered.

Claims (9)

1. A self-adaptive high-speed parallel data sampling device, characterized by comprising a delay unit and a delay control module connected with the delay unit, wherein the delay unit comprises a plurality of taps;
the delay control module receives a first control instruction in a training mode, executes an automatic delay adjustment algorithm and sends a second control instruction to the delay unit;
and the delay unit selects a corresponding tap according to the second control instruction, and delays the received high-speed parallel data after the corresponding tap is selected so as to realize delay requirements.
2. The adaptive high-speed parallel data sampling device of claim 1, further comprising a multiplexer coupled to the delay unit, the multiplexer receiving a third control signal to complete isolation of the delay unit.
3. The adaptive high-speed parallel data sampling device of claim 2, further comprising a register coupled to the multiplexer, the register configured to sample the isolation result of the multiplexer, receive the associated clock, and send the first control command to the delay control module.
4. The adaptive high-speed parallel data sampling device according to claim 3, further comprising a FIFO module connected to the register, the FIFO module operating under the system clock sysclk and isolating the channel-associated clock of the register from the chip clock.
5. The adaptive high-speed parallel data sampling device according to claim 4, further comprising an output selection multiplexer connected to the FIFO module, the output selection multiplexer receiving the instruction sent by the FIFO module and selecting the output channel to be od_bus1 or od_bus2 under the control of the third control instruction.
6. The adaptive high-speed parallel data sampling device of claim 1, wherein the dclt_lce interface of the delay control module switches the delay control module to a manual mode, in which external logic adjusts the delay of the high-speed interface and the delay is input through the dctl_ldi interface.
7. The adaptive high-speed parallel data sampling device of claim 1, wherein the delay control module holds the tap of the delay unit fixed in a normal operating mode.
8. The adaptive high-speed parallel data sampling device according to claim 1, wherein in an automatic mode the device controls the delay amount through the delay control module, reads the sampled values, and identifies the leading and trailing edges of the data to calculate an optimal delay control amount.
9. The adaptive high-speed parallel data sampling device according to claim 1, wherein in a manual mode the dclt_lce interface of the delay control module is asserted and a desired delay value is input through the dctl_ldi interface, so that the delay control module changes the delay value of the delay unit and reception of the delay control amount is completed.
CN202311616859.0A 2023-11-29 2023-11-29 Self-adaptive high-speed parallel data sampling device Pending CN117833918A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311616859.0A CN117833918A (en) 2023-11-29 2023-11-29 Self-adaptive high-speed parallel data sampling device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311616859.0A CN117833918A (en) 2023-11-29 2023-11-29 Self-adaptive high-speed parallel data sampling device

Publications (1)

Publication Number Publication Date
CN117833918A (en) 2024-04-05

Family

ID=90508571

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311616859.0A Pending CN117833918A (en) 2023-11-29 2023-11-29 Self-adaptive high-speed parallel data sampling device

Country Status (1)

Country Link
CN (1) CN117833918A (en)

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination