Background Art
In current 3G mobile terminal chips, the baseband decoder in the physical layer is a core component. Given the constraints of a terminal, achieving low power consumption, high efficiency and low complexity for the baseband decoder in a handset chip has always been an important problem.
To address these problems, chip design must fully account for the interplay between performance, cost and clock rate. Provided the performance requirement is met, cost and clock rate (power consumption) are the key factors to trade off. 3G baseband chips are still a new field, but existing products have already shown the drawbacks of high power consumption and high cost.
Current implementations of baseband demodulation chips fall into the following two categories:
1. Full DSP implementation. A DSP architecture is used, and the signal processing flow defined in 3GPP TS 25.212 (WCDMA) or 3GPP TS 25.222 (TD-SCDMA) is implemented in software. In the DSP implementation the software is organized in modules, and an operating system is responsible for scheduling and for managing the memory data passed between modules. The drawback of this method is high power consumption because, compared with an ASIC, DSP processing requires the entire DSP architecture to be driven.
2. High-speed hardware logic implementation. To guarantee that the radio frame data are processed in real time within 10 ms, relatively large hardware units are needed. A first-level buffer is then used to sort the physical channel data into transport channel data, which are then decoded in software. This approach has two variants.
One is ASIC plus DSP. After the first deinterleaving, the radio frame data, whose timing period is 10 ms, have been converted into transport channel data whose timing period is the TTI (Transmission Time Interval) frame. Subsequent processing operates on the transport channel data. The ASIC handles the front-end symbol processing, and the DSP runs the software, the data decoding and the CRC check. This design still does not reduce power to a minimum, and the cost of the DSP must also be considered.
The other is a pure ASIC implementation, in which the real-time processing requirement must be guaranteed while the processing overhead is reduced.
The serial pipeline architecture proposed by the present invention also uses the ASIC-plus-DSP approach, but its serial pipeline structure reduces hardware overhead while still supporting the maximum rate, and it can be widely applied to the circuit design of 3G baseband decoding chips.
In CDMA-based 3G mobile communication systems, both WCDMA and TD-SCDMA define very complex baseband signal processing flows (3GPP TS 25.211 to 3GPP TS 25.215, and 3GPP TS 25.221 to 3GPP TS 25.225). A baseband designed entirely according to these signal processing flows has very high power requirements. In the first several years after 3G systems appeared, handsets consumed too much power and could not achieve long talk and standby times. Reducing the power consumption of 3G handsets has therefore always been a bottleneck for the commercial deployment of 3G systems.
Conventionally, the 3G handset baseband is divided into a chip-rate processing unit and a symbol-rate processing unit. The chip-rate processing unit comprises the radio frequency interface, the high-speed synchronization/path searcher module, channel estimation, the RAKE receiver (a coherent spread-spectrum receiver) and some auxiliary radio frequency circuits (AGC, AFC and so on). These modules must be driven at no less than twice the chip clock and are generally designed as hardware ASICs. The symbol-rate processing unit is strictly defined in 3GPP TS 25.212 and 3GPP TS 25.222. The present invention concerns a design that, under these constraints, guarantees the performance of the symbol-rate processing unit (the baseband decoder discussed here) while reducing its power consumption.
Typical handset requirements are a standby current below 10 mA, a current above 100 mA during a call, and above 300 mA for video services. In addition, a CDMA handset must perform a large number of network measurements. These measurement functions repeatedly switch hardware on, drive it, and then report the measurement results to the network, so power overhead must be reduced in every respect. Furthermore, the cost of a digital chip depends on the manufacturing process used and on the internal gate count, and a large gate count also increases power consumption. Simplifying the design is therefore a very important issue for 3G handsets.
In the baseband decoder, the main signal processing modules defined by the 3G standards (3GPP TS 25.212 and 3GPP TS 25.222) comprise:
- second interleaving
- first interleaving
- rate matching
- channel coding (convolutional coding or Turbo coding)
- CRC check
Compared with the baseband of GSM and cdmaOne, the data processing introduced here is very complex, so the hardware design must be optimized as far as possible.
Summary of the Invention
The object of the present invention is to solve the above problems of the prior art by reducing the memory overhead of downlink data reception and lowering the working clock while still meeting the processing delay requirement.
The invention provides a baseband decoder circuit in which a second deinterleaver, a first deinterleaver, a de-rate-matcher, a channel decoder and a cyclic redundancy check unit are connected in sequence, with four buffers inserted between these devices: in order, a first buffer, a second buffer, a third buffer and a fourth buffer. While a later device is processing the previous frame of data, the earlier devices can simultaneously accept and process the next frame of data.
Here the earlier devices are the deinterleavers and the de-rate-matcher, and the later device is the channel decoder. Each buffer between the serially connected devices is written by the preceding signal processor and read by the following device. The first buffer and the second buffer have a buffer read/write controller, so they can be read and written simultaneously. The first, second and third buffers are driven by 4 times the chip clock, that is, an operating frequency of 15.36 MHz. The fourth buffer is driven by 4 times the chip clock during write operations and by the digital signal processor clock during read operations, so its master clock must be switched before each read or write operation. The first deinterleaver and the channel decoder each have their own memory and can run simultaneously without causing any delay. The length of each buffer is determined by calculation at the maximum rate, so that the buffer length is minimized while the real-time requirement is still met. The Transmission Time Interval of the frame data is 10 ms.
The invention also provides a pipelined operating method for a baseband decoder circuit, in which buffers are connected in series between the serial front-end processing devices and the back-end processing device. While the back-end processing device processes one frame of data, the front-end processing devices read in and process the next frame of data.
Here the front-end devices are the deinterleavers and the de-rate-matcher, and the back-end device is the channel decoder.
The beneficial effects of the invention are as follows. The invention is a pipelined operating technique for the chip design of 3G mobile communication systems. It proposes the pipeline flow, derives by calculation the relationship between the operating frequency and the pipeline, and gives a maximum-data-rate example to show that such a pipelined design meets the real-time processing requirement with a minimal hardware configuration, lowers the working clock, and substantially improves the efficiency of downlink baseband decoding.
By calculating the time each circuit element needs to process 10 ms of data, the invention derives the relationship between the operating frequency and the pipelined operation, and meets the real-time processing requirement with minimal hardware.
In the invention, the serially arriving 10 ms data streams are processed in time. Once the first 10 ms of data have passed through deinterleaving and de-rate-matching and entered the channel decoder for decoding, the second 10 ms of data are allowed to enter the system for deinterleaving and de-rate-matching. The downlink decoding operation, which originally needed about 12 ms to complete, is pipelined into separate stages, so the continuous stream of subsequent 10 ms data can always be processed. The invention makes full use of system resources, effectively reduces the memory overhead of downlink data reception, lowers the working clock while meeting the processing delay requirement, and substantially improves the efficiency of downlink baseband decoding.
Detailed Description of the Embodiments
The invention is further described below with reference to the accompanying drawings and specific embodiments. At the maximum 3G rate, decoding one 10 ms radio frame takes approximately 12 ms. With the serial buffer structure proposed by the invention, once the previous 10 ms frame has passed through deinterleaving and de-rate-matching and entered the channel decoder for decoding, the next 10 ms frame is allowed to enter the decoding circuit for deinterleaving and de-rate-matching. With this pipelined mode of operation, the original processing time of about 12 ms is effectively reduced to 10 ms per frame without affecting the real-time behavior of the hardware, while power consumption and hardware design complexity are also reduced. Likewise, the invention can be applied to other baseband decoding systems similar to 3G communication systems in which, at a given rate, decoding the radio frame data of one Transmission Time Interval takes longer than that interval.
Fig. 1 is a block diagram of the serial hardware baseband decoder of the invention. In Fig. 1 the baseband decoder is connected as a linear chain in which the second deinterleaver, the second deinterleaver buffer, the first deinterleaver, the first deinterleaver buffer, the de-rate-matcher, the de-rate-matcher buffer, the channel decoder, the channel decoder buffer and the cyclic redundancy check unit are connected in sequence. In other words, four buffers are inserted into the conventional structure in which the second deinterleaver, first deinterleaver, de-rate-matcher, channel decoder and cyclic redundancy check unit are connected in sequence. The parameters of each hardware module are set by the DSP, and after finishing the operation on one segment of data each hardware module notifies the DSP that the corresponding event has ended by sending an interrupt signal.
Buffers are connected in series between the serial devices. While the channel decoder processes the previous data frame, the deinterleavers and the de-rate-matcher read in and process the next frame of data.
The technical solution of the invention is explained below using Wideband Code Division Multiple Access (WCDMA) as an example. The method of the invention can also be adapted to the design of a similar TD-SCDMA baseband decoder.
The serial hardware WCDMA baseband decoder consists of a second deinterleaver, a first deinterleaver, a de-rate-matching module, a channel decoder, 4 buffers and a CRC (cyclic redundancy check) unit, with TTI (Transmission Time Interval) = 10 ms. (TTI is a transport channel parameter that takes one of the values 10 ms, 20 ms, 40 ms or 80 ms. When the information rate is 384 kbps, TTI is taken as 10 ms.)
The 4 channel signal buffers on the downlink decoding path are:
- the buffer of the second deinterleaving circuit, referred to here as buffer 1;
- the buffer of the first deinterleaving circuit, referred to here as buffer 2;
- the buffer of the de-rate-matching circuit, referred to here as buffer 3;
- the buffer of the channel decoding circuit, referred to here as buffer 4.
The specific connections are shown in Fig. 1. The serial hardware WCDMA baseband decoder consists of the second deinterleaver and its buffer (buffer 1), the first deinterleaver and its buffer (buffer 2), the de-rate-matching module and its buffer (buffer 3), the channel decoder and its buffer (buffer 4), and the CRC (cyclic redundancy check) unit, with TTI (Transmission Time Interval) = 10 ms.
The decoder is implemented entirely according to the signal flow defined in 3GPP TS 25.212. The difference is that the 3GPP standard defines only the transmit processing of the channels and does not address how that processing is implemented. Receive processing can likewise be regarded as the inverse of transmit processing, so the standard does not specify a concrete receive implementation either. Compared with common signal processing approaches, this architecture has the following characteristics:
Four buffers connect the signal processing units, all of which are implemented in hardware, while timing and sequencing control is performed by software. "Performed by software" here means that the software does not take part in directly processing the data and only carries out the corresponding configuration. After finishing the operation on one segment of data, each hardware module notifies the software that the corresponding event has ended by means of a hardware interrupt. The software is responsible for configuring the hardware parameters for the next operation and for real-time scheduling. The invention does not cover the scheduling actions of the software. It only describes, from the hardware point of view, how the symbol decoding operation at the maximum rate can be supported with the minimum clock and resources.
Each buffer between the serially connected signal processing modules is written by the preceding signal processing module and read by the following module. Buffer 1 and buffer 2 are provided with buffer read/write controllers so that they can be read and written simultaneously. That is, the first deinterleaver and the second deinterleaver can work at the same time. Like buffer 1, buffer 2 can be read and written simultaneously, in this case by the first deinterleaver and the de-rate-matching circuit. Buffer 3 can be written by the de-rate-matcher and read by the decoder at the same time. In the invention, three buffers (buffer 1, buffer 2 and buffer 3) are driven by 4 times the chip clock (CHIP × 4), that is, an operating frequency of 15.36 MHz. The CRC-checked data are passed to the upper-layer protocol by the DSP, so access to buffer 4 cannot be time-shared. Buffer 4 is driven by the hardware CHIP × 4 clock during write operations and by the DSP (digital signal processor) clock during read operations, which means its master clock must be switched before each read or write operation. The decoder and the CRC check must therefore work serially.
Because of the staged buffers, the first deinterleaver and the channel decoder can run at the same time without causing any delay, since each uses its own memory.
The length of each staged buffer is determined by calculation at the maximum rate, so that the buffer length is minimized while the real-time requirement is still met.
First, the processing time of the downlink channel decoder must be estimated. The three buffers (buffer 1, buffer 2 and buffer 3) are driven by 4 times the chip clock (CHIP × 4 = 15.36 MHz), and the processing speed of the hardware depends mainly on the read/write speed of the memory.
Length design of each buffer:
Buffer 4: according to the 3GPP specification, for a 384 kbps class receiver the maximum number of transport block bits received in any 10 ms interval is 6400, so the minimum length of buffer 4 is 6400. Allowing for the overhead of the CRC bits, this is written here as 6400 + α.
Buffer 3: the coding rate is 1/3, so the minimum length of this buffer is 3 × (6400 + α).
Buffer 2: the maximum bit repetition is taken to be a factor of 2, so the minimum length of this buffer is 2 × 3 × (6400 + α).
Buffer 1: according to the 3GPP specification, for a 384 kbps class receiver the maximum number of physical channel bits received in any 10 ms interval is 19200, so the minimum length of buffer 1 is 19200 × 2 = 38400.
Because the downlink can be regarded as the inverse of the uplink, the numbers of symbols are calculated backwards, from buffer 4 to buffer 1.
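The buffer lengths above can be tabulated with a short calculation. The following is only an illustrative sketch: the function name `buffer_lengths` is hypothetical, and the value passed for the CRC overhead α is an assumption, since the text leaves α unspecified.

```python
def buffer_lengths(alpha: int) -> dict:
    """Minimum buffer lengths in bits, worked backwards from buffer 4.

    alpha is the CRC overhead in bits; the text does not give its value,
    so the caller must supply an assumed figure.
    """
    buf4 = 6400 + alpha   # transport block bits per 10 ms plus CRC overhead
    buf3 = 3 * buf4       # rate-1/3 channel coding triples the bit count
    buf2 = 2 * buf3       # worst-case bit repetition doubles it again
    buf1 = 19200 * 2      # physical channel bits per 10 ms, times 2 as in the text
    return {"buffer 1": buf1, "buffer 2": buf2, "buffer 3": buf3, "buffer 4": buf4}


if __name__ == "__main__":
    # alpha = 0 is a placeholder, not a value taken from the specification.
    for name, bits in buffer_lengths(alpha=0).items():
        print(f"{name}: {bits} bits")
```

Only buffer 1 is independent of α; the other three scale linearly with it.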
Time from buffer 1 to buffer 2
According to the 3GPP specification, at the maximum rate 19200 bits must be stored in every 10 ms interval. (From TS 25.211, Table 11: with slot format 16 and SF (spreading factor) = 4, each 10 ms frame of the downlink DPCH (dedicated physical channel) is divided into 15 slots of 1280 bits each, giving a frame length of 1280 × 15 = 19200 bits.) Within a TTI period of 10 × F_i ms, the first deinterleaver must move 19200 × F_i bits from buffer 1 to buffer 2 (i is the transport channel index and F_i is the number of 10 ms radio frames in its TTI). This operation takes at least 1.25 × F_i ms, where 1.25 ms comes from 19200 / 15.36 MHz.
The processing time of the first deinterleaver is therefore P × U / 15.36 MHz, where U is the number of symbols in one physical channel frame and P is the number of PhyCHs (physical channels) in the CCTrCH.
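The formula above can be checked numerically. This is a minimal sketch in which the function name is hypothetical; it assumes the standard WCDMA chip rate of 3.84 Mcps, which gives the 15.36 MHz CHIP × 4 clock used in the text.

```python
CHIP_RATE_HZ = 3.84e6        # WCDMA chip rate
CLK_HZ = 4 * CHIP_RATE_HZ    # CHIP x 4 = 15.36 MHz buffer clock


def first_deinterleaver_time_s(p: int, u: int, clk_hz: float = CLK_HZ) -> float:
    """Time to move P physical channels of U bits each, one bit per clock cycle."""
    return p * u / clk_hz


if __name__ == "__main__":
    t = first_deinterleaver_time_s(p=1, u=19200)
    print(f"first deinterleaver: {t * 1e3:.2f} ms")
```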
In the most complex case (384 kbps), the first deinterleaver must move 19200 bits from buffer 1 into buffer 2 in every 10 ms period, which takes 1.25 ms.
Time from buffer 2 to buffer 3
For the de-rate-matching module, the longest processing time occurs with bit repetition, because the number of read operations can be twice the number of write operations.
The processing time of the de-rate-matcher is G / 15.36 MHz, where G is the number of bits to be de-rate-matched. At the maximum rate, 48 Kbit must be moved from buffer 2 to buffer 3, which takes 3.2 ms.
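As a quick check of the 3.2 ms figure, the sketch below evaluates G / clk. It assumes that "48 Kbit" means 48 × 1024 bits, an interpretation not stated in the text but one that reproduces the quoted value exactly; the function name is hypothetical.

```python
CLK_HZ = 15.36e6          # CHIP x 4 buffer clock
G_MAX_BITS = 48 * 1024    # assumption: "48 Kbit" read as 48 x 1024 bits


def derate_matching_time_s(g_bits: int, clk_hz: float = CLK_HZ) -> float:
    """Time to read G bits from buffer 2 at one bit per clock cycle."""
    return g_bits / clk_hz


if __name__ == "__main__":
    print(f"de-rate-matching: {derate_matching_time_s(G_MAX_BITS) * 1e3:.2f} ms")
```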
Time from buffer 3 to buffer 4
When the channel rate is 384 kbps and a high-speed convolutional code decoding circuit is used, the processing time of the channel decoder can be roughly estimated as follows, where B_i is the number of transport block bits (including CRC bits) in one TTI frame of the TrCH:
Processing time of the channel decoder = B_i × {2 × (1 + L1/L) × ITER + 1} × 1.1 / CLK
Here L, L1 and ITER are parameters with default values {256, 16, 11}. L is the number of output bits per decoding pass of the channel decoder, L1 + L is the number of input bits per decoding pass, and ITER is the number of decoding iterations of the channel decoder.
Note that CHIP × 8 is used as the working clock of the channel decoder in an ASIC (application-specific integrated circuit), and CHIP × 4 is used as the working clock of the channel decoder in an FPGA (field-programmable gate array). Therefore, when the 384 kbps channel data are processed with an FPGA, 10 ms of data take 6.7 ms to process.
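The estimate above can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical, and B_i = 3840 bits (384 kbps × 10 ms) is an assumption that the text does not state explicitly, chosen because it reproduces the quoted 6.7 ms at the CHIP × 4 clock.

```python
CHIP_X4_HZ = 15.36e6   # FPGA working clock (CHIP x 4)


def channel_decoder_time_s(b_bits: int, clk_hz: float,
                           l_out: int = 256, l1: int = 16, iters: int = 11) -> float:
    """B x {2 x (1 + L1/L) x ITER + 1} x 1.1 / CLK, with the default parameters."""
    cycles_per_bit = (2 * (1 + l1 / l_out) * iters + 1) * 1.1
    return b_bits * cycles_per_bit / clk_hz


if __name__ == "__main__":
    # Assumed payload: 384 kbps over one 10 ms frame = 3840 bits.
    t = channel_decoder_time_s(3840, CHIP_X4_HZ)
    print(f"channel decoder on FPGA: {t * 1e3:.2f} ms")
```

Because the clock appears only in the denominator, doubling it to CHIP × 8 halves the estimate.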
Read-out time from buffer 4
The CRC check is driven by the DSP module. The DSP clock rate is set to 20 MHz, and the DSP program can access buffer 4 within 1 DSP cycle. The CRC check reads the contents of buffer 4 sequentially, 16 bits at a time. The processing time of the CRC check is therefore B_i / 16 / 20 MHz.
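The read-out time can be checked in the same way. In this sketch the function name is hypothetical, and the block size of 8 × 1024 bits is an assumption based on the "8K bits" entry for this step in Table 1.

```python
DSP_CLK_HZ = 20e6      # DSP clock rate given in the text
WORD_BITS = 16         # bits read from buffer 4 per DSP cycle


def crc_readout_time_s(b_bits: int, clk_hz: float = DSP_CLK_HZ) -> float:
    """B / 16 / clk: one 16-bit word is read per DSP cycle."""
    return b_bits / WORD_BITS / clk_hz


if __name__ == "__main__":
    # Assumption: "8K bits" is read as 8 x 1024 bits.
    t = crc_readout_time_s(8 * 1024)
    print(f"CRC read-out: {t * 1e3:.3f} ms")
```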
Based on the discussion above, the processing times at the maximum rate are listed in Table 1 below (the processing time of the buffers themselves can be ignored):
Processing unit | Maximum number of symbols | Time overhead estimation formula | Estimated time overhead t
Buffer 1 to buffer 2 | 19200 bits (TS 25.211) | t = P × U / clk; clk = CHIP × 4 = 15.36 MHz; P = 1, U = 19200 | 1.25 ms
Buffer 2 to buffer 3 | 8K × 3 × 2 bits (coding rate 1/3, de-rate-matching factor 2) | t = G / clk; clk = CHIP × 4 = 15.36 MHz | 3.2 ms
Transfer to channel decoder | 8K × 3 bits (coding rate 1/3) | t = 8K × 3 × 1/3 / clk; clk = CHIP × 4 = 15.36 MHz | 0.52 ms
Channel decoder circuit | 8K bits | t = B_i × read time / clk; read time = {2 × (1 + L1/L) × ITER + 1} × 1.1; default {L, L1, ITER} = {256, 16, 11} | 6.7 ms
Transfer to DSP | 8K bits | t = B_i / 16 / clk; read time = 1/16; clk = 20 MHz | 0.03 ms
Table 1: Estimated processing time in the most complex case
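Summing the estimated time overheads of Table 1 gives the total time to decode one 10 ms frame when the stages run strictly one after another. The sketch below simply adds the five table entries; the dictionary keys are shorthand labels, not names used in the text.

```python
FRAME_PERIOD_MS = 10.0

# Estimated time overhead per stage, in ms, copied from Table 1.
STAGE_TIME_MS = {
    "buffer 1 to buffer 2": 1.25,
    "buffer 2 to buffer 3": 3.2,
    "transfer to channel decoder": 0.52,
    "channel decoder circuit": 6.7,
    "transfer to DSP": 0.03,
}

total_ms = sum(STAGE_TIME_MS.values())

if __name__ == "__main__":
    print(f"total: {total_ms:.2f} ms per frame")
    print("exceeds frame period:", total_ms > FRAME_PERIOD_MS)
```

The total is roughly 12 ms, which is longer than the 10 ms frame period even though no single stage exceeds it.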
From Table 1, the timing diagram of Fig. 2 can be drawn (assuming there is only 1 TrCH and, for simplicity, taking TTI = 10 ms; for actual cases see section A.3.4 of 3GPP TS 25.101).
Fig. 2 shows the running time of each module of the baseband decoder against the data arriving over the air interface. The horizontal axis represents the arrival timing of the 10 ms radio frames, the vertical axis lists the modules, and the shaded areas represent the processing time of one specific segment of data. The figure shows clearly that the pipelined operation described in this patent guarantees real-time processing of the data.
Processing 10 ms of data takes about 12 ms in total. Because the multi-buffer structure implements pipelined operation, the first deinterleaving of the next 10 ms of data can begin before the channel decoder has finished its work. Newly arrived data can therefore be processed in real time, without waiting until all the previous data have been fully processed before accepting the next 10 ms of data.
Once the first 10 ms of data have passed through deinterleaving and de-rate-matching and entered the channel decoder, and before the decoder has finished decoding them, the second 10 ms of data are allowed to enter the system for deinterleaving and de-rate-matching. The downlink data decoding that originally needed about 12 ms to complete is thus carried out in a pipelined manner, which guarantees real-time processing of the 10 ms data.
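The effect of the pipeline can be illustrated with a small timing model. This is a simplified sketch under stated assumptions, not the circuit itself: the stage times are taken from Table 1, the transfer to the decoder and the CRC read-out are grouped with the channel decoder as the back end (an assumed grouping), buffer access conflicts are ignored, and the function name `latencies_ms` is hypothetical.

```python
FRAME_MS = 10.0
FRONT_MS = 1.25 + 3.2          # first deinterleaving + de-rate-matching
BACK_MS = 0.52 + 6.7 + 0.03    # transfer to decoder + decoding + CRC read-out


def latencies_ms(n_frames: int, pipelined: bool) -> list:
    """Per-frame latency from frame arrival to the end of the CRC read-out."""
    front_free = 0.0   # time at which the front end becomes idle
    back_free = 0.0    # time at which the back end becomes idle
    result = []
    for k in range(n_frames):
        arrival = k * FRAME_MS
        if pipelined:
            # The front end may start frame k while the back end still decodes frame k-1.
            front_done = max(arrival, front_free) + FRONT_MS
            front_free = front_done
            back_done = max(front_done, back_free) + BACK_MS
        else:
            # A frame must be completely finished before the next one is accepted.
            back_done = max(arrival, back_free) + FRONT_MS + BACK_MS
        back_free = back_done
        result.append(back_done - arrival)
    return result


if __name__ == "__main__":
    print("pipelined:", [round(x, 2) for x in latencies_ms(5, True)])
    print("serial:   ", [round(x, 2) for x in latencies_ms(5, False)])
```

In this model the pipelined latency stays constant from frame to frame because each stage finishes in less than 10 ms, whereas the strictly serial variant falls further behind with every frame.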
The above describes only several preferred embodiments of the invention and does not limit its scope. Any replacement, combination or separation of components of the device of the invention that is well known in the art, and any equivalent change or substitution of the implementation steps of the invention that is well known in the art, falls within the disclosure and scope of protection of the invention.