
CN101536333A - Pre-scaling of initial channel estimates in joint detection

Info

Publication number: CN101536333A
Application number: CNA2007800365615A
Authority: CN
Inventors: 马可·卡西可; 严爱国; 利德温·马蒂诺; 托马斯·J·小巴伯; 约翰·子军·申
Original assignee: MediaTek Inc
Current assignee: MediaTek Inc; Analog Devices Inc
Priority date: 2006-09-29
Filing date: 2007-09-27
Publication date: 2009-09-16
Anticipated expiration: 2027-09-27
Also published as: CN101663829A; CN101553995A; CN101663829B; CN101536333B; CN101553995B; CN101553994A

Abstract

A joint detection system and associated methods are provided. The joint detection system is configured to perform joint detection of received signals. The joint detection system includes a programmable digital signal processor (DSP) configured to generate initial channel estimates corresponding to propagation channels, wherein each of the initial channel estimates includes a plurality of values. The programmable DSP is further configured to determine one or more pre-scaling factors for one or more of the initial channel estimates. The pre-scaling factors are at least partially based on at least one of the plurality of values of one or more of the initial channel estimates. The programmable DSP is further configured to pre-scale the initial channel estimates by the pre-scaling factors.

Description

Pre-scaling of initial channel estimates in joint detection

Technical field

The present invention relates to joint detection methods and circuits for wireless communication.

Background art

TD-SCDMA (Time Division Synchronous Code Division Multiple Access) is a wireless communication standard that combines TDD/TDMA (time division duplexing/time division multiple access) operation with synchronous CDMA (code division multiple access). TD-SCDMA can assign different time slots and spreading codes to users, so that each time slot can carry data associated with different users distinguished by different spreading codes. Fig. 1 shows a TD-SCDMA scheme 100, in which a frequency band 110 can be used for communication associated with multiple users by assigning different time slots 121, 122, 123, 124, etc. and different spreading codes 1, 2, 3, etc. to those users. TD-SCDMA currently uses at most 16 spreading codes per time slot, which allows different spreading codes to be assigned to up to 16 users simultaneously in a given time slot. In some cases, multiple spreading codes can be assigned to one user.

TD-SCDMA supports asymmetric traffic and services, so that uplink and downlink traffic allocation can be modified using a flexible frame structure that allows the uplink and downlink allocation to be changed dynamically during a call. TD-SCDMA also reduces multiple access interference (MAI) by using joint detection and smart antenna systems. In a joint detection scheme, the data of the multiple users that may be associated with a time slot is estimated in parallel, and the data of a specific user is extracted from the received signal. In this way, the interference caused by signals associated with other users is accounted for, so that the user's data is provided with reduced interference.

Summary of the invention

In one aspect, a method is provided for use in a joint detection system configured to perform joint detection of received signals. The method comprises generating a plurality of initial channel estimates corresponding to a plurality of propagation channels, wherein each of the initial channel estimates comprises a plurality of values. The method further comprises determining at least one pre-scaling factor for at least one of the initial channel estimates, wherein the at least one pre-scaling factor is based at least in part on at least one of the plurality of values of at least one of the initial channel estimates. The method further comprises pre-scaling at least one of the initial channel estimates by the at least one pre-scaling factor.

In another aspect, a joint detection system is configured to perform joint detection of received signals. The joint detection system comprises a programmable digital signal processor (DSP) configured to generate a plurality of initial channel estimates corresponding to a plurality of propagation channels, wherein each of the initial channel estimates comprises a plurality of values. The programmable DSP is further configured to determine at least one pre-scaling factor for at least one of the initial channel estimates, wherein the at least one pre-scaling factor is based at least in part on at least one of the plurality of values of at least one of the initial channel estimates. The programmable DSP is further configured to pre-scale at least one of the initial channel estimates by the at least one pre-scaling factor.

Other aspects, embodiments, and features of the invention will become apparent from the following detailed description when read in conjunction with the accompanying drawings. The drawings are schematic and are not drawn to scale. In the drawings, each identical or substantially similar component illustrated in the various figures is represented by a single numeral or symbol.

For clarity, not every component is labeled in every drawing. Nor is every component of each embodiment of the invention shown where illustration is not necessary for those skilled in the art to understand the invention. All patent applications and patents incorporated herein by reference are incorporated in their entirety. In case of conflict, the present specification, including definitions, controls.

Description of drawings

In the drawings, like reference numerals denote like elements:

Fig. 1 is a schematic diagram of the time slots and spreading codes of TD-SCDMA;

Fig. 2 is a schematic diagram of a TD-SCDMA downlink channel model;

Fig. 3 is a schematic diagram of various matrices involved in joint detection processing;

Fig. 4 is a block diagram of a receiver system that implements a joint detector system according to one embodiment;

Fig. 5 is a block diagram of an inner receiver chain according to one embodiment;

Fig. 6 is a block diagram of joint detection processing according to one embodiment;

Fig. 7 is a schematic diagram of the time slot format of a received signal according to one embodiment;

Fig. 8 is a schematic diagram of the contents of an accumulator before and after a shift operation according to one embodiment;

Fig. 9 is a flow chart of a process for shifting the contents of an accumulator and storing a subset of the accumulator bits in memory according to one embodiment;

Fig. 10 is a block diagram of a hardware architecture that can implement the shift operation according to one embodiment;

Fig. 11 is a flow chart of a pre-scaling process for propagation channel estimates according to one embodiment;

Fig. 12 is a schematic diagram of the pre-scaling of channel estimates according to one embodiment;

Fig. 13 is a block diagram of a joint detector accelerator architecture according to one embodiment; and

Fig. 14 is a flow chart of a process that can be performed by a finite state machine of a joint detector according to one embodiment.

Detailed description

A joint detection system can be implemented, for example, as a software solution using a digital signal processor (DSP), or in hardware in the form of a circuit referred to as a joint detection accelerator (JDA). Compared with a joint detection system implemented in software, a JDA can reduce power consumption and increase operating speed.

The applicants have recognized that a joint detection system can include some processing operations that benefit from the flexibility of a programmable software implementation, and other processing operations that benefit from the speed and reduced power consumption of a JDA. The operations implemented in the JDA can include mature algorithms that are unlikely to change and unlikely to be customized by different handset manufacturers. Conversely, the operations implemented in the programmable DSP can include algorithms that may change and are likely to be customized by different handset manufacturers.

The applicants have also recognized that a JDA can benefit from a fixed-point implementation, which can reduce chip area and power consumption while still meeting performance specifications (such as block error rate performance). Reducing the data bit width of the fixed-point implementation can in turn reduce chip area and power consumption. In addition, a smaller bit width means that processing can be completed in less time, which makes longer chip sleep or idle mode periods possible. The applicants have recognized that a JDA with a smaller data bit width can still maintain high accuracy during multiply-accumulate operations by performing the operations in an accumulator having a large number of bits and saving only a smaller number of bits to memory. The applicants have also recognized that one or more shift values can be determined internally by the JDA and/or set by an external source (such as the programmable DSP).
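The wide-accumulator idea described above can be illustrated with a minimal Python sketch. The function names, the 16-bit storage width, and the saturation step are assumptions made for illustration only; the text does not specify accumulator or memory widths.

```python
def saturate(value, bits):
    """Clamp a signed integer to the range of a two's-complement word of `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def mac_shift_store(a, b, shift, out_bits=16):
    """Multiply-accumulate at full precision, then shift and keep only `out_bits` bits.

    `shift` and `out_bits` are hypothetical parameters; in the text the shift value
    may be determined by the JDA or set by an external source such as the DSP.
    """
    acc = 0  # Python integers are unbounded, standing in for a wide accumulator
    for x, y in zip(a, b):
        acc += x * y
    # Arithmetic right shift drops the least significant bits of the accumulator.
    return saturate(acc >> shift, out_bits)


stored = mac_shift_store([1200, -800, 1500], [900, 700, -400], shift=4)
```

Only the shifted, narrower result would be written to memory; the intermediate sum keeps its full precision until the final shift.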

The applicants have also recognized that the initial channel estimates in current joint detection systems may require a larger bit width to accommodate differences in amplitude between channels. This situation can arise from the way channel estimation is performed. For example, in a TD-SCDMA system, one or more training sequences (midambles) are included in each burst, and the receiver uses the training sequence to estimate the propagation channel between the transmitter and the receiver. However, when the receiver performs initial channel estimation, the initial channel estimates may not account for differences in the power levels of the training sequences or in the number of training sequences. Although these differences are eventually accounted for by the scaling factors generated by the active code detection algorithm, in a fixed-point implementation of the JDA the channel estimates produced by initial channel estimation can require a larger bit width than would be needed if these effects had been accounted for from the start. The applicants have recognized that a joint detection system can benefit from pre-scaling one or more propagation channel estimates before they are sent to the JDA, thereby reducing the bit width in a fixed-point implementation of the JDA.
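As a rough illustration of pre-scaling, the sketch below gives each channel estimate its own power-of-two factor chosen from its largest component, so that all estimates fit the same narrow word. The peak-based rule, the 8-bit target width, and all names are assumptions; the text only states that a pre-scaling factor is based at least in part on at least one value of an initial channel estimate.

```python
def prescale_shift(estimate, target_bits=8):
    """Return the right-shift count that makes the largest component fit in `target_bits` signed bits."""
    peak = max(max(abs(re), abs(im)) for re, im in estimate)
    limit = (1 << (target_bits - 1)) - 1
    shift = 0
    while (peak >> shift) > limit:
        shift += 1
    return shift


def prescale(estimates, target_bits=8):
    """Pre-scale each channel estimate by its own power-of-two factor (hypothetical rule)."""
    scaled, shifts = [], []
    for est in estimates:
        s = prescale_shift(est, target_bits)
        # Right shift of a negative integer rounds toward negative infinity in Python.
        scaled.append([(re >> s, im >> s) for re, im in est])
        shifts.append(s)
    return scaled, shifts


# Two made-up channel estimates with very different amplitudes, as (real, imag) pairs.
h_strong = [(4000, -1200), (900, 300)]
h_weak = [(60, -20), (10, 5)]
scaled, shifts = prescale([h_strong, h_weak])
```

The shift applied to each channel would have to be tracked so that later stages can account for it, for example when scaling factors are applied after active code detection.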

It should be understood that the techniques described herein are not limited to any particular implementation and can be implemented in any manner. Examples of such implementations are discussed below, but these implementations are presented only as illustrative examples, and the embodiments can be implemented in other ways. The examples given below describe a joint detection system for use with a TD-SCDMA scheme. However, the techniques described herein can be used with other suitable communication schemes, and/or with other joint detection systems implemented in various ways, and their use is not limited to any particular type of joint detection system.

As described below, one application of the techniques described herein is a joint detection system for a TD-SCDMA receiver. However, this example is merely illustrative, because the techniques described herein can be used in any type of system that performs joint detection on received signals.

As shown in Fig. 2, a TD-SCDMA downlink channel model 200 can include channelization and scrambling codes c_1, c_2, ..., c_{K_a}, channel impulse responses h_1, h_2, ..., h_{K_a}, random noise z added to each channel, and a joint detection data receiver 210. Data d_1, d_2, ..., d_{K_a} are multiplied at the base station by the channelization and scrambling codes c_1, c_2, ..., c_{K_a}, respectively, and transmitted into the channel. Each code channel can be modeled as a channel impulse response h_1, h_2, ..., h_{K_a} accompanied by noise z. Because smart antennas are used, the channel impulse responses of the code channels can be independent. The received data r is sampled by the analog portion of receiver 210 and input to the joint detection system of receiver 210. The output x of the joint detection system contains user data, which can be further decoded by a downlink bit processor.

The combined effect of the channelization/scrambling code and the channel impulse response is the convolution of the channelization/scrambling code c_x with the channel impulse response h_x. The combined effect of all channels on a single data symbol can be represented by a matrix V, in which each column is the convolution of the channelization/scrambling code and the channel impulse response of the corresponding code channel. The number of columns of matrix V is the number of active code channels K_a. A combined response matrix T for the entire data field can be constructed by arranging copies of matrix V along the diagonal of matrix T.
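The construction of matrix V can be sketched in a few lines of Python. Matrices are represented here as lists of columns, and the toy values for Q, W, and K_a as well as the function names are assumptions chosen for illustration.

```python
def convolve(c, h):
    """Full linear convolution of two sequences; output length is len(c) + len(h) - 1."""
    out = [0j] * (len(c) + len(h) - 1)
    for i, ci in enumerate(c):
        for j, hj in enumerate(h):
            out[i + j] += ci * hj
    return out


def build_v(C, H):
    """Return V as a list of columns, one column per active code channel."""
    return [convolve(c_k, h_k) for c_k, h_k in zip(C, H)]


# Toy sizes: Q = 4 chips per symbol, W = 2 channel taps, K_a = 2 active codes.
C = [[1, 1, 1, 1], [1, -1, 1, -1]]  # columns of C (channelization/scrambling codes)
H = [[1, 0.5], [1j, 0]]             # columns of H (channel impulse responses)
V = build_v(C, H)                   # each column has W + Q - 1 = 5 entries
```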

Fig. 3 is a schematic diagram of the channel impulse response matrix H, the channelization/scrambling code matrix C, matrix V, and matrix T. As shown, matrix H has K_a columns and W rows, and matrix C has K_a columns and Q rows, where W is the length of the channel impulse response, Q is the spreading factor, and K_a is the number of active channels. Matrix V has K_a columns and W+Q-1 rows, and matrix T has N*K_a columns and N*Q+W-1 rows, where N is the number of data symbols in a data block. The number of rows of matrix T equals the length of the data field (N*Q chips for TD-SCDMA) plus the length W of the channel impulse response (between 1 and 17 chip periods) minus one, that is, N*Q+W-1.
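The dimensions above follow from how V is placed inside T: the copy of V for data symbol n starts Q rows below the copy for symbol n-1 and K_a columns to its right. A minimal sketch, assuming V is given as a list of columns and T is returned as a list of rows (the function name and toy sizes are illustrative):

```python
def build_t(V, Q, N):
    """Place N copies of V along the diagonal of T, each shifted down by Q rows."""
    K = len(V)           # number of active codes (columns of V)
    rows_v = len(V[0])   # W + Q - 1
    W = rows_v - Q + 1
    T = [[0j] * (N * K) for _ in range(N * Q + W - 1)]
    for n in range(N):
        for k in range(K):
            for i in range(rows_v):
                T[n * Q + i][n * K + k] = V[k][i]
    return T


# Toy values: Q = 2, W = 2, K_a = 1, N = 3, so T has N*Q + W - 1 = 7 rows and 3 columns.
V = [[1, 2, 3]]
T = build_t(V, Q=2, N=3)
```

Because consecutive copies of V overlap by W-1 rows, adjacent symbols interfere with each other, which is what the joint detection algorithm has to resolve.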

Using the matrices defined above, the received data r can be expressed as the transmitted data d acted on, via matrix T, by the combined channelization/scrambling codes and channel impulse responses, plus noise z:

r = Td + z.

A joint detection algorithm can be used to recover the transmitted data d from the received data r. A first algorithm that can be used to solve for the transmitted data d uses the least squares (LS) criterion:

d = (T^H T)^{-1} T^H r,

where T^H is the conjugate transpose of matrix T. The least squares algorithm may not perform well at low received signal-to-noise ratio (SNR), so another algorithm, based on the minimum mean squared error (MMSE) criterion, can be used:

d = (T^H T + σ² I)^{-1} T^H r,

where σ² is the variance of noise z. The LS algorithm and the MMSE algorithm can both be reduced to the same equation:

Ad = y,

where y = T^H r is called the matched filter output, and A = T^H T for the LS algorithm or A = T^H T + σ² I for the MMSE algorithm.
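The reduction to Ad = y can be checked numerically. The sketch below forms A and y explicitly from a small, made-up matrix T (stored as a list of rows); the helper names are assumptions, and a practical implementation would not build T in full.

```python
def conj_transpose(M):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]


def matmul(X, Y):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def normal_equations(T, r, sigma2=0.0):
    """Return (A, y) with A = T^H T + sigma2 * I and y = T^H r; sigma2 = 0 gives the LS case."""
    TH = conj_transpose(T)
    A = matmul(TH, T)
    for i in range(len(A)):
        A[i][i] += sigma2
    y = [sum(TH[i][k] * r[k] for k in range(len(r))) for i in range(len(TH))]
    return A, y


# Noise-free toy example: r = T d with d = [1, -1].
T = [[1, 0], [1j, 1], [0, 2]]
r = [1, -1 + 1j, -2]
A, y = normal_equations(T, r)
```

In this noise-free LS case, multiplying A by d = [1, -1] reproduces y exactly, which is the property the solver relies on.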

In general, solving this equation would involve inverting matrix A. Because of the properties of matrix A, a Cholesky decomposition can be used to express matrix A in terms of an upper triangular matrix L and a diagonal matrix D as A = L^H D L, where L^H is the conjugate transpose of matrix L; this factorization can in turn be used to solve the equation Ad = y iteratively. For illustration, Fig. 3 also includes schematic diagrams of matrix A and matrix L. Matrix L is composed of N blocks arranged along its diagonal and, as discussed further below, matrix L can be approximated by computing only a limited number of blocks (for example, blocks B_1 and B_2) and setting the remaining blocks equal to the last computed block (for example, setting blocks B_3, B_4, ..., B_N to the value of B_2).

Solving the equation Ad = y involves forward substitution, dot division, and backward substitution to obtain the values of the transmitted data:

(1) Forward substitution: L^H f = y

(2) Dot division: g = f ./ D

(3) Backward substitution: L d = g

where f is the intermediate vector obtained from the forward substitution equation, and g is the intermediate vector obtained from the dot division equation. The output of the joint detection algorithm can include the data for a given user equipment (UE). Data for UEs other than the given UE can be removed, so that the final output contains only the data for the given UE.
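The three solution steps can be sketched in floating-point Python as follows. This is an illustrative, unoptimized version that computes every element of the factorization; it does not model the block approximation of matrix L or any fixed-point behavior, and the function names are assumptions. The unit lower triangular factor M computed here corresponds to L^H in the text, so that L is the conjugate transpose of M.

```python
def ldl_decompose(A):
    """Factor a Hermitian positive definite A as M D M^H (M unit lower triangular, D real)."""
    n = len(A)
    M = [[0j] * n for _ in range(n)]
    D = [0.0] * n
    for j in range(n):
        D[j] = (A[j][j] - sum(abs(M[j][k]) ** 2 * D[k] for k in range(j))).real
        M[j][j] = 1.0
        for i in range(j + 1, n):
            s = sum(M[i][k] * M[j][k].conjugate() * D[k] for k in range(j))
            M[i][j] = (A[i][j] - s) / D[j]
    return M, D


def solve_ldl(A, y):
    """Solve A d = y by forward substitution, dot division, and backward substitution."""
    M, D = ldl_decompose(A)
    n = len(y)
    f = [0j] * n
    for i in range(n):                       # (1) forward substitution: L^H f = y
        f[i] = y[i] - sum(M[i][k] * f[k] for k in range(i))
    g = [f[i] / D[i] for i in range(n)]      # (2) dot division: g = f ./ D
    d = [0j] * n
    for i in reversed(range(n)):             # (3) backward substitution: L d = g
        d[i] = g[i] - sum(M[k][i].conjugate() * d[k] for k in range(i + 1, n))
    return d


d = solve_ldl([[2, -1j], [1j, 5]], [2 + 1j, -5 + 1j])
```

For this small Hermitian system the result is d = [1, -1] up to rounding, and the only divisions are those by the entries of D.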

In one embodiment, a joint detector system implementing the joint detection algorithm can include a JDA and a programmable DSP, where the programmable DSP performs one or more of the processing operations of the joint detection algorithm. The programmable DSP allows one or more of the joint detection processing operations performed by the DSP to be customized in software. The programmable DSP can perform processing operations before the JDA receives data, can perform intermediate processing operations after the JDA has performed some processing, and/or can perform post-processing after the JDA has finished processing the data. In embodiments where the DSP performs intermediate processing operations, the JDA can include JDA front-end processing performed before the DSP intermediate processing and JDA back-end processing performed after the DSP intermediate processing. As discussed below, in one embodiment the intermediate processing operation performed by the DSP is active code detection. In embodiments where the DSP processes data before sending it to the JDA, the DSP can perform channel estimation, which generates matrix H and matrix C. The JDA can be used to solve the linear equation Ad = y, and the DSP can provide the received data r, matrix H, matrix C, and noise power σ² to the JDA.

Fig. 4 is a block diagram of an exemplary receiver system 400 that implements a joint detector system including a JDA 415 and a programmable DSP 425, according to one embodiment. Programmable DSP 425 can perform one or more of the processing operations of the joint detection algorithm. System 400 can include radio and analog baseband components (combined unit 450), where the radio component can receive signals transmitted by a base station, and the analog baseband component can process the received signals provided by the radio component. Digital domain components 440 can then process the signals provided by the analog baseband component.

Digital domain components 440 can include a digital baseband component and a co-processor that facilitates digital domain processing. The digital baseband component can include programmable DSP 425, which can perform digital processing of the received signals. The digital baseband component can communicate with the co-processor, which can facilitate the processing of the received signals in the digital domain.

The co-processor includes JDA 415 and a bit rate processor (BRP) 416. In one embodiment, JDA 415 can perform one or more processing operations of the joint detection algorithm, and DSP 425 can also perform one or more processing operations of the joint detection algorithm. JDA 415 can communicate with DSP 425, so that DSP 425 can perform one or more joint detection processing operations and send the results of those operations to JDA 415 for further processing. Additionally or alternatively, JDA 415 can perform one or more processing operations of the joint detection algorithm and send the results of those operations to DSP 425. In this way, DSP 425 can perform any number of the operations of the joint detection algorithm, and JDA 415 can perform any number of the operations of the joint detection algorithm. JDA 415 can generate soft decision outputs, which can then be converted to the most likely hard decisions by bit rate processor 416. Bit rate processor 416 can perform channel decoding and error detection on transport channels, de-interleaving to improve channel coding performance, de-rate-matching to adjust data rates, de-multiplexing of transport channels, and de-mapping of coded composite transport channels from physical channels.

Fig. 5 is a block diagram of a TD-SCDMA inner receiver chain 500 according to one embodiment. Inner receiver chain 500 can include a receive root raised cosine filter 520, which can be implemented in the analog baseband (for example, component 470 of system 400). Root raised cosine filter 520 can provide the received signal to one or more pre-processing components, such as DC removal component 530 and I/Q compensation component 540. In one embodiment, DC removal component 530 and I/Q compensation component 540 are implemented by a programmable DSP (such as DSP 425 of system 400). Before the received data is sent to joint detection system 550, the I/Q samples collected from a time slot can be pre-processed by DC removal component 530 and I/Q compensation component 540: DC removal component 530 can perform DC offset correction, and I/Q compensation component 540 can perform I/Q phase imbalance correction. In some embodiments, joint detection system 550 includes a JDA and a programmable DSP, and the programmable DSP allows one or more joint detection processing operations to be customized in software. In one embodiment, the DSP can perform pre-processing operations before data is sent to the JDA. As discussed below, the pre-processing operations performed by the DSP can include channel estimation and/or training sequence interference cancellation.

Fig. 6 is a block diagram 600 of joint detection processing according to one embodiment. The joint detection processing shown in block diagram 600 can be performed by a joint detection system (such as joint detection system 550 of the inner receiver chain of Fig. 5). The operations of the joint detection processing can be performed jointly by a JDA and a programmable DSP.

Joint detection processing can begin with receiving a signal that has undergone DC offset correction, I/Q phase imbalance compensation, and/or any other pre-processing. The received signal can include two data fields separated by a training sequence. Fig. 7 shows a time slot format 700 of the received signal, which includes a first data field 710, followed by a training sequence field 720, followed by a second data field 730, followed by a guard period 740. Because of the spreading caused by the over-the-air propagation channel, data in the tail of data field 710 interferes with training sequence 720, and data in the tail of training sequence 720 interferes with second data field 730, which yields data field 712 (r1) and data field 722 (r2).

Data split operation 610 of the joint detection processing shown in Fig. 6 can process the received signal to split it into separate signals: the two data fields (r1 and r2) and the training sequence. The joint detection processing can process data fields r1 and r2 sequentially in any desired order, where the processing of one data field (for example, r2) can reuse some of the results obtained from processing the other data field (for example, r1), as described below. Accordingly, the description below may refer to processing of data field r1 and/or r2. In some embodiments, data field r2 is processed before data field r1. Such an embodiment may be preferable when the second data field contains command instructions, such as synchronization and power control instructions, which can then be handled by a processing flow that treats data field r2 first and data field r1 afterwards.

Channel estimation operation 615 can process the training sequence provided by data split operation 610 and generate channel estimate matrix H and code matrix C, such as matrix H and matrix C shown in Fig. 3. As is known, channel estimation can use a known signal pattern (for example, the training sequence signal) to estimate the over-the-air propagation channel from the base station to the receiver. If smart antennas are used, each code channel of the TD-SCDMA scheme can be associated with a different propagation channel. The channel estimation results can be used to make a rough estimate of the number of active codes, but it is prudent to overestimate the number of active channels, to avoid designating an active code as inactive. Ultimately, the active code detection of the joint detection processing can provide a better determination of which codes are active. The output of channel estimation operation 615 can include matrix H and matrix C.

Training sequence interference cancellation operation 620 can process the data fields output by operation 610 by performing training sequence interference cancellation, which removes the effect of training sequence interference on the data fields. The cancellation can involve using the channel estimates from channel estimation operation 615. The output of operation 620 can be data fields that have undergone training sequence interference cancellation. In one embodiment, data splitting, training sequence interference cancellation, and/or channel estimation are performed by the DSP. This allows one or more of these operations to be customized without changing the receiver chipset.

In one embodiment, the results of the pre-processing performed by the DSP are then sent to the JDA for front-end processing. The JDA can receive matrix H and matrix C sent by the DSP (for example, via an external co-processor interface, as shown in the system of Fig. 4) and construct matrix V in operation 625. Constructing matrix V involves using channel estimate matrix H and code matrix C. The i-th column of matrix V is the convolution of the i-th column of matrix H with the i-th column of matrix C. As described below, in some embodiments the JDA can shift the results before saving them to memory.

In addition, operation 630 of the JDA can perform the matched filter computation, constructing the matched filter output y = T^H r, where r is r1 and/or r2. The matched filter operation can use matrix V and vector r to construct y; because many elements of matrix T^H are zero, as shown in Fig. 3, the entire matrix T^H need not be constructed. Matched filter operation 630 can receive matrix V constructed by operation 625. In addition, matched filter operation 630 can receive data field r1 and data field r2 from training sequence interference cancellation operation 620. As described below, in some embodiments the JDA can shift the results before saving them to memory.
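Because each column of T is a shifted copy of a column of V, the matched filter output can be computed directly from V. The sketch below assumes V is stored as a list of columns and that the copy of V for symbol n starts at row n*Q of T; the function name and toy values are illustrative.

```python
def matched_filter(V, r, Q, N):
    """Compute y = T^H r using only V; entry n*K + k correlates column k of V with r at offset n*Q."""
    K = len(V)
    y = [0j] * (N * K)
    for n in range(N):
        for k in range(K):
            y[n * K + k] = sum(V[k][i].conjugate() * r[n * Q + i]
                               for i in range(len(V[k])))
    return y


# Toy example with Q = 2, W = 2, K_a = 1, N = 3: r is the noise-free T d for d = [1, 0, -1].
V = [[1, 2, 3]]
r = [1, 2, 3, 0, -1, -2, -3]
y = matched_filter(V, r, Q=2, N=3)
```

Each output entry needs only W+Q-1 multiply-accumulate operations, rather than one per row of T.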

The JDA can also compute the power of y and of each column of matrix V in operation 635, and these values can in turn be used for active code detection. Computing the power of each column of matrix V involves summing the squared magnitudes of the elements of matrix V in the given column. Because vector y1 is sufficient for the purposes of active code detection, the power of y can be computed for y1 without performing the computation for y2. In some embodiments, the JDA can shift the resulting power values before saving them to memory.
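The column power computation amounts to one sum of squared magnitudes per column. A minimal sketch, assuming V is stored as a list of columns (the function name and sample values are illustrative):

```python
def column_powers(V):
    """Return the power of each column of V as the sum of squared magnitudes of its entries."""
    return [sum(abs(v) ** 2 for v in col) for col in V]


V = [[1, 1.5, 1.5, 1.5, 0.5], [1j, -1j, 1j, -1j, 0]]
powers = column_powers(V)
```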

In one embodiment, active code detection is performed by the DSP. In operation 640, the DSP can receive from the JDA the computed power of y and (optionally) of matrix V, and use these power values to determine the active codes and a scaling factor for each channelization code. Performing active code detection in the DSP allows the active code detection algorithm to be customized. The DSP software can be modified as active code detection evolves, and the same chipset can be used to implement the modified active code detection.

Any suitable algorithm can be used to perform active code detection. For example, an active code detection process can involve determining the codes for which the power of the matched filter output (y) exceeds a threshold level. It should be understood that this is only one example of a simple active code detection process, and any active code detection process can be used. Active code detection can also determine the scaling factor to be applied to each channelization code. The scaling factor for each channelization code can be represented by a mantissa value and an exponent value. It should be understood that active code detection need not be performed and can be bypassed in particular cases, for example when a spreading factor of " " is used, or when the user equipment already has an indication of which codes are active in a given time slot.
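The simple threshold rule and the mantissa/exponent representation can be sketched as follows. The threshold value, the per-code power list, and the use of a base-2 exponent are assumptions for illustration; the text does not fix the detection algorithm or the number format.

```python
import math


def detect_active_codes(y_power, threshold):
    """Return the indices of the codes whose matched filter output power exceeds the threshold."""
    return [k for k, p in enumerate(y_power) if p > threshold]


def to_mantissa_exponent(scale):
    """Split a positive scaling factor into (mantissa, exponent) with scale = mantissa * 2**exponent."""
    return math.frexp(scale)


# Made-up per-code powers of y; codes 0 and 2 stand out above the threshold.
active = detect_active_codes([9.0, 0.2, 4.5, 0.1], threshold=1.0)
mantissa, exponent = to_mantissa_exponent(3.0)
```

A mantissa/exponent pair lets a fixed-point unit apply the scaling as one multiplication by the mantissa followed by a shift by the exponent.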

The results of the active code detection operation can be used by SIR estimation operation 655, which can also be performed by the DSP. SIR estimation operation 655 can use the results of channel estimation operation 615 and active code detection operation 640. The SIR estimation operation can output the noise power σ². It should be understood that, in some embodiments, SIR estimation can be performed without using the results of active code detection. In that case, SIR estimation can be performed after channel estimation, and the SIR estimate can be sent by the DSP to the JDA before the JDA performs front-end processing. Alternatively, SIR estimation can be performed by the DSP at least partly while the JDA performs front-end processing.

In some embodiments, an indication of the active codes and scaling factors determined by the active code detection performed by the DSP, and/or the noise computed by the DSP, is sent to the JDA for back-end processing. JDA back-end processing can include rescaling and reordering operation 645 for y and rescaling and reordering operation 650 for V. These operations can reorder and rescale the columns of y and of matrix V according to the results sent by active code detection operation 640, where the reordering removes any columns corresponding to inactive codes. As a result of this reordering, JDA back-end processing can use the same matrix indexing regardless of which codes are active.
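Rescaling and reordering can be pictured as selecting the active columns and multiplying each by its scaling factor. In the sketch below, V is a list of columns, `active` is the list of active code indices, and `scales` maps a code index to its scaling factor; these names and values are hypothetical.

```python
def rescale_and_reorder(V, active, scales):
    """Keep only the active columns of V, in the given order, each multiplied by its scaling factor."""
    return [[scales[k] * v for v in V[k]] for k in active]


V = [[1, 2], [3, 4], [5, 6]]  # three candidate codes
V_active = rescale_and_reorder(V, active=[0, 2], scales={0: 0.5, 2: 2.0})
```

After this step the columns are indexed 0, 1, ... regardless of which codes were active, which matches the fixed indexing described above.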

The back-end processing performed by the JDA may also include matrix A computation operation 660. Operation 660 receives the rescaled and reordered matrix V generated by operation 650 and the noise generated by operation 655, and constructs matrix A by evaluating the matrix operation $T^H T + \sigma^2 I$. Because the elements of matrix A can be computed directly from matrix V, and because many elements of matrix T are zero, constructing matrix A does not necessarily involve constructing matrix T. Therefore, the computation may be performed only for the non-zero elements of matrix A, and only those non-zero elements may be stored (for example, known zero elements need not be stored). In some embodiments, the JDA may shift the resulting values of matrix A before saving them to memory.

The JDA back-end processing may also include Cholesky decomposition operation 665, which may decompose matrix A into matrix L and matrix D. The Cholesky decomposition may be performed without computing all elements of matrix L. Matrix L may be divided into a number of blocks that converge in value, and the number of blocks computed depends on the desired accuracy. In one implementation, the number of blocks of matrix L that are computed is 2. Reducing the number of blocks used for matrix L reduces the number of divisions in the dot division calculation, which facilitates implementation of the joint detection algorithm. Therefore, the computation may be performed for only a subset of the non-zero elements of matrix L, and only those non-zero elements may be stored (for example, known zero elements need not be stored).

The JDA back-end processing may also include linear equation solver operation 670, which solves the linear equation $Ax = y$ (for example, using forward substitution, dot division, and backward substitution, as described above). Linear equation solver operation 670 may receive the data fields from y rescaling and reordering operation 645, and may receive matrix L and matrix D from Cholesky decomposition operation 665. Linear equation solver operation 670 may generate data fields (x1 and x2). In some embodiments, the JDA may shift the results of the forward substitution, dot division, and/or backward substitution processes before saving them to memory.

Data fields x1 and x2 may be processed by user extraction operation 675, which may extract the data of a particular UE using the code(s) that the UE is using. Linear equation solver operation 670 may generate the two data fields x1 and x2 in succession, and user data extraction block 675 may also combine the two data fields into a single unified data field x, which data extraction operation 675 then outputs for processing by other components. For example, the DSP may perform post joint detection processing. If codes other than the UE codes are also needed (for example, codes used for power measurements), the other codes may be included in the output.

In some embodiments, the JDA may be realized as a fixed-point implementation in which the result of an operation (for example, in an accumulator) is shifted before a reduced number of bits, located at fixed bit positions of the accumulator, is saved to memory. This operation is equivalent to selecting which bits of the accumulator are saved to memory without having to shift the accumulator contents. The shift value and the fixed bit positions of the accumulator to be saved to memory may be selected to ensure that the value in the accumulator is adequately represented in memory (for example, an accurate value without any significant bit clipping).

In one embodiment, the JDA includes a memory component that stores variables as signed N-bit fractions. The value of a stored variable therefore lies between -1 and +1, including -1 but excluding +1. Alternatively, because the techniques described herein are not limited to fractions, the JDA variables in memory may be signed N-bit integers. When an operation is performed on two or more stored variables in the JDA, the result of the operation (for example, as held in an accumulator) may not fall within the range of the variables stored in the JDA memory described above. The shifting techniques described herein allow values to be stored using a desired bit width.
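The signed N-bit fraction format described above can be sketched as follows. This is a minimal illustration only; the function names `to_fraction_word` and `from_fraction_word` and the saturating behaviour at the range limits are assumptions made for the example, not details taken from the described JDA.

```python
def to_fraction_word(x: float, n_bits: int = 11) -> int:
    """Quantize x to a signed n_bits fraction word, saturating to [-1, +1)."""
    scale = 1 << (n_bits - 1)        # one sign bit, n_bits - 1 fractional bits
    word = round(x * scale)
    # Clamp to the representable range: -1 is included, +1 is not.
    return max(-scale, min(scale - 1, word))


def from_fraction_word(word: int, n_bits: int = 11) -> float:
    """Interpret a signed n_bits word as a fraction in [-1, +1)."""
    return word / (1 << (n_bits - 1))
```

With N = 11, the value -1 maps to the word -1024, while +1 cannot be represented and is clamped to 1023 in this sketch.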

It should be appreciated that many operations in the JDA are multiplication and/or summation operations, such as the operation $c_j = \sum_i a_i b_i$. The JDA may perform such an operation as a multiply and accumulate (MAC) operation using an accumulator whose data bit width is significantly greater than that of the memory in which the final result of the MAC operation will be saved, thereby maintaining high precision during the MAC process. After the operation (such as a MAC) is complete, a subset of the accumulator bits may be saved to memory. Selecting which accumulator bits are saved to memory may involve shifting the accumulator contents by a shift value and saving the bit values at the fixed bit positions of the accumulator to memory.

Fig. 8 is a diagram of an accumulator, and of the subset of accumulator bits saved to memory after a shift operation, according to one embodiment. The bit values shown in diagram 800 are for illustrative purposes only, and the techniques shown are not limited in this respect. Accumulator 810 may include more bits than are stored in data memory location 820 after an operation (for example, a MAC) is complete. Accumulator 810 may include any number of bits (for example, 28), which may include a sign bit, and the number of bits N of the result stored in memory may be any number smaller than the number of accumulator bits (for example, 11), which may also include a sign bit. It should be appreciated that the above accumulator and data bit widths are only examples, and the techniques described herein are not limited in this respect. Furthermore, the accumulator size may be selected, based on the size of the data being operated on, to be large enough to ensure that there is no significant loss of precision.

The number of bits of accumulator 810 that are saved to memory may be selected based on the desired memory data bit width N. In addition, the particular location in accumulator 810 of the fixed bits to be saved to memory may be selected arbitrarily, because the shift applied to the accumulator contents before the result is saved to memory can be adjusted accordingly based on the selected fixed bit positions. In Fig. 8, the accumulator bit values to be saved to memory are those within rectangle 840.

The data being operated on and the operation being performed may be such that accumulator decimal point 830 lies between two specific bit positions 831 and 832 of accumulator 810, which may be the case when the data being operated on are signed fractions. The data bits of accumulator 810 that are to be saved to memory (i.e., the bits within rectangle 840) have been selected such that bit position 831 contains the leftmost bit to be saved to memory.

After an operation (for example, a MAC) is complete and the result resides in accumulator 810, the bit values to be saved to memory are selected based on a shift applied to the bit contents of accumulator 810. Fig. 8 shows a shift being applied to the contents of accumulator 810 so as to shift the accumulator bit values, as shown in the resulting accumulator 810'. Accumulator 810' is the same as accumulator 810 after the bit values in accumulator 810 have been shifted right by shift value S. As described below, the shift value is a signed integer and may be determined or set in any suitable manner. A positive shift value S (where S is a positive integer) may be associated with shifting the accumulator bits to the left. A negative shift value -S (where S is a positive integer) may be associated with shifting the accumulator contents to the right. It should be appreciated that the sign of the shift value is arbitrary and depends on convention, and the techniques described herein are not limited in this respect.

As shown in Fig. 8, the shift value is such that the shift operation moves the first sign bit value, at bit position 833, into accumulator bit position 831. After the accumulator contents have been shifted, the bit values at the fixed bit positions of the accumulator, shown by rectangle 840, are saved to memory. In the illustrative example of Fig. 8, the shift value is selected such that the repeated sign bits of the binary number in the accumulator are not stored in memory, which is referred to herein as normalization. It should be appreciated that the example shown in Fig. 8 is for illustrative purposes only, and values may be shifted by any amount. In some embodiments, the values stored in memory are signed N-bit fractions, and the shift value applied to the accumulator contents is such that the shifted contents located to the left of the accumulator decimal point include only repeated sign bits (for example, the bits to the left of bit position 833 in accumulator 810).
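Normalization in the sense used above depends on knowing how many repeated sign bits a two's-complement value carries. The following is a minimal sketch of one common way to count them, assuming the accumulator value is modelled as a Python integer; the function name `redundant_sign_bits` and the `width` parameter are hypothetical and do not reflect the bit layout of Fig. 8.

```python
def redundant_sign_bits(value: int, width: int = 28) -> int:
    """Count the repeated sign bits of a two's-complement value.

    `value` must fit in `width` bits. The result is the number of copies
    of the sign bit beyond the single sign bit that must be kept.
    """
    if value < 0:
        value = ~value               # negative values mirror onto non-negative ones
    return width - 1 - value.bit_length()
```

Shifting the value left by the returned count removes the repeated sign bits without changing the sign, which is one way a normalizing shift could be chosen.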

Fig. 9 is a flow chart of process 900, by which the result of an operation held in an accumulator may be stored to memory. Process 900 may be performed by hardware in the JDA; in the case of a MAC operation, this hardware may include a multiply-accumulator and one or more shift registers. In action 902, an operation (for example, a MAC) is performed and the result is stored in the accumulator. After the operation is complete, in action 904, the accumulator contents are shifted by a shift value. The shift operation is equivalent to multiplying or dividing the accumulator contents by $2^{SHIFT}$, where SHIFT is the shift value. The shift operation may be performed by an output shift register, and the shift value used may be determined internally by the JDA or may be provided by a system external to the JDA (for example, a programmable DSP, where it may be specified by a user).

In action 908, the accumulator contents are rounded in anticipation of storing a subset of the bits to memory. Rounding may be performed by rounding up (or rounding down) at the last bit of the subset of accumulator bits to be stored to memory. It should be appreciated, however, that rounding may be performed in any other suitable manner, as the techniques described herein are not limited in this respect. In action 910, an overflow check may be performed, as known to those skilled in the art, to determine whether the rounding caused an overflow. If it is determined that an overflow occurred, the accumulator contents to be saved to memory are saturated (action 912). Saturation involves setting the value to be stored to memory to the largest positive N-bit number or the smallest negative N-bit number.

If it is determined (in action 910) that no overflow occurred, or if an overflow occurred and the value was saturated accordingly, the process continues to action 914, in which the N consecutive bits located at the specified fixed position of the accumulator are saved to memory, where N is less than the total number of accumulator bits. The number of bits N saved to memory, and the position of the fixed accumulator bits that are saved to memory, may be specified by the hardware designer when designing the JDA. Process 900 may then terminate.
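The shift, round, saturate, and store sequence of process 900 can be sketched in software as follows. This is an illustrative model under stated assumptions, not the JDA hardware: the accumulator is modelled as an unbounded Python integer, the hypothetical parameter `drop_bits` stands in for the fixed bit position by giving the number of accumulator bits that lie below the stored field, and round-to-nearest is assumed as the rounding mode.

```python
def store_from_accumulator(acc: int, shift: int, drop_bits: int,
                           n_bits: int = 11) -> int:
    """Shift, round, saturate and keep an n_bits signed word.

    A positive `shift` shifts left and a negative `shift` shifts right
    (the sign convention is arbitrary).
    """
    # Shift step: equivalent to scaling by a power of two.
    shifted = acc << shift if shift >= 0 else acc >> -shift

    # Rounding step: add half of the last kept bit, then drop the low bits.
    if drop_bits > 0:
        shifted = (shifted + (1 << (drop_bits - 1))) >> drop_bits

    # Overflow check and saturation to the signed n_bits range.
    largest = (1 << (n_bits - 1)) - 1
    smallest = -(1 << (n_bits - 1))
    return max(smallest, min(largest, shifted))
```

For example, an accumulator value of 11 with two low bits dropped and no shift corresponds to 2.75 in units of the stored field and is stored as 3, while a value too large for 11 bits is stored as 1023.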

In some embodiments, different variables may have different associated shift values. In some embodiments, every element of a vector or matrix stored in memory is assigned the same shift value. In other embodiments, different columns or different rows of a matrix are assigned different shift values. Allowing different shift values for different columns or rows of a matrix can improve accuracy, because the shift value of each column or row can be tailored to the values in that column or row.

It should be appreciated that an addition performed by the JDA may involve two or more stored variables that may have been stored using different shifts. Stored variables associated with different shift values can be viewed as stored mantissa values having different exponents. In such an operation, the JDA may ensure that the shifts of all variables being added are the same before performing the addition. For example, when performing an operation such as $c_j = d_j + \sum_i a_i b_i$, the JDA may determine whether one or more of the vector elements were shifted before being stored in memory. If one or more of the vectors were shifted, the JDA may ensure that all vectors have the same shift value before performing the addition. For example, if vector a was shifted by shift value A_SHIFT before being saved to memory, vector b was shifted by shift value B_SHIFT, and vector d was shifted by shift value D_SHIFT, then the JDA may shift the elements of vector d retrieved from memory by the shift value A_SHIFT+B_SHIFT-D_SHIFT before adding them to the summation result $\sum_i a_i b_i$. This operation can be expressed mathematically as $(d_j \ll (\mathrm{A\_SHIFT} + \mathrm{B\_SHIFT} - \mathrm{D\_SHIFT})) + \sum_i a_i b_i$, where the operator "$\ll$" denotes a shift operation applied to $d_j$. The result of the operation may also be shifted before being stored in a memory location.
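The shift alignment described above can be checked with a small integer model. The sketch below assumes a particular convention that the text leaves open: each stored word equals the true value multiplied by 2 raised to its shift value, so a product of a and b words carries a combined shift of A_SHIFT+B_SHIFT. The function name `mac_with_aligned_offset` is hypothetical.

```python
def mac_with_aligned_offset(a, b, d_j, a_shift, b_shift, d_shift):
    """Compute (d_j << (A_SHIFT + B_SHIFT - D_SHIFT)) + sum_i a_i * b_i.

    All inputs are stored integer words; the result carries a combined
    shift of a_shift + b_shift under the assumed convention.
    """
    align = a_shift + b_shift - d_shift
    # Bring d_j to the same shift as the products before adding.
    d_aligned = d_j << align if align >= 0 else d_j >> -align
    return d_aligned + sum(x * y for x, y in zip(a, b))


# True values: a = [0.5, 0.25], b = [1.0, 2.0], d = 3.0
# Stored words with A_SHIFT = 2, B_SHIFT = 1, D_SHIFT = 0:
acc = mac_with_aligned_offset([2, 1], [2, 4], 3, 2, 1, 0)
true_value = acc / 2 ** (2 + 1)   # remove the combined shift
```

Here the true result is 0.5·1.0 + 0.25·2.0 + 3.0 = 4.0, which matches the accumulator value once the combined shift is removed.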

Fig. 10 is a block diagram of hardware architecture 1000 in the JDA, which may implement the shift processing for an operation that adds a variable d to a summation result $\sum_i a_i b_i$, as described above. In one embodiment, the number of bits N used to store variables a, b, and d is 11 and the number of accumulator bits is 28, although other data bit widths may be used, as the techniques described herein are not limited in this respect. Hardware 1000 may include input shift register 1008, which shifts the d value before it is loaded into accumulator 1006. If vector a was shifted by shift value A_SHIFT before being saved to memory, vector b was shifted by shift value B_SHIFT, and vector d was shifted by shift value D_SHIFT, then the shift value used by shift register 1008 may be A_SHIFT+B_SHIFT-D_SHIFT.

Hardware 1000 may also include multiplier 1002 for multiplying input value $a_i$ by input value $b_i$, and adder 1004 for adding the contents of accumulator 1006 to the product of $a_i$ and $b_i$ provided by multiplier 1002. Accumulator 1006 may include a number of bits A, which may be greater than the number of bits N of the input data. After the multiply and accumulate operation is complete, the value in accumulator 1006 may be shifted by shift value C_SHIFT by output shift register 1010, and a subset of the accumulator bits may be saved to memory. As described above, the subset of accumulator bits may include N bits at fixed positions in the accumulator. Shift value C_SHIFT may be selected such that the significant bits of the resulting value in the accumulator are saved to memory. By reducing the number of bits used to store intermediate values computed during joint detection processing, the JDA may achieve a desired speed, memory area, and/or power consumption.

In some embodiments, one or more shift values used in the JDA are set by a system external to the JDA (for example, by a programmable DSP). The external system may include a programmable DSP that allows a designer to program software that determines the shift values for one or more variables stored by the JDA. Alternatively or additionally, the designer may set fixed shift values (for example, via the programmable DSP), which are then provided to the JDA. The determination of shift values by the DSP may involve the results of processing performed in connection with the front-end processing of the JDA. For example, based on the results of the active code detection process, which may be performed by the DSP, the DSP may determine shift values for one or more variables stored by the JDA. This may be desirable because the selection of shift values for the JDA back-end processing may depend on the number of additions in the summation operations, which is in turn related to the number of active codes. Therefore, the shift values may be determined by the DSP based at least in part on the results of the active code detection process and subsequently sent to the JDA.

In some embodiments, one or more shift values used by the JDA are determined internally by the JDA. Internal determination by the JDA of one or more shift values may involve analysis of the data results to be stored in memory. In the case of matrix A, because matrix V is stored internally in the JDA, it is relatively difficult to compute the maximum possible output shift in software, and it is therefore desirable for the JDA to determine the shift value of matrix A. In some embodiments, the maximum possible output shift for matrix A may be computed internally by the JDA. In that case, the maximum possible output shift may be determined based on the largest element of matrix A. The largest element of matrix A lies on the diagonal, because each diagonal element of matrix A represents the autocorrelation of a code channel (plus noise), and the autocorrelation of each code channel is larger than its cross-correlation with any other code channel. To determine the maximum possible output shift for matrix A, each diagonal element of matrix A may be computed, and the maximum possible shift value of the largest element may be used as the maximum possible output shift for all elements of matrix A. It should be appreciated that the automatic internal determination of the shift value may be performed using only a small number of cycles compared with the total number of cycles used for the complete joint detection process.
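Determining a single output shift from the diagonal of matrix A, as described above, can be sketched as follows. The sketch assumes the diagonal elements are available as non-negative integer accumulator values and that the goal is the largest left shift that keeps the largest element within a signed N-bit word; the function names are hypothetical.

```python
def max_left_shift(value: int, n_bits: int = 11) -> int:
    """Largest left shift that keeps a non-negative value within n_bits signed."""
    limit = (1 << (n_bits - 1)) - 1
    shift = 0
    while value != 0 and (value << (shift + 1)) <= limit:
        shift += 1
    return shift


def shift_for_matrix(diagonal, n_bits: int = 11) -> int:
    """Use the shift allowed by the largest diagonal element for all elements."""
    return max_left_shift(max(diagonal), n_bits)
```

Because the largest element is on the diagonal, a shift that is safe for it is also safe for every other element, so only the diagonal needs to be examined.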

The storage of other variables in the JDA may also benefit from internal determination of shift values by the JDA. For example, the results of the dot division process, to which a shift value should be applied before they are stored in memory, may also benefit from internal determination of the shift value. As described above, a dot division, such as the f./D operation of the linear equation solver operation, involves multiplying fractions by the inverse of diagonal matrix D. Because the elements of D are positive fractions, the result of the dot division may not be a fraction. An internally derived shift of the inverse diagonal elements of matrix D (i.e., $1/d_{ii}$) may be used to ensure that the result of the dot division is also a fraction. In some embodiments, a single shift value is used for all elements of the matrix, which may minimize computational complexity and memory area. In such embodiments, the single shift value may be determined by finding the maximum possible shift value of the largest element of the matrix and then using that shift value for all elements of the matrix.

The division by each element of matrix D may be performed in multiple parts. First, each diagonal element of matrix D may be normalized by a shift process, as known in the art, in which each element is shifted by a shift value that eliminates the repeated sign bits, thereby determining the mantissa and exponent of each diagonal element. The shift value applied to all elements may be the same, to reduce computational complexity, or may differ, as the techniques described herein are not limited in this respect. The normalized value of element $d_{ii}$ of matrix D (referred to as normalized($d_{ii}$)) is less than 1 and greater than or equal to 0.5. Therefore, each value 0.5/normalized($d_{ii}$) is a fraction greater than 0.5 and less than or equal to 1. The value 0.5/normalized($d_{ii}$) may be computed in an intermediate divider having a plurality of bits (for example, 21), and a reduced number of these bits may then be saved to memory in the form of a mantissa (for example, an 11-bit value) and an exponent (for example, a 5-bit value). In addition, the maximum exponent of the values 0.5/normalized($d_{ii}$) may be determined and used as the shift value applied before the result of the dot division operation g = f./D is stored; it should be appreciated that the maximum exponent may be used as the shift value for all elements of vector g.
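The normalization and reciprocal steps above can be illustrated in floating point. This is a sketch only, not the fixed-point divider: it relies on `math.frexp`, which returns a mantissa in [0.5, 1) and an exponent, matching the normalized range described for $d_{ii}$; the function names and the way the exponent of the reciprocal is expressed are assumptions of the example.

```python
import math


def reciprocal_mantissa_exponent(d: float):
    """Return (r, e) with 1/d == r * 2**e and 0.5 < r <= 1, for d > 0."""
    m, exp = math.frexp(d)        # d == m * 2**exp with 0.5 <= m < 1
    return 0.5 / m, 1 - exp       # 1/d == (0.5/m) * 2**(1 - exp)


def common_shift(diagonal):
    """Largest reciprocal exponent, usable as one shift for all elements."""
    return max(reciprocal_mantissa_exponent(d)[1] for d in diagonal)
```

For d = 0.25 the normalized value is 0.5, so the reciprocal mantissa is 1.0 with exponent 2, giving 1/d = 4.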

In some embodiments, the designer may select whether the shift value for a given variable is determined internally by the JDA or set by a system external to the JDA (for example, by a programmable DSP). The designer may, for example, set a bit variable via the programmable DSP in communication with the JDA, where the bit variable indicates whether the JDA should use a shift value for the given variable that is determined internally by the JDA (for example, as described above for matrix A and matrix 1./D) or one that is set by a system external to the JDA (for example, programmed into the DSP by the designer). This allows the designer to select which variables are stored using internally determined shifts and which are stored using shifts determined or set by an external source (for example, the programmable DSP). This approach provides flexibility: for some variables, the designer can select which bits are significant through externally determined shift values (for example, shift values programmed by the designer), while the JDA can use the results of intermediate processing to determine the shift values of other variables internally. Shift values set by a system external to the JDA may be computed using intermediate results provided to the external system, or may be fixed shift values provided by the designer.

It should be appreciated that one or more of the techniques for using, determining, and/or setting shifts in a fixed-point implementation of a JDA may be used alone or in combination with other techniques described herein. The shifts may be used in a JDA that communicates with a programmable DSP to perform one or more processing operations, such as one or more intermediate processing operations, but the techniques for shifting variables in a JDA may also be used by JDAs that do not have all of the features described herein (for example, a JDA that does not necessarily use a DSP to perform intermediate processing operations).

In some embodiments, the joint detection algorithm may include performing a pre-scaling operation on one or more propagation channel estimates (for example, one or more columns of matrix H of Fig. 3) before the channel estimates are sent to the JDA. The pre-scaling operation may be included in the channel estimation operation and performed once the initial channel estimation is complete and before the channel estimates are output. The pre-scaling operation may be performed in the programmable DSP that also performs the initial channel estimation processing. Pre-scaling one or more propagation channel estimates before the channel estimates are sent to the JDA may allow improved accuracy in a fixed-point implementation of the JDA.

Fig. 11 shows flow chart 1100 of a pre-scaling process for propagation channel estimates. The process may begin at action 1102, in which one or more pre-scaling factors are determined. These pre-scaling factors are to be applied to the propagation channel estimates, which may have been determined by the initial channel estimation process. The pre-scaling factor may differ for each propagation channel, although it should be appreciated that the technique is not limited in this respect. The one or more pre-scaling factors may be determined in any suitable manner.

The pre-scaling factor of a propagation channel estimate may be selected based at least in part on the largest element of the propagation channel estimate and/or the power of the propagation channel estimate. Pre-scaling factors may be selected to achieve various objectives, including but not limited to: (1) scaling the propagation channel estimates so that, after pre-scaling, their maximum absolute value elements have the same exponent; (2) scaling the propagation channel estimates so that, after pre-scaling, they have substantially the same maximum absolute value element (for example, their maximum absolute value elements have the same exponent and the same mantissa absolute value); or (3) scaling the propagation channel estimates so that, after pre-scaling, they have substantially the same power.

In one embodiment, the pre-scaling factors may be selected to ensure that the exponent of the maximum absolute value element of each channel estimate is the same after pre-scaling. When the channel estimates include complex entries, the maximum absolute value element may be chosen as the maximum of the set comprising the absolute values of the real parts and the absolute values of the imaginary parts of those entries. In this case, a complex entry consists of two real elements, namely its real part and its imaginary part. Suppose the initial propagation channel estimates are given by

$$
\begin{aligned}
h_1 &= [h_1(0),\ h_1(1),\ \ldots,\ h_1(w-1)] \\
h_2 &= [h_2(0),\ h_2(1),\ \ldots,\ h_2(w-1)] \\
&\ \ \vdots \\
h_{K_a} &= [h_{K_a}(0),\ h_{K_a}(1),\ \ldots,\ h_{K_a}(w-1)]
\end{aligned}
$$

where $h_1, h_2, \ldots, h_{K_a}$ are the initial propagation channel estimates (the columns of matrix H) and each initial propagation channel estimate is a vector with w complex entries. Each complex entry $h_i(j)$ consists of a real element real($h_i(j)$) and an imaginary element imag($h_i(j)$). Therefore, as described herein, the maximum absolute value element of a given propagation channel estimate $h_i$ (also referred to as the maximum absolute value of the plurality of values constituting the initial propagation channel estimate) can be expressed as the maximum of the set given by {abs(real($h_i(j)$)), abs(imag($h_i(j)$)), j = 0, ..., w-1}.

In another embodiment, the maximum absolute value element of a given propagation channel estimate (for example, a given column of matrix H) is determined, and the pre-scaling factor for that channel may be set to the inverse of the maximum absolute value element, thereby ensuring that, after pre-scaling, the elements of the given propagation channel are less than or equal to 1 (for example, fractions). The pre-scaling factor of each propagation channel estimate may be represented using a separate mantissa and exponent.

In another embodiment, the power of each propagation channel estimate is determined (for example, the squared norm of each column of matrix H), and each channel may be scaled by a pre-scaling factor so that all channels have substantially the same power after scaling. Accordingly, the pre-scaling factor of each propagation channel estimate may be selected to be the inverse of the power of that propagation channel estimate.

In action 1104, each propagation channel estimate is pre-scaled using the pre-scaling factors determined in action 1102. The pre-scaled propagation channel estimates and the corresponding pre-scaling factors may then be provided to the JDA.
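Actions 1102 and 1104 can be sketched for the first objective listed earlier, namely giving the maximum absolute value element of every channel estimate the same exponent. The sketch below is an assumption-laden illustration: it uses floating point, restricts each pre-scaling factor to a power of two, and the function names `peak_component` and `prescale_columns` are hypothetical.

```python
import math


def peak_component(h):
    """Largest absolute real or imaginary part among the entries of h."""
    return max(max(abs(z.real), abs(z.imag)) for z in h)


def prescale_columns(columns):
    """Scale each channel estimate so its peak component lies in [0.5, 1)."""
    scaled, factors = [], []
    for h in columns:
        _, exp = math.frexp(peak_component(h))  # peak == m * 2**exp, 0.5 <= m < 1
        p = 2.0 ** -exp                         # power-of-two pre-scaling factor
        factors.append(p)
        scaled.append([z * p for z in h])
    return scaled, factors


cols = [[4 + 1j, 2 - 2j], [0.1 + 0.05j, 0.02j]]
scaled_cols, prescale_factors = prescale_columns(cols)
```

The factors are returned alongside the scaled estimates because their effect has to be accounted for later in the processing chain.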

Fig. 12 shows an illustrative example of pre-scaling the initial propagation channel estimates using a pre-scaling factor for each propagation channel, according to one embodiment. The initial channel estimates may be represented by the columns of matrix H, as shown by matrix H 1210, which includes one column for each channel, where each column of matrix H has length W. Each column of matrix H therefore includes W entries (for example, complex numbers). Fig. 12 shows each initial propagation channel estimate (each column of matrix H) being pre-scaled by a pre-scaling factor $P_i$ to construct pre-scaled matrix H 1250. Each pre-scaling factor may be determined in any suitable manner, for example, using the processing described for action 1102 of the method shown in the flow chart of Fig. 11.

It should be appreciated that the pre-scaling technique may be used alone or in combination with other techniques described herein. Pre-scaling may be used in a JDA that communicates with a programmable DSP to perform one or more processing operations, such as one or more intermediate processing operations, but the pre-scaling technique may also be used in other types of JDA.

Once user data has been extracted, the effect of the pre-scaling factors may be accounted for, either during joint detection processing or upon its completion. In some embodiments, all factors used to scale or shift intermediate results generated during joint detection processing may be accounted for upon completion of the joint detection processing. These factors may include the pre-scaling factors, the scaling factors specified by active code detection, and the shift values used for JDA memory storage. For example, if matrix T has been scaled, the effect of those scaling factors (for example, from active code detection and/or pre-scaling) may be removed from the final output. Additionally or alternatively, any shifts applied to intermediate results generated during joint detection processing may be accounted for by shifting the final output by the negative of the net shift value.

Any suitable hardware architecture may be used to implement one or more of the techniques described herein, as the techniques are not limited in this respect. Fig. 13 is a block diagram 1300 of a JDA architecture according to one embodiment. The JDA architecture of Fig. 13 includes communication bus interface 1320, which allows the JDA to communicate with external components (for example, a programmable DSP) via communication bus 1321. The JDA may include finite state machine 1350, which may control a plurality of hardware blocks to perform a joint detection algorithm, such as the JDA algorithm shown in the block diagram of Fig. 6. The JDA hardware blocks may include data address generator 1303, registers 1302, joint detection memory 1304, first multiply-accumulator 1306 (for example, a complex multiply-accumulate unit), second multiply-accumulator 1307 (for example, a complex multiply-accumulate unit), and divider unit 1308. Registers 1302 may store configuration and status information, and joint detection memory 1304 may store the data values and parameters used in the joint detection processing algorithm. Input multiplexer 1314 may be included in the architecture and configured to direct input data to first multiply-accumulator 1306, second multiply-accumulator 1307, or divider unit 1308. Output multiplexer 1316 may be configured to direct the results of the performed operations back to joint detection memory 1304.

As shown in the architecture of Figure 13, the JDA may include multiple data paths that perform different types of operations. The JDA architecture shown in the block diagram of Figure 13 includes three data paths: a primary data path, a secondary data path, and a divider 1308 data path. The primary data path includes the complex multiply-accumulate unit 1306, and the secondary data path includes the simplified complex multiply-accumulate unit 1307.

The primary data path, which includes the multiply-accumulator 1306, may perform operations such as $\sum_i a_i b_i + d_j$. The illustrated primary data path includes an input shift register 1310 and an output shift register 1312, which can multiply or divide values by a power of two (i.e., multiply by $2^{\mathrm{SHIFT}}$). The input shift register 1310 may be used to shift the bits of the data input value $d_j$ that is added to the accumulator, and the output shift register 1312 may be used to shift the bits of the accumulator value before a subset of those bits (e.g., a fixed range of bit positions in the accumulator) is stored in the joint detection memory 1304. In some embodiments, the primary data path may be used to perform multiply-accumulate operations other than the computation of matrix V. It should be appreciated that, although the multiply-accumulator 1306 is shown with a multiplier 1361 and an adder 1362, the complex multiply-accumulate unit 1306 may, as is known to those skilled in the art, include four multipliers and two adders for computing the real and imaginary parts of the product of two complex numbers.
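As a rough illustration of the primary data path, the following Python sketch accumulates complex products onto a shifted input value and then shifts the result. It uses floating-point numbers for readability, whereas the hardware described here is fixed-point; the function names and the `in_shift` and `out_shift` parameters are hypothetical.

```python
def complex_multiply(a, b):
    """Multiply two complex numbers with four real multiplies and two add/subtracts."""
    real = a.real * b.real - a.imag * b.imag
    imag = a.real * b.imag + a.imag * b.real
    return complex(real, imag)


def primary_mac(a, b, d, in_shift=0, out_shift=0):
    """Compute (sum_i a[i]*b[i] + d * 2**in_shift) * 2**out_shift."""
    acc = d * 2.0 ** in_shift          # input shift applied to d before accumulation
    for a_i, b_i in zip(a, b):
        acc += complex_multiply(a_i, b_i)
    return acc * 2.0 ** out_shift      # output shift applied before storage


mac_result = primary_mac(
    a=[1 + 2j, 3 - 1j],
    b=[2 + 0j, 1j],
    d=1 + 1j,
    in_shift=1,
    out_shift=-1,
)
```

In this example the shifted input is 2+2j, the two products are 2+4j and 1+3j, and the accumulated 5+9j is halved by the output shift.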

The secondary data path, which includes the simplified complex multiply-accumulate unit 1307, may perform operations such as $\sum_i a_i b_i$, where $b_i$ is +1, -1, +j, or -j. This operation may be performed in the computation of matrix V, which involves the convolution of the rows of matrix H with matrix C. Since the code matrix C may be restricted to elements belonging to the set {+1, -1, +j, -j}, the secondary data path may be used to compute the elements of matrix V. The secondary data path may be configured to perform the $a_i b_i$ multiplication more efficiently by using one or more multiplexers 1309, which may select the real part or the imaginary part of the input value $a_i$ depending on whether $b_i$ is +1, -1, +j, or -j. The accumulation performed by the secondary data path does not necessarily include adding a $d_j$ value to the sum of the $a_i b_i$ products, nor does it necessarily include input shifting of $d_j$. It should be appreciated that an output shift register operating on the accumulator result may be included in the secondary data path so that the output can be shifted. In some embodiments, the secondary data path may be used to compute matrix V.
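The multiplexer-based multiplication can be sketched in Python as follows. Because $b_i$ is limited to +1, -1, +j, or -j, each product reduces to selecting and possibly negating the real and imaginary parts of $a_i$, so no multiplier is needed. Representing values as (real, imaginary) integer pairs and codes as strings is an assumption made for this illustration only.

```python
def multiply_by_unit(a_re, a_im, b):
    """Return (a_re + j*a_im) * b for b in {'+1', '-1', '+j', '-j'}.

    Only selection and negation are used, mirroring a multiplexer.
    """
    if b == "+1":
        return a_re, a_im
    if b == "-1":
        return -a_re, -a_im
    if b == "+j":
        return -a_im, a_re   # (x + jy) * j  = -y + jx
    if b == "-j":
        return a_im, -a_re   # (x + jy) * -j =  y - jx
    raise ValueError(f"unsupported code element: {b!r}")


def secondary_mac(a, codes):
    """Accumulate sum_i a[i] * codes[i] without any multiplications."""
    acc_re, acc_im = 0, 0
    for (a_re, a_im), b in zip(a, codes):
        p_re, p_im = multiply_by_unit(a_re, a_im, b)
        acc_re += p_re
        acc_im += p_im
    return acc_re, acc_im


v_element = secondary_mac(
    a=[(1, 2), (3, -1), (0, 5), (2, 2)],
    codes=["+1", "-1", "+j", "-j"],
)
```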

The division data path includes the divider 1308 and may be used to compute $1/d_{ii}$ in the Cholesky decomposition. As is known in the art, the division data path may be used to perform a normalized division, in which each diagonal element of matrix D is normalized by shifting it by a shift value, thereby eliminating repeated sign bits. Because the techniques described herein are not limited in this respect, the shift value applied to all elements may be the same, to reduce computational complexity, or may differ. The value 0.5/normalized($d_{ii}$) may be computed in a divider with a larger number of bits (e.g., 20), and a reduced number of those bits may then be stored in the JDA memory 1304 in the form of a mantissa (e.g., an 11-bit value) and an exponent (e.g., a 5-bit value). It should be appreciated that, because the mantissa can be stored in the real part of a complex number and the exponent in the imaginary part, dedicated memory is not necessarily needed to store the mantissas and exponents of the reciprocals of the elements of matrix D.
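The normalized reciprocal can be illustrated with a small Python sketch. It uses `math.frexp` as a stand-in for the normalizing shift and stores 0.5/normalized(d) as an integer mantissa with 10 fractional bits plus an exponent. The bit layout, the rounding, and the function names are assumptions; the source gives only example widths (11 and 5 bits) and not the exact format.

```python
import math


def reciprocal_mantissa_exponent(d, frac_bits=10):
    """Return (mantissa, exponent) so that 1/d ~= mantissa / 2**frac_bits * 2**exponent."""
    if d <= 0:
        raise ValueError("d must be positive")
    m, e = math.frexp(d)           # d == m * 2**e with 0.5 <= m < 1 (normalization)
    q = 0.5 / m                    # 0.5 < q <= 1, the normalized reciprocal
    mantissa = round(q * (1 << frac_bits))
    exponent = 1 - e               # since 1/d == q * 2**(1 - e)
    return mantissa, exponent


def reconstruct(mantissa, exponent, frac_bits=10):
    """Rebuild the approximate reciprocal from its stored mantissa and exponent."""
    return mantissa / (1 << frac_bits) * 2.0 ** exponent


stored = reciprocal_mantissa_exponent(3.0)
approx_inverse = reconstruct(*stored)   # close to 1/3
```

With 10 fractional bits the mantissa ranges from 513 to 1024, which fits in 11 bits, and the pair could be packed into the real and imaginary fields of one complex memory word as the text suggests.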

The finite state machine 1350 may control the operation of the hardware blocks according to a plurality of pipeline stages, including an address generation stage 1360, a data fetch stage 1370, an execution stage 1380, and a data write stage 1390. The address generation stage 1360 may be associated with control of the data address generator 1303. The data fetch stage 1370 may be associated with control of the joint detection memory 1304. The execution stage 1380 may be associated with control of the primary data path, the secondary data path, or the divider data path. The data write stage 1390 may be associated with writing results from the execution stage to the joint detection memory 1304. In one embodiment, a memory access (whether a read access or a write access) occurs in every clock cycle, except for the initial memory accesses at the start of each joint detection task computation.

The JDA processing operations shown, for example, in the block diagram of Fig. 6 may be performed by the architecture of Figure 13 under the control of the finite state machine 1350. Figure 14 shows a flowchart 1400 of a process that may be performed by the finite state machine 1350 to control the hardware blocks of the JDA architecture and carry out one or more tasks of the joint detection processing. The finite state machine may begin by constructing matrix V (act 1402), controlling the plurality of pipeline stages described above to compute the elements of matrix V. The finite state machine may proceed to construct the first matched filter output (e.g., y1), controlling the pipeline stages in a similar manner to compute the matched filter output (act 1404). The finite state machine may then compute the power of matrix V and/or the power of the previously computed matched filter output, controlling the pipeline stages in a similar manner to compute the power values (act 1406); in some embodiments, only the power of one matched filter output needs to be computed.

Subsequently, the finite state machine may proceed to construct the second matched filter output (e.g., y2), controlling the pipeline stages in a similar manner to compute matched filter output y2 (act 1408). The finite state machine may also control the execution of acts related to active code detection, which may be performed concurrently with other acts (such as constructing matched filter output y2).

In some embodiments, the acts related to active code detection may include determining whether active code detection is needed. An indication of whether active code detection is needed may be provided by the designer. The indication may include a parameter setting that specifies whether active code detection should be performed or skipped. Because these embodiments are not limited in this respect, the parameter setting may be provided to the JDA by any suitable means. The parameter setting indicating whether active code detection should be skipped may be provided by a DSP coupled to the JDA, and the designer may provide this parameter to the DSP. The finite state machine may determine whether to skip active code detection based on the value of this parameter (act 1409). If it is determined that active code detection should be skipped, the finite state machine may skip the remaining acts of active code detection and continue processing. If it is determined that active code detection should not be skipped and should therefore be performed, the finite state machine may initiate active code detection, which may be performed by the programmable DSP coupled to the JDA. In this way, while the DSP is performing active code detection, the JDA may concurrently perform other operations that do not use the active code detection results, such as constructing matched filter output y2 (act 1408).

The finite state machine may then wait for active code detection to complete (act 1410); as noted above, in some embodiments active code detection may be performed by a component other than the JDA (e.g., a programmable DSP). When active code detection is complete, the finite state machine may control the rescaling and reordering of matched filter outputs y1 and y2 and of matrix V (act 1412). The finite state machine may then proceed to the Cholesky decomposition, controlling the pipeline stages in a similar manner to perform the decomposition (act 1414). The finite state machine may then perform the computations involved in solving the linear equation A*x1=y1 for x1, controlling the pipeline stages in a similar manner to perform those computations (act 1416). The finite state machine may proceed to extract user data from the solution x1 (act 1418). The finite state machine may then perform the computations involved in solving the linear equation A*x2=y2 for x2, controlling the pipeline stages in a similar manner to perform those computations (act 1420). The finite state machine may proceed to extract user data from the solution x2 (act 1422). The finite state machine may then wait for the next time slot (act 1424) and, when the next time slot is received, repeat the process starting from act 1402. It should also be appreciated that, in some embodiments, the data and control parameters (sometimes collectively referred to herein as data) for the next time slot may be loaded before processing of the current time slot is complete. In some such embodiments, the data and control parameters for the next time slot are loaded as soon as the JDA has finished loading the control parameters and data for the current time slot. The control parameters may include shift values, a parameter indicating whether active code detection should be skipped, the channel length (W), and/or the number of codes.
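The decomposition (1414) followed by the two solves (1416, 1420) can be sketched in Python: the matrix A is factored once and the factors are reused for both right-hand sides y1 and y2. The sketch assumes an L D L^H form of the Cholesky decomposition, which is consistent with the references to matrix D and $1/d_{ii}$ above but is not confirmed by the source as the exact variant used; it also uses floating-point arithmetic rather than the fixed-point hardware. All function names are hypothetical.

```python
def ldl_decompose(A):
    """Factor a Hermitian positive-definite matrix as A = L * diag(D) * L^H."""
    n = len(A)
    L = [[0j] * n for _ in range(n)]
    D = [0.0] * n
    for j in range(n):
        s = sum(abs(L[j][k]) ** 2 * D[k] for k in range(j))
        D[j] = (A[j][j] - s).real          # diagonal entries are real
        L[j][j] = 1 + 0j                   # unit lower-triangular factor
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k].conjugate() * D[k] for k in range(j))
            L[i][j] = (A[i][j] - s) / D[j]
    return L, D


def ldl_solve(L, D, y):
    """Solve A x = y using a previously computed L, D factorization."""
    n = len(y)
    z = [0j] * n
    for i in range(n):                     # forward substitution: L z = y
        z[i] = y[i] - sum(L[i][k] * z[k] for k in range(i))
    w = [z[i] / D[i] for i in range(n)]    # diagonal scaling: D w = z
    x = [0j] * n
    for i in reversed(range(n)):           # back substitution: L^H x = w
        x[i] = w[i] - sum(L[k][i].conjugate() * x[k] for k in range(i + 1, n))
    return x


A = [[4, 2], [2, 3]]
L, D = ldl_decompose(A)          # factor once
x1 = ldl_solve(L, D, [8, 8])     # first solve, reusing the factors
x2 = ldl_solve(L, D, [2, 3])     # second solve, reusing the factors
```

Factoring once and solving twice keeps the costlier decomposition out of the per-output work, which matches the ordering of the steps described above.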

The various aspects of the invention may be used alone, in combination, or in a variety of arrangements not specifically discussed in the foregoing embodiments, and the aspects of the invention described herein are not limited in their application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings. The aspects of the invention are capable of other embodiments and of being practiced or carried out in various ways. Various aspects of the invention may be implemented in connection with any type of network, cluster, or configuration. No limitation is placed on the network implementation.

Accordingly, the foregoing description and drawings are by way of example only.

In addition, the phraseology and terminology used herein are for the purpose of description and should not be regarded as limiting. The use of "including," "comprising," "having," or "involving," and variations thereof, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.

Claims

1. A method for use in a joint detection system configured to perform joint detection of received signals, the method comprising the steps of:

(A) generating a plurality of initial channel estimates corresponding to a plurality of propagation channels, wherein each of the plurality of initial channel estimates comprises a plurality of values;

(B) determining at least one pre-scaling factor for at least one of the plurality of initial channel estimates, wherein the at least one pre-scaling factor is based at least in part on at least one of the plurality of values of the at least one of the plurality of initial channel estimates; and

(C) pre-scaling the at least one of the plurality of initial channel estimates by the at least one pre-scaling factor.

2. The method according to claim 1, wherein the at least one pre-scaling factor is configured such that, after pre-scaling, the maximum value of the plurality of values of the at least one of the plurality of initial channel estimates has the same exponent as the maximum value of the plurality of values of at least one different one of the plurality of initial channel estimates.

3. The method according to claim 1, wherein the at least one pre-scaling factor is configured such that, after pre-scaling, the maximum value of the plurality of values of the at least one of the plurality of initial channel estimates is substantially the same as the maximum value of the plurality of values of at least one different one of the plurality of initial channel estimates.

4. The method according to claim 1, wherein the at least one pre-scaling factor is configured such that, after pre-scaling, the at least one of the plurality of initial channel estimates has substantially the same power as at least one different one of the plurality of initial channel estimates.

5. The method according to claim 1, wherein step (B) comprises the steps of:

determining a first pre-scaling factor for a first initial channel estimate of the plurality of initial channel estimates, wherein the first pre-scaling factor is based at least in part on at least one of the plurality of values of the first initial channel estimate; and

determining a second pre-scaling factor for a second initial channel estimate of the plurality of initial channel estimates, wherein the second pre-scaling factor is based at least in part on at least one of the plurality of values of the second initial channel estimate.

6. The method according to claim 5, wherein step (C) comprises the steps of:

pre-scaling the first initial channel estimate by the first pre-scaling factor; and

pre-scaling the second initial channel estimate by the second pre-scaling factor,

wherein, after the pre-scaling, the maximum value of the plurality of values of the first initial channel estimate has the same exponent as the maximum value of the plurality of values of the second initial channel estimate.

7. The method according to claim 1, wherein step (B) comprises the step of:

determining a different pre-scaling factor for each different one of the plurality of initial channel estimates, wherein each different pre-scaling factor is based at least in part on at least one of the plurality of values of the respective different initial channel estimate.

8. The method according to claim 7, wherein step (C) comprises the step of:

pre-scaling each different initial channel estimate by its respective different pre-scaling factor, such that the maximum values of the pluralities of values of the different initial channel estimates all have the same exponent.

9. The method according to claim 1, wherein the at least one pre-scaling factor is based at least in part on the maximum value of the plurality of values of the at least one of the plurality of initial channel estimates.

10. The method according to claim 1, wherein step (C) comprises the step of:

pre-scaling the at least one of the plurality of initial channel estimates by the at least one pre-scaling factor, thereby constructing at least one pre-scaled initial channel estimate, wherein the plurality of values of the at least one pre-scaled initial channel estimate has a maximum value less than or equal to 1.

11. The method according to claim 1, wherein the at least one pre-scaling factor is based at least in part on the maximum exponent of the plurality of values of the at least one of the plurality of initial channel estimates.

12. A joint detection system configured to perform joint detection of received signals, the joint detection system comprising:

a programmable digital signal processor (DSP) configured to:

generate a plurality of initial channel estimates corresponding to a plurality of propagation channels, wherein each of the plurality of initial channel estimates comprises a plurality of values;

determine at least one pre-scaling factor for at least one of the plurality of initial channel estimates, wherein the at least one pre-scaling factor is based at least in part on at least one of the plurality of values of the at least one of the plurality of initial channel estimates; and

pre-scale the at least one of the plurality of initial channel estimates by the at least one pre-scaling factor.

13. The joint detection system according to claim 12, wherein the at least one pre-scaling factor is configured such that, after pre-scaling, the maximum value of the plurality of values of the at least one of the plurality of initial channel estimates has the same exponent as the maximum value of the plurality of values of at least one different one of the plurality of initial channel estimates.

14. The joint detection system according to claim 12, wherein the at least one pre-scaling factor is configured such that, after pre-scaling, the maximum value of the plurality of values of the at least one of the plurality of initial channel estimates is substantially the same as the maximum value of the plurality of values of at least one different one of the plurality of initial channel estimates.

15. The joint detection system according to claim 12, wherein the at least one pre-scaling factor is configured such that, after pre-scaling, the at least one of the plurality of initial channel estimates has substantially the same power as at least one different one of the plurality of initial channel estimates.

16. The joint detection system according to claim 12, further comprising a joint detection accelerator configured to perform at least one operation of the joint detection on the received signals, wherein the programmable DSP is configured to send the plurality of initial channel estimates to the joint detection accelerator after pre-scaling.

17. The joint detection system according to claim 16, wherein the joint detection accelerator has a fixed-point implementation.

18. The joint detection system according to claim 12, wherein the programmable DSP is configured to:

determine a first pre-scaling factor for a first initial channel estimate of the plurality of initial channel estimates, wherein the first pre-scaling factor is based at least in part on at least one of the plurality of values of the first initial channel estimate; and

determine a second pre-scaling factor for a second initial channel estimate of the plurality of initial channel estimates, wherein the second pre-scaling factor is based at least in part on at least one of the plurality of values of the second initial channel estimate.

19. The joint detection system according to claim 18, wherein the programmable DSP is configured to

20. The joint detection system according to claim 12, wherein the programmable DSP is configured to determine a different pre-scaling factor for each different one of the plurality of initial channel estimates, wherein each different pre-scaling factor is based at least in part on at least one of the plurality of values of the respective different initial channel estimate.

21. The joint detection system according to claim 12, wherein the programmable DSP is configured to pre-scale each different initial channel estimate by its respective different pre-scaling factor, such that the maximum values of the pluralities of values of the different initial channel estimates all have the same exponent.

22. The joint detection system according to claim 12, wherein the at least one pre-scaling factor is based at least in part on the maximum value of the plurality of values of the at least one of the plurality of initial channel estimates.

23. The joint detection system according to claim 12, wherein the programmable DSP is configured to pre-scale the at least one of the plurality of initial channel estimates by the at least one pre-scaling factor, thereby constructing at least one pre-scaled initial channel estimate, wherein the plurality of values of the at least one pre-scaled initial channel estimate has a maximum value less than or equal to 1.

24. The joint detection system according to claim 12, wherein the at least one pre-scaling factor is based at least in part on the maximum exponent of the plurality of values of the at least one of the plurality of initial channel estimates.