GB2426419A - A hardware accelerator for a signal decoder - Google Patents

Publication number
GB2426419A
Authority
GB
United Kingdom
Prior art keywords
tree
symbol
data
symbols
string
Prior art date
Legal status
Granted
Application number
GB0510127A
Other versions
GB2426419B (en)
GB0510127D0 (en)
Inventor
David Milford
Current Assignee
Toshiba Europe Ltd
Original Assignee
Toshiba Research Europe Ltd
Priority date
Filing date
Publication date
Application filed by Toshiba Research Europe Ltd
Priority to GB0510127A
Publication of GB0510127D0
Publication of GB2426419A
Application granted
Publication of GB2426419B
Expired - Fee Related
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/37 Decoding methods or techniques, not specific to the particular type of coding provided for in groups H03M13/03 - H03M13/35
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/02 Arrangements for detecting or preventing errors in the information received by diversity reception
    • H04L1/06 Arrangements for detecting or preventing errors in the information received by diversity reception using space diversity
    • H04L1/0618 Space-time coding
    • H04L1/0631 Receiver arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/02 Arrangements for detecting or preventing errors in the information received by diversity reception
    • H04L1/06 Arrangements for detecting or preventing errors in the information received by diversity reception using space diversity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L25/00 Baseband systems
    • H04L25/02 Details; arrangements for supplying electrical power along data transmission lines
    • H04L25/03 Shaping networks in transmitter or receiver, e.g. adaptive shaping networks
    • H04L25/03006 Arrangements for removing intersymbol interference
    • H04L25/03178 Arrangements involving sequence estimation techniques
    • H04L25/03184 Details concerning the metric
    • H04L25/03191 Details concerning the metric in which the receiver makes a selection between different metrics
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L25/00 Baseband systems
    • H04L25/02 Details; arrangements for supplying electrical power along data transmission lines
    • H04L25/03 Shaping networks in transmitter or receiver, e.g. adaptive shaping networks
    • H04L25/03006 Arrangements for removing intersymbol interference
    • H04L25/03178 Arrangements involving sequence estimation techniques
    • H04L25/03203 Trellis search techniques
    • H04L25/03242 Methods involving sphere decoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L25/00 Baseband systems
    • H04L25/02 Details; arrangements for supplying electrical power along data transmission lines
    • H04L25/03 Shaping networks in transmitter or receiver, e.g. adaptive shaping networks
    • H04L25/03006 Arrangements for removing intersymbol interference
    • H04L25/03178 Arrangements involving sequence estimation techniques
    • H04L25/03337 Arrangements involving per-survivor processing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L27/00 Modulated-carrier systems
    • H04L27/32 Carrier systems characterised by combinations of two or more of the types covered by groups H04L27/02, H04L27/10, H04L27/18 or H04L27/26
    • H04L27/34 Amplitude- and phase-modulated carrier systems, e.g. quadrature-amplitude modulated carrier systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L25/00 Baseband systems
    • H04L25/02 Details; arrangements for supplying electrical power along data transmission lines
    • H04L25/03 Shaping networks in transmitter or receiver, e.g. adaptive shaping networks
    • H04L25/03006 Arrangements for removing intersymbol interference
    • H04L2025/0335 Arrangements for removing intersymbol interference characterised by the type of transmission
    • H04L2025/03426 Arrangements for removing intersymbol interference characterised by the type of transmission transmission using multiple-input and multiple-output channels
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L25/00 Baseband systems
    • H04L25/02 Details; arrangements for supplying electrical power along data transmission lines
    • H04L25/03 Shaping networks in transmitter or receiver, e.g. adaptive shaping networks
    • H04L25/03006 Arrangements for removing intersymbol interference
    • H04L2025/03433 Arrangements for removing intersymbol interference characterised by equaliser structure
    • H04L2025/03439 Fixed structures
    • H04L2025/03445 Time domain
    • H04L2025/03458 Lattice
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L25/00 Baseband systems
    • H04L25/02 Details; arrangements for supplying electrical power along data transmission lines
    • H04L25/0202 Channel estimation
    • H04L25/0224 Channel estimation using sounding signals
    • H04L25/0228 Channel estimation using sounding signals with direct estimation from sounding signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L25/00 Baseband systems
    • H04L25/02 Details; arrangements for supplying electrical power along data transmission lines
    • H04L25/0202 Channel estimation
    • H04L25/024 Channel estimation channel estimation algorithms
    • H04L25/0242 Channel estimation channel estimation algorithms using matrix methods
    • H04L25/0246 Channel estimation channel estimation algorithms using matrix methods with factorisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L25/00 Baseband systems
    • H04L25/02 Details; arrangements for supplying electrical power along data transmission lines
    • H04L25/03 Shaping networks in transmitter or receiver, e.g. adaptive shaping networks
    • H04L25/03006 Arrangements for removing intersymbol interference
    • H04L25/03178 Arrangements involving sequence estimation techniques
    • H04L25/03312 Arrangements specific to the provision of output signals
    • H04L25/03318 Provision of soft decisions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L27/00 Modulated-carrier systems
    • H04L27/26 Systems using multi-frequency codes
    • H04L27/2601 Multicarrier modulation systems
    • H04L27/2647 Arrangements specific to the receiver only
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L5/00 Arrangements affording multiple use of the transmission path
    • H04L5/0001 Arrangements for dividing the transmission path
    • H04L5/0014 Three-dimensional division
    • H04L5/0023 Time-frequency-space

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Power Engineering (AREA)
  • Physics & Mathematics (AREA)
  • Optics & Photonics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Theoretical Computer Science (AREA)
  • Error Detection And Correction (AREA)

Abstract

This invention relates to hardware acceleration of signal processing systems, in particular sphere decoders. A hardware accelerator system for a received signal decoder, said received signal decoder being configured to decode a string of transmitted symbols sent over a channel by searching a tree for one or more candidate strings of symbols, a candidate string of symbols comprising a string of candidate symbols, and selecting one or more of said candidate strings of symbols, said searching comprising searching a multidimensional lattice determined by a response of said channel and represented by said tree, each level of the tree corresponding to a symbol of said transmitted string of symbols, a level of said tree having at least two branches each representing a next possible said candidate symbol, and wherein said hardware accelerator system comprises a branch selection module having an estimated symbol input to receive estimated symbol data and an output to provide data for controlling an order of searching of said tree branches responsive to said estimated symbol data. More generally the invention relates to accelerating processing of a sphere decoder search tree comprising: providing first hardware to determine successively refined symbol string estimates during a descent of said tree; and providing second hardware to determine, responsive to a successively refined symbol string estimate, an order of branches of said tree to search during said tree descent, for said symbol string estimate refining.

Description

Signal Processing Systems

This invention relates to hardware acceleration of signal processing systems, in particular sphere decoders.
Sphere decoders are a particularly advantageous form of decoder for estimating a string of transmitted symbols from a received signal. A string of symbols may be distributed in space, for example across multiple transmit antennas; in time, for example with a space-time block or trellis coder; and/or in frequency, for example where multiple frequency channels or carriers are employed. Here we will refer particularly to a MIMO (Multiple Input-Multiple Output) arrangement in which the multiple transmit antennas and one or more receive antennas may form part of a single-user system (for increased data rate) or a multi-user system.
Embodiments of the techniques described herein are applicable to all the above types of system.
One approach to decoding a string of received signals to determine the most likely transmitted symbols would be to take each symbol in turn, but a better approach is to consider the received string of symbols as a whole. However the search space then becomes very large. The sphere decoding procedure provides an efficient way to search such a space. Broadly speaking candidates for the transmitted signal, modified by the channel response (and space-time encoder) are represented by a lattice in which points correspond to possible (noiseless) received signals. The sphere decoding procedure aims to find one or a few lattice points nearest the actually received signal by searching a tree (generally depth first) in which intermediate levels (or nodes) correspond to partial symbol strings and in which leaf nodes correspond to complete candidate symbol strings, branches at a level representing possible next candidate symbols. The procedure performs a search in a multi-dimensional spherical region centred on the actually received signal and provides a technique for identifying which lattice points are within the required search radius (which may be adjusted according to the noise level and/or channel conditions).
Figure 1a shows a basic MIMO configuration, Figure 1b an example of transmitted and received lattices, and Figure 1c a typical MIMO data communications system in which, for example, the space-time decoder 100 (which has the task of removing the effect of the encoder and MIMO channel) may be implemented by a sphere decoder. The encoder typically implements space-time and/or frequency encoding, in the latter case optionally modulating separate frequency channels onto OFDM (Orthogonal Frequency Division Multiplexed) carriers, and is normally followed by a modulator (not shown) to provide coded symbols from a constellation of symbols, for transmission.
It is helpful, at this point, to provide an outline review of the operation of the sphere decoding procedure. For a string of N transmitted symbols an N-dimensional lattice is searched, beginning with the Nth dimensional layer (corresponding to the first symbol of the string). A symbol is selected for this layer from the constellation employed and the incremental distance of the generated lattice point from the received signal is determined. The procedure continues with each successive symbol in turn, and if all are within a sphere radius it eventually converges on a lattice point in one dimension. If a symbol is outside the chosen radius then the procedure moves back up a layer (dimension) and chooses the next possible symbol in that layer (dimension) for checking. In this way the procedure builds a tree in which the lowest nodes correspond to complete strings of symbols and in which the number of nodes at the nth level of the tree corresponds to the number of lattice points inside the relevant nth dimensional sphere.
When a complete candidate string of symbols is found the distance of the lattice point, generated from the string of symbols, from the received signal is found and the initial radius is reduced to this distance so that as the tree builds only strings closer to the maximum-likelihood solution are identified. When the tree has been completed the decoder can be used to provide a hard output, i.e. the maximum likelihood solution, by choosing the nearest lattice point to the received signal. Alternatively a soft output can be provided using a selection of the closest lattice points to the received signal, for example using the distance of each of these from the received signal as an associated likelihood value.
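By way of illustration only, the outline above may be sketched in software as follows. The sketch is not the hardware of the embodiments; it assumes a real-valued model y = R s + noise in which R is already upper triangular (for example after QR decomposition of the channel matrix), and the names `sphere_decode`, `R`, `y` and `radius_sq` are hypothetical. It searches depth first from the last layer, tries candidates nearest the current estimate first, and shrinks the radius each time a complete string is found.

```python
import math

def sphere_decode(R, y, constellation, radius_sq=math.inf):
    """Depth-first search for s minimising ||y - R s||^2 (R upper triangular)."""
    n = len(y)
    best = {"dist": radius_sq, "symbols": None}
    s = [0] * n  # current (partial) candidate string

    def descend(level, dist_so_far):
        # Contribution of the symbols already fixed at the layers searched earlier.
        interference = sum(R[level][j] * s[j] for j in range(level + 1, n))
        target = y[level] - interference
        centre = target / R[level][level]  # unquantised estimate for this layer
        # Candidates are tried in order of increasing distance from the estimate.
        for cand in sorted(constellation, key=lambda c: abs(c - centre)):
            d = dist_so_far + (target - R[level][level] * cand) ** 2
            if d >= best["dist"]:
                break  # remaining candidates are no closer, so prune them
            s[level] = cand
            if level == 0:
                # Complete string found: reduce the radius to its distance.
                best["dist"], best["symbols"] = d, list(s)
            else:
                descend(level - 1, d)

    descend(n - 1, 0.0)
    return best["symbols"], best["dist"]
```

For example, with `R = [[2.0, 1.0], [0.0, 1.0]]`, `y = [-0.8, -3.1]` and the levels `[-3, -1, 1, 3]`, the sketch returns the string `[1, -3]`. If the initial radius is too small to contain any lattice point it returns `None` for the string.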
Background prior art relating to sphere decoding can be found in: E. Agrell, T. Eriksson, A. Vardy and K. Zeger, "Closest Point Search in Lattices", IEEE Trans. on Information Theory, vol. 48, no. 8, Aug 2002; E. Viterbo and J. Boutros, "A universal lattice code decoder for fading channels", IEEE Trans. Inform. Theory, vol. 45, no. 5, pp. 1639-1642, Jul. 1999; O. Damen, A. Chkeif and J. C. Belfiore, "Lattice code decoder for space-time codes", IEEE Comms. Letter, vol. 4, no. 5, pp. 161-163, May 2000; B. M. Hochwald and S. T. Brink, "Achieving near capacity on a multiple-antenna channel", December 2002; H. Vikalo and B. Hassibi, "Low-complexity iterative detection and decoding of multi-antenna systems employing channel and space-time codes", Conference Record of the Thirty-Sixth Asilomar Conference on Signals and Systems and Computers, vol. 1, Nov 3-6, 2002, pp. 294-298; A. Wiesel, X. Mestre, A. Pages and J. R. Fonollosa, "Efficient Implementation of Sphere Demodulation", Proceedings of IV IEEE Signal Processing Advances in Wireless Communications, pp. 535, Rome, June 15-18, 2003; L. Brunel, J. J. Boutros, "Lattice decoding for joint detection in direct-sequence CDMA systems", IEEE Transactions on Information Theory, Volume: 49, Issue: 4, April 2003, pp. 1030-1037; US patent application US20030076890, filed on July 26, 2002, to B. M. Hochwald, S. Ten Brink, "Method and apparatus for detection and decoding of signals received from a linear propagation channel", Lucent Technologies, Inc; US patent application US20020114410, filed on August 22, 2002, to L. Brunel, "Multiuser detection method and device in DS-CDMA mode", Mitsubishi Denki Kabushiki Kaisha; H. Vikalo, "Sphere Decoding Algorithms for Digital Communications", PhD Thesis, Stanford University, 2003; and B. Hassibi and H. Vikalo, "Maximum-Likelihood Decoding and Integer Least-Squares: The Expected Complexity", in Multiantenna Channels: Capacity, Coding and Signal Processing (editors J. Foschini and S. Verdu).
Generally publications so far have focussed on algorithms for sphere decoding rather than on practical implementation of a sphere decoder. The work which has been done on sphere decoder implementation has focussed on hard decision decoders and on so-called list sphere decoders which compile a list of 'good' solutions from which a soft (likelihood) transmitted symbol string estimate may be computed. Examples are described in US 2004/0181419, JP 2004/282757 and in Burg, et al., "Performance Trade-offs in the VLSI Implementation of the Sphere Decoding Algorithm", Proc. 5th IEE International Conference on 3G Mobile Communication Technologies (3G 2004), Oct 2004; Wong, et al., "A VLSI Architecture of a K-best Lattice Decoding Algorithm for MIMO Channels", Proc. IEEE International Symposium on Circuits and Systems, 2002, Vol 3, pp. 273-276, May 2002; and Widdup, et al., "A Highly-parallel VLSI Architecture for a List Sphere Detector", Proc. IEEE International Conference on Communications, 2004, Vol 5, pp. 2720-2725, June 2004. Further background prior art can be found in US 2004/0228423 and WO 03/107582. The quality of results from a list sphere decoder depends to a large extent on the number of candidates in the list but the implementation cost increases substantially if the list is long.
There is therefore a need for improved systems for sphere and related decoding.
According to a first aspect of the invention there is therefore provided a hardware accelerator system for a received signal decoder, said received signal decoder being configured to decode a string of transmitted symbols sent over a channel by searching a tree for one or more candidate strings of symbols, a candidate string of symbols comprising a string of candidate symbols, and selecting one or more of said candidate strings of symbols, said searching comprising searching a multidimensional lattice determined by a response of said channel and represented by said tree, each level of the tree corresponding to a symbol of said transmitted string of symbols, a level of said tree having at least two branches each representing a next possible said candidate symbol, and wherein said hardware accelerator system comprises a branch selection module having an estimated symbol input to receive estimated symbol data and an output to provide data for controlling an order of searching of said tree branches responsive to said estimated symbol data.
In one particularly preferred embodiment, described in more detail later, the hardware accelerator system is initially employed to determine a single, most likely candidate string of symbols using an unconstrained search, and then a constrained search is performed in which each bit, in turn, is constrained to be the inverse of its value in the most likely string of symbols. The results of these searches can then be employed to determine bit likelihood values for each bit of the string in what is termed below a max log MAP procedure.
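The sequence of one unconstrained search followed by one constrained search per bit can be sketched as below. This is a minimal illustration under stated assumptions, not the procedure of the referenced application: an exhaustive search stands in for the tree search, each symbol carries one bit mapped to -1 or +1, the row-vector model r = sH + v is used, and the sign and 1/(2σ²) scaling of the bit likelihood values follow the usual max-log approximation. All function names are hypothetical.

```python
from itertools import product

def distance_sq(bits, H, r):
    # Row-vector model r = s H + v; assumed mapping: bit 0 -> -1, bit 1 -> +1.
    s = [2 * b - 1 for b in bits]
    return sum((r[m] - sum(s[n] * H[n][m] for n in range(len(s)))) ** 2
               for m in range(len(r)))

def best(H, r, n_bits, fixed=None):
    # Exhaustive stand-in for the tree search; `fixed` is an (index, value) constraint.
    cands = (b for b in product((0, 1), repeat=n_bits)
             if fixed is None or b[fixed[0]] == fixed[1])
    return min((distance_sq(b, H, r), b) for b in cands)

def max_log_map(H, r, n_bits, sigma_sq):
    d_ml, ml_bits = best(H, r, n_bits)  # unconstrained search: most likely string
    llrs = []
    for i in range(n_bits):
        # Constrained search: bit i forced to the inverse of its most likely value.
        d_alt, _ = best(H, r, n_bits, fixed=(i, 1 - ml_bits[i]))
        magnitude = (d_alt - d_ml) / (2 * sigma_sq)
        llrs.append(magnitude if ml_bits[i] == 1 else -magnitude)
    return ml_bits, llrs
```

With an identity channel `H = [[1.0, 0.0], [0.0, 1.0]]`, `r = [0.9, -0.2]` and `sigma_sq = 0.5`, the most likely bits are `(1, 0)` and the two likelihood values are approximately 3.6 and -0.8, so the weakly received second bit is given the lower confidence.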
In embodiments the estimated symbol data comprises an estimated signal level value for a transmitted symbol and preferably this is refined as successive levels of the tree are searched, starting from an initial zero-forcing estimate. The data determining an order of the tree branches to be searched may comprise either an index identifying a branch to be searched or a next candidate symbol corresponding to the branch or both. The data for controlling an order of searching tree branches generally identifies one branch or candidate symbol at a time in response, for example, to a step value input which is incremented, say by a controller, as successive branches are searched.
As previously mentioned a search may be unconstrained, that is among all candidate symbols (bearing in mind that generally a complex symbol value will be split into real and imaginary components of the symbol each of which are searched separately) or a search may be constrained, for example to half its full range through a prior determination of a maximum likelihood candidate string of symbols. Thus preferably the branch selection module includes a mode input for selection of such a constrained search mode so that when this is asserted the branch order controlling data provides data identifying branches within the constrained search, and in particular a preferred order of searching branches of such a constrained tree. In embodiments the constraint data is provided to the branch selection module to thereby define a set of branches amongst which an order is to be determined. More particularly this constraint data, which may take the form of a command word, may be employed as an index for a look-up table defining a search order. For an unconstrained search a similar, or in preferred embodiments the same, look-up table may be employed, this time indexed by a comparison between a received signal value and a value determined by the constellation from which transmitted signals are selected. More particularly the data may be indexed in such a way that the branches are searched closest to the received signal first, then in order of increasing distance. For example a comparison may be made between a received signal level defining a real (or imaginary) value to be mapped to the symbol constellation and the branches searched in an order of increasing difference between the received signal value and a signal value determined by the candidate symbol associated with the branch (mapped to a real (or imaginary) constellation signal value). 
In preferred embodiments one or more comparators are employed to compare the relevant received signal value with values defined by the constellation symbol and the output(s) of the comparator or comparators (i.e., the result of the comparison) used as an index for the look-up table.
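One way a comparator-indexed look-up table could produce such an ordering is sketched below for the four real levels of one component of a 16QAM symbol. The thresholds and table contents are assumptions chosen for illustration (the midpoints between every pair of levels), not values taken from the embodiments, and `select_branch` is a hypothetical name. The comparator outputs form a thermometer code, their count indexes the table, and the step value selects successively less preferred branches.

```python
LEVELS = (-3, -1, 1, 3)          # real (or imaginary) part of a 16QAM symbol
THRESHOLDS = (-2, -1, 0, 1, 2)   # assumed: midpoints between pairs of levels

# One row per comparator pattern; entries are branch indices into LEVELS,
# listed nearest-first for any estimate falling in that region.
ORDER_LUT = (
    (0, 1, 2, 3),   # estimate < -2
    (1, 0, 2, 3),   # -2 < estimate < -1
    (1, 2, 0, 3),   # -1 < estimate < 0
    (2, 1, 3, 0),   #  0 < estimate < 1
    (2, 3, 1, 0),   #  1 < estimate < 2
    (3, 2, 1, 0),   #  estimate > 2
)

def comparator_outputs(estimate):
    # One comparator per threshold; together they form a thermometer code.
    return tuple(estimate > t for t in THRESHOLDS)

def select_branch(estimate, step):
    # The number of asserted comparator outputs indexes the look-up table;
    # `step` (0, 1, 2, ...) picks the next most preferred branch.
    index = sum(comparator_outputs(estimate))
    return ORDER_LUT[index][step]
```

For an estimate of 0.3, successive steps give branch indices 2, 1, 3, 0, that is the levels 1, -1, 3, -3, in order of increasing distance from the estimate. No distance is computed when a branch is selected, which is the attraction of a table for hardware.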
In some particularly preferred embodiments the constraint data for a constrained search is provided as a substitute set of comparator outputs, this substitute comparator data being employed to index a table in place of real comparison data from the one or more comparators when the constrained search mode is selected.
In preferred embodiments the hardware accelerator system also includes a symbol estimate computation module with an input to receive an initial (zero-forcing) estimate for the transmitted string of symbols and a computation system to determine a revised estimate of the transmitted string for a level of the tree from a difference between a signal value for a symbol determined by a tree branch previously taken to reach the tree level and a received signal value for the symbol. Thus, broadly speaking, the initial zero-forcing estimate is progressively refined as the components of the symbol are quantised descending through the tree giving new estimates at each level. At each stage or level of the descent (say, at level k) the kth component of the kth estimate (e[k]) is compared with the constellation values and one of these is selected. The comparison of signal level with the current symbol is made in the transmitted symbol domain (because comparisons in the received symbol domain would be complicated because the constellation itself is transformed by the channel). It will be recognised that the comparison is between components (real and imaginary) of a symbol one at a time, in effect performing the searching according to a local most likely next branch whereas an aim of the sphere decoding procedure as a whole is to attempt to find a global closest match to the entire string of symbols.
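The refinement of the zero-forcing estimate during the descent can be illustrated with a recursion in the style of the closest point search of Agrell et al. The sketch assumes a model y = R s with R upper triangular and takes `W`, the inverse of R, as a precomputed input; these names and the exact form of the update are assumptions for illustration rather than the computation of the embodiments.

```python
def refine_estimate(e, W, k, chosen):
    """Return the revised estimate after component k is quantised to `chosen`.

    e      -- current estimate (initially the zero-forcing estimate W y)
    W      -- inverse of the upper triangular matrix R (assumed precomputed)
    k      -- index of the tree level just decided
    chosen -- constellation value selected for component k
    """
    xi = e[k] - chosen  # difference between the estimate and the branch taken
    # Component k becomes exactly `chosen`; the components for levels still to
    # be searched are corrected, and those already fixed are left unchanged.
    return [e[i] - xi * W[i][k] / W[k][k] for i in range(len(e))]
```

With `R = [[2.0, 1.0], [0.0, 1.0]]` (so `W = [[0.5, -0.5], [0.0, 1.0]]`) and `y = [-0.8, -3.1]`, the zero-forcing estimate is `[1.15, -3.1]`. Fixing the last component to -3 revises the first component to 1.1, which equals (y[0] - R[0][1] * (-3)) / R[0][0], the estimate for the next level given the branch taken.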
Preferably the symbol estimate computation module also receives data defining an estimate of a response of the (MIMO) channel, and includes a distance calculation module to determine a distance between a next candidate transmitted symbol and the corresponding received signal data using this estimated channel response. The data defining the estimated channel response is generally pre-processed by QR decomposition. Distance calculations and comparisons are made in the received signal domain and hence use matrix elements derived from the channel response matrix to scale the estimate distance from the symbol constellation quantisation level before it is accumulated into a total distance for the symbol string (which at an intermediate level in the tree not yet at a leaf node may be a partial symbol/string). Preferably a data store is also included for storing an accumulated said distance, in particular for storing the accumulated distance for a candidate string of symbols (generally a partial string) for each level of the tree. The computation module may also have an input defining a sphere radius, that is for defining a search region of the lattice determined by a radius of an N dimensional sphere centred on the received signal, i.e., so that nodes having an accumulated distance greater than this sphere radius are not searched.
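The scaling, accumulation and radius test described above might be modelled as follows. This is a sketch under assumptions: the per-level store is a simple list with one extra entry holding zero for the root, `r_diag` stands for the diagonal element of the triangular matrix obtained from the QR decomposition, squared distances are used throughout, and the function name is hypothetical.

```python
def accumulate(dist_store, level, r_diag, estimate_k, candidate, radius_sq):
    """Accumulate the distance for `candidate` at `level`; None if outside the sphere."""
    parent = dist_store[level + 1]  # distance accumulated on the way down
    # The difference in the transmitted symbol domain is scaled by the matrix
    # element to express it as a distance in the received signal domain.
    d = parent + (r_diag * (estimate_k - candidate)) ** 2
    if d > radius_sq:
        return None                 # node lies outside the sphere: do not search it
    dist_store[level] = d           # stored per level for use when backtracking
    return d
```

Starting from `dist_store = [None, None, 0.0]` for a two-level tree, calling `accumulate(dist_store, 1, 1.0, -3.1, -3, 1.0)` and then `accumulate(dist_store, 0, 2.0, 1.1, 1, 1.0)` accumulates approximately 0.01 and then 0.05, while a candidate of 3 at level 0 is rejected because its accumulated distance exceeds the radius.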
Preferably the hardware accelerator system also includes a controller to successively increment a step index data value to control the tree branch selection. This step index value is preferably provided to the branch selection module and successive increments of this value are used to output tree branch order search control data for successively less preferred branches in turn so that on a first step the branch selection module selects a preferred branch for searching, on a next step (at the same level of the tree) a next most preferred branch for searching, and so forth. Such a controller may also perform other general housekeeping and control functions, for example for the symbol estimate computation module.
The invention further provides a sphere decoder including a hardware accelerator system as described above.
In a related aspect the invention provides a hardware tree branch selection unit for searching a lattice defined by a matrix for a closest lattice point to a target point defined by a vector, each said lattice point defining a set of symbols, each symbol being defined by a value of a constellation of possible symbol values, said vector comprising values for a set of said symbols, a level of said tree having nodes each corresponding to a partial said set of symbols, branches of said tree from a said node corresponding to possible values of an additional symbol of said partial set, the tree branch selection unit comprising: a first input to receive an estimated symbol value derived from said vector; at least one second input to receive a said constellation value; at least one comparator coupled to said first input and to said at least one second input and having an output responsive to a comparison between said estimated symbol value and said constellation value; and a data store coupled to comparator output to provide a stored data output responsive to said comparison; and wherein said stored data output comprises data for determining a branch of said tree to search.
The invention further provides a method of accelerating processing of a sphere decoder search tree, the method comprising providing first hardware to determine successively refined symbol string estimates during a descent of said tree; and providing second hardware to determine, responsive to a said successively refined symbol string estimate, an order of branches of said tree to search during said tree descent, for said symbol string estimate refining.
In a related aspect the invention also provides a hardware accelerator sphere decoder including the above described first and second hardware.
It will be appreciated that embodiments of the above described aspects of the invention may be employed in MIMO and multi-user systems, for block or other code decoding (including iterative or turbo decoding) and/or for channel decoding, block equalisation, for example for frequency selective fading. Although embodiments of the above described aspects of the invention are particularly useful in sphere decoders their application is not limited to this type of decoding and includes, for example, other matrix search procedures.
These and other aspects of the invention will now be further described, by way of example only, with reference to the accompanying figures, in which:
Figures 1a to 1c show, respectively, a basic MIMO configuration, an example of a lattice for signals transmitted and received over a MIMO channel, and a typical example MIMO communications system;
Figure 2 shows a block diagram of a first example of a max-log-MAP decoder;
Figure 3 shows a block diagram of a second example of a max-log-MAP decoder;
Figures 4a to 4d show, respectively, an example of received signal space represented by a tree and distances in a sphere decoding procedure, an example of a sphere decoder tree search in three dimensions, a diagrammatic illustration of rotation of the coordinate system, and a diagrammatic illustration of QR decomposition of the channel matrix;
Figures 5a to 5e show, respectively, a receiver including a sphere decoder system incorporating a hardware accelerator embodying an aspect of the present invention, a sphere decoder system, subunits of a symbol processing unit including a branch selection unit, details of the branch selection unit of Figure 5c, and an example sphere decoder hardware processing element;
Figure 6 shows a 16QAM constellation; and
Figure 7 shows a sphere decoder symbol processing procedure.
To help in understanding the invention the sphere decoding procedure will first be described. Preferred embodiments of aspects of the invention are adapted to use with what we term a max-log-map approximation procedure in which soft bit values are determined using a series of constrained searches. This technique is described in more detail in the applicant's UK Patent Application No. 0416820.9, priority date 3rd October 2003 (and in the corresponding US Application 10/938,584), the contents of which are hereby incorporated in their entirety by reference. Again, to assist in understanding operational embodiments of the invention an outline of the max-log-map procedure will be given.
We first present a mathematical description of the max-log-map procedure, by way of background.
Consider a space-time transmission scheme with $n_T$ transmitted and $n_R$ received signals, for example in a MIMO communications system with $n_T$ transmit and $n_R$ receive antennas. The $1 \times n_R$ received signal vector at each time instant $k$ is given by:

$$\tilde{r}_k = \tilde{s}_k \tilde{H}_k + \tilde{v}_k \qquad \text{(Equation 1)}$$

where $\tilde{s}_k = [\tilde{s}_k^1 \; \cdots \; \tilde{s}_k^{n_T}]$ denotes the transmitted vector whose entries are chosen from some complex constellation $C$ with $M = 2^q$ possible signal points, and $q$ is the number of bits per constellation symbol. The AWGN (Additive White Gaussian Noise) vector $\tilde{v}_k$ is a $1 \times n_R$ vector of independent, zero-mean complex Gaussian noise entries with variance $\sigma^2$ per real component. The notation $\tilde{H}_k$ denotes an $n_T \times n_R$ multiple-input/multiple-output (MIMO) channel matrix assumed to be known or estimated at the receiver, with $n$-th row and $m$-th column components $h_{n,m}$, $n = 1, \ldots, n_T$, $m = 1, \ldots, n_R$, representing the narrowband flat fading between the $n$-th transmitted signal and the $m$-th received signal. The channel fade may be assumed to be constant over a symbol period.
In a receiver a MIMO channel estimate $\tilde{H}_k$ can be obtained in a conventional manner using a training sequence. For example a training sequence can be transmitted from each transmit antenna in turn (to avoid interference problems), each time listening on all the receive antennas to characterise the channels from that transmit antenna to the receive antennas. Alternatively $\tilde{H}_k$ may be an effective channel derived from one or more uses of the true channel. In a block equaliser for frequency selective fading the channel model may be modified to take into account the channel memory; for channel decoding the sphere decoder determines the distance between the received signal and each possible transmitted codeword in its search, as described in our previous UK patent application (ibid). Equation 1 may also be used to represent a CDMA system where, for example, the multi-user detector estimates the signal $\tilde{s}_k$ transmitted from different users and the matrix $\tilde{H}_k$ represents the combined spreading and channel effects for all users.
Ignoring the time index $k$ for simplicity of discussion, the $n$-th component of the transmitted symbol vector $\tilde{s}$ is obtained using the symbol mapping function

$$\tilde{s}^n = \mathrm{map}(x^n), \quad n = 1, \ldots, n_T \qquad \text{(Equation 2)}$$

where

$$x^n = [x_1^n \cdots x_q^n] \qquad \text{(Equation 3)}$$

is a vector with $q$ transmitted data bits, and $q$ is the number of bits per constellation symbol. (More generally, however, $\tilde{s}$ denotes a string of symbols encoded over space and/or time and/or frequency and $n$ runs over the length of the string.) Therefore the $(q \cdot n_T)$-length vector of bits transmitted can be denoted by

$$x = [x^1 \cdots x^{n_T}] \qquad \text{(Equation 4)}$$

and the transmitted vector constellation is written as $\tilde{s} = \mathrm{map}(x)$.
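The complex matrix representation of Equation 1 (ignoring the time index $k$) can be transformed to a real matrix representation with twice the dimension of the original system as follows:

$$r = sH + v \qquad \text{(Equation 5)}$$

where

$$r = [\Re\{\tilde{r}\} \;\; \Im\{\tilde{r}\}] \qquad \text{(Equation 6)}$$

$$s = [\Re\{\tilde{s}\} \;\; \Im\{\tilde{s}\}] \qquad \text{(Equation 7)}$$

$$H = \begin{bmatrix} \Re\{\tilde{H}\} & \Im\{\tilde{H}\} \\ -\Im\{\tilde{H}\} & \Re\{\tilde{H}\} \end{bmatrix} \qquad \text{(Equation 8)}$$

$$v = [\Re\{\tilde{v}\} \;\; \Im\{\tilde{v}\}] \qquad \text{(Equation 9)}$$

We shall use the real-valued representation of Equation 5 to Equation 9 in the following discussion so that, for example, $r$ and $s$ are real vectors and $H$ is a real matrix.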
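The real-valued representation of Equations 5 to 9 can be illustrated with a short C sketch. It is an example only: the function names `real_row`, `real_channel` and `row_times_matrix`, the fixed sizes `NT` and `NR`, and the choice to pass real and imaginary parts as separate arrays are assumptions made for the illustration and are not taken from the described system.

```c
#define NT 2   /* transmitted signals (illustrative size) */
#define NR 2   /* received signals (illustrative size) */

/* Equations 6, 7 and 9: a complex row vector, given as separate real and
   imaginary parts, becomes the real row vector [Re Im]. */
void real_row(const double *re, const double *im, int n, double *out)
{
    for (int i = 0; i < n; i++) {
        out[i] = re[i];
        out[n + i] = im[i];
    }
}

/* Equation 8: H = [ Re Im ; -Im Re ], of size 2NT x 2NR. */
void real_channel(double Hre[NT][NR], double Him[NT][NR],
                  double H[2 * NT][2 * NR])
{
    for (int n = 0; n < NT; n++) {
        for (int m = 0; m < NR; m++) {
            H[n][m] = Hre[n][m];
            H[n][NR + m] = Him[n][m];
            H[NT + n][m] = -Him[n][m];
            H[NT + n][NR + m] = Hre[n][m];
        }
    }
}

/* Noise-free part of Equation 5: r = s H, with s a real row vector. */
void row_times_matrix(const double *s, double H[2 * NT][2 * NR], double *r)
{
    for (int m = 0; m < 2 * NR; m++) {
        r[m] = 0.0;
        for (int n = 0; n < 2 * NT; n++)
            r[m] += s[n] * H[n][m];
    }
}
```

The first half of the resulting real vector holds the real parts of the complex product and the second half holds the imaginary parts, which is why the real system has twice the dimension of the complex one.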
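The maximum a posteriori probability (APP) bit detection, conditioned on the received signal $r$ for the space-time transmission of Equation 5, can be expressed in log likelihood ratio (LLR) terms as follows:

$$L_P(x_j^n \mid r) = \ln\frac{P(x_j^n = +1 \mid r)}{P(x_j^n = -1 \mid r)} = \ln\frac{\sum_{x \in X_{n,j}^{+1}} \exp\left(-\frac{1}{2\sigma^2}\|r - sH\|^2 + \frac{1}{2}x^T L_A\right)}{\sum_{x \in X_{n,j}^{-1}} \exp\left(-\frac{1}{2\sigma^2}\|r - sH\|^2 + \frac{1}{2}x^T L_A\right)}$$

$$= L_A(x_j^n) + \underbrace{\ln\frac{\sum_{x \in X_{n,j}^{+1}} \exp\left(-\frac{1}{2\sigma^2}\|r - sH\|^2 + \frac{1}{2}x_{[n,j]}^T L_{A,[n,j]}\right)}{\sum_{x \in X_{n,j}^{-1}} \exp\left(-\frac{1}{2\sigma^2}\|r - sH\|^2 + \frac{1}{2}x_{[n,j]}^T L_{A,[n,j]}\right)}}_{L_E(x_j^n \mid r)}, \quad n = 1, \ldots, n_T, \; j = 1, \ldots, q \qquad \text{(Equation 10)}$$

where $x$ is a sequence of possible transmitted bits, $L_A$ is a vector of $L_A$-values of $x$, $s$ is a vector of possible transmitted symbols, i.e. $s = \mathrm{map}(x)$, $x_{[n,j]}$ denotes the sub-vector of $x$ obtained by omitting its element $x_j^n$, and $L_{A,[n,j]}$ denotes the vector of all $L_A$-values, also omitting the element corresponding to bit $x_j^n$; and where $\|\cdot\|$ denotes the Euclidean norm. The set $X_{n,j}^{+1}$ is the set of bit vectors $x$ having $x_j^n = +1$, i.e. $X_{n,j}^{+1} = \{x \mid x_j^n = +1\}$, and $X_{n,j}^{-1} = \{x \mid x_j^n = -1\}$. The symbol $s$ is the mapping of the possible transmitted bit vector $x$. The functions $L_P(\cdot)$, $L_A(\cdot)$ and $L_E(\cdot)$ denote the a posteriori, a priori and extrinsic likelihood ratio respectively.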
According to Equation 10, APP detection requires an exhaustive evaluation of $2^{q \cdot n_T}$ distance metrics $\|r - sH\|^2$, corresponding to the number of elements in the sets $X_{n,j}^{+1}$ and $X_{n,j}^{-1}$. The computational complexity of APP detection increases exponentially with the number of bits per symbol $q$ and the number of spatially multiplexed transmitted symbols $n_T$.
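Here we describe an efficient method to evaluate the max-log approximation of Equation 10 for each bit,

$$L_P(x_j^n \mid r) \approx \max_{x \in X_{n,j}^{+1}}\left\{-\frac{1}{2\sigma^2}\|r - sH\|^2 + \frac{1}{2}x^T L_A\right\} - \max_{x \in X_{n,j}^{-1}}\left\{-\frac{1}{2\sigma^2}\|r - sH\|^2 + \frac{1}{2}x^T L_A\right\}, \quad n = 1, \ldots, n_T, \; j = 1, \ldots, q \qquad \text{(Equation 11, max-log approximation)}$$

by searching for the candidates that provide the $\max\{\cdot\}$ term for $x \in X_{n,j}^{+1}$ and $x \in X_{n,j}^{-1}$ for each transmitted bit $x_j^n$, without exhaustively evaluating the term $-\frac{1}{2\sigma^2}\|r - sH\|^2 + \frac{1}{2}x^T L_A$ in Equation 11 for all possible $x$. Note that since there are $(q \cdot n_T)$ transmitted bits, there are $(q \cdot n_T)$ operations which evaluate Equation 11.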
A sphere decoder search algorithm is used to search for the candidates that satisfy the condition

$$\|r - sH\|^2 - \sigma^2 x^T L_A \le \rho^2 \qquad \text{(Equation 12)}$$

For every candidate found, the bound $\rho^2$ is reduced until one candidate is found that satisfies the minimum metric $\|r - sH\|^2 - \sigma^2 x^T L_A$ for a particular bit. More particularly the search procedure is performed for every bit $x_j^n$ to find the two candidates that satisfy the following optimisations:

$$s_+ = \arg\min_{x \in X_{n,j}^{+1}}\left\{\|r - sH\|^2 - \sigma^2 x^T L_A\right\} \quad \text{for bit } x_j^n = +1$$

and

$$s_- = \arg\min_{x \in X_{n,j}^{-1}}\left\{\|r - sH\|^2 - \sigma^2 x^T L_A\right\} \quad \text{for bit } x_j^n = -1,$$

where $n = 1, \ldots, n_T$ and $j = 1, \ldots, q$. The corresponding distance metrics are obtained for the two candidates, $d_{n,j,+}^2$ and $d_{n,j,-}^2$, where

$$d_{n,j,+}^2 = \|r - s_+ H\|^2 - \sigma^2 x_+^T L_{A,+} \qquad \text{(Equation 13)}$$

and

$$d_{n,j,-}^2 = \|r - s_- H\|^2 - \sigma^2 x_-^T L_{A,-} \qquad \text{(Equation 14)}$$

The vectors $x_+$, $x_-$ and $L_{A,+}$, $L_{A,-}$ correspond to the bit sequences and a priori information of the symbols $s_+$ and $s_-$.
The max-log-MAP approximation of the LLR value, from which the extrinsic LLR (log likelihood ratio) value is obtained, is given by:

$$L_P(x_j^n \mid r) \approx \frac{1}{2\sigma^2}\left(-d_{n,j,+}^2 + d_{n,j,-}^2\right) \qquad \text{(Equation 15, max-log approximation)}$$

The relationship between $L_P$ and $L_E$ is given by $L_P = L_A + L_E$.

The noise variance may be obtained in any convenient manner, depending upon the overall system design. For example, the noise variance may be obtained during the training period in which the channel impulse response is estimated. During the training period, the transmitted symbol sequence is known. Together with the estimated channel impulse response, the 'noiseless' received signal is obtained. The noise variance may be estimated by evaluating the noise statistics of the sequence of received signals during the training period, knowing the sequence of 'noiseless' received signals.
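The final step from distance metrics to bit likelihoods, and the noise variance estimate it relies on, can be sketched in C as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the metrics are taken to include the a priori term as in Equations 13 and 14 (so their scaled difference is treated as the a posteriori value and the extrinsic value is obtained by subtracting the a priori LLR), and the variance estimator is a plain mean of squared errors per real component.

```c
/* Equation 15: max-log LLR from the two minimum metrics.
   sigma2 is the noise variance per real component. */
double maxlog_llr(double d2_plus, double d2_minus, double sigma2)
{
    return (d2_minus - d2_plus) / (2.0 * sigma2);
}

/* L_P = L_A + L_E, so the extrinsic part is L_P - L_A. */
double extrinsic_llr(double llr_posterior, double llr_prior)
{
    return llr_posterior - llr_prior;
}

/* Noise variance per real component, estimated over a training period as the
   mean squared difference between received and 'noiseless' real samples. */
double estimate_noise_variance(const double *received, const double *noiseless,
                               int n)
{
    double acc = 0.0;
    for (int i = 0; i < n; i++) {
        double err = received[i] - noiseless[i];
        acc += err * err;
    }
    return acc / n;
}
```

A bit whose +1 candidate lies closer to the received signal than its -1 candidate yields a positive LLR, and the magnitude grows as the noise variance shrinks.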
Referring to Figure 2, this shows a block diagram of a max-log MAP decoder 200 configured to determine bit likelihood values in accordance with the max-log approximation of Equation 15. The decoder comprises a plurality of hard detectors or decoders 202a-c, 204a-c, each configured to determine a distance metric for a possible value of a particular bit $x_j^n$, +1 for detectors/decoders 202 and -1 for detectors/decoders 204, according to Equations 13 and 14 respectively, based upon input values for $r$, $H$, $\sigma$ and, where available, $L_A(x)$. In this embodiment, $n$ runs over the transmit antennas and $j$ runs over the bits of a constellation symbol. Each of these detectors/decoders 202, 204 provides a distance metric value to an output stage 206 that determines a bit likelihood value for each bit of the transmitted string of symbols according to Equation 15. The likelihood values may comprise "extrinsic" and/or a posteriori bit likelihood values. The skilled person will appreciate that the detectors/decoders 202, 204 may be implemented in series, for example as repeated instances of a software process, or in parallel, or in a combination of serial and parallel processes.
A detector/decoder 202, 204 need only provide a hard output, that is an output identifying a most likely candidate with a particular bit value $x_j^n$ being +1 or -1 and/or providing a minimum distance metric $d_{n,j,+}^2$ or $d_{n,j,-}^2$. Thus the skilled person will appreciate that the arrangement of Figure 2 may employ any maximum likelihood hard detectors/decoders that can provide the appropriate distance metrics. However in a preferred embodiment hard detectors/decoders 202, 204 are implemented using one or more sphere decoders.
For the received vector $r$, either candidate $s_+$ or $s_-$ is the maximum likelihood estimate $s_{ML}$; that is, the maximum likelihood solution provides one set of bit values $x_{ML}$ and corresponding distance metrics $d_{ML}^2$. Thus maximum likelihood sphere decoding can be performed first and the bit-wise sphere decoding may then be performed to obtain the distance metrics, $d_{\overline{ML}}^2$, for the bit values which do not correspond to the maximum likelihood symbol estimate.
Figure 3 shows a block diagram of a max-log decoder 300 configured to determine bit likelihood values in this way, and employing sphere decoders as hard detectors. In Figure 3 hard detection blocks 304a-c and output stage 306 correspond, respectively, to a combination of those detectors/decoders 202 and 204 which correspond to the set of non-maximum-likelihood bit sequences, and to output stage 206 of Figure 2. An additional hard detector 302, preferably a sphere decoder, determines a maximum likelihood symbol string estimate $s_{ML}$ and a demodulator 303 converts this symbol estimate to a bitwise estimate $x_{ML}$; hard detection sphere decoder 302 also provides a corresponding bit likelihood value $d_{ML}^2$ (common to all the bits of $x_{ML}$).
Preferably a set of lattice points is searched according to an increasing distance from an estimated constellation symbol, for example according to the Schnorr-Euchner strategy described in C P Schnorr and M Euchner, "Lattice basis reduction: Improved practical algorithms and solving subset sum problems", Math. Programming, vol 66, pp. 181-191, 1994 and in Agrell et al. (ibid). Other methods for ordering the symbols to be searched using a look-up table are described in A. Wiesel et al. (ibid), incorporated by reference.
As an example, consider the case of a two transmit antenna system with a 4 PAM (Pulse Amplitude Modulation) symbol constellation, $C = \{-3, -1, 1, 3\}$, corresponding to the symbol mapping of the bits $\{-1\,{-1},\; -1\,{+1},\; +1\,{+1},\; +1\,{-1}\}$, where the maximum likelihood estimate is found to be $s_{ML} = [-1 \;\; 3]$ with $x_{ML} = [-1 \;\; {+1} \;\; {+1} \;\; {-1}]$ and distance metric $d_{ML}^2$. In this example bit-wise sphere decoding is then performed for the sets $X_{1,1}^{+1}$, $X_{1,2}^{-1}$, $X_{2,1}^{-1}$, $X_{2,2}^{+1}$ to obtain the distance metrics $d_{1,1,+}^2$, $d_{1,2,-}^2$, $d_{2,1,-}^2$ and $d_{2,2,+}^2$, since $d_{1,1,-}^2 = d_{1,2,+}^2 = d_{2,1,+}^2 = d_{2,2,-}^2 = d_{ML}^2$.
This can then be used to obtain LLR values for the bits as described above.
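The bookkeeping in the 4 PAM example can be checked with a small C sketch. The table and function names are hypothetical and the code only reproduces the labelling given in the example: it demodulates the ML symbol string into its bit string and then lists, for every bit, the opposite value to which the corresponding bit-wise search is constrained.

```c
/* 4 PAM labelling from the example: the bit pairs (-1,-1), (-1,+1), (+1,+1),
   (+1,-1) map to the symbols -3, -1, +1, +3 respectively. */
static const int PAM4_SYMBOL[4] = {-3, -1, 1, 3};
static const int PAM4_BITS[4][2] = {{-1, -1}, {-1, 1}, {1, 1}, {1, -1}};

/* Demodulate a symbol string into its bit string (two bits per symbol).
   Returns 0 on success, -1 if a symbol is not in the constellation. */
int pam4_demap(const int *symbols, int n_symbols, int *bits)
{
    for (int n = 0; n < n_symbols; n++) {
        int found = 0;
        for (int i = 0; i < 4; i++) {
            if (PAM4_SYMBOL[i] == symbols[n]) {
                bits[2 * n] = PAM4_BITS[i][0];
                bits[2 * n + 1] = PAM4_BITS[i][1];
                found = 1;
            }
        }
        if (!found)
            return -1;
    }
    return 0;
}

/* The ML metric already covers every bit at its ML value, so each bit-wise
   search is constrained to the opposite value of that bit. */
void constrained_values(const int *x_ml, int n_bits, int *x_constrained)
{
    for (int i = 0; i < n_bits; i++)
        x_constrained[i] = -x_ml[i];
}
```

For the ML estimate of the example the constrained values come out as +1, -1, -1, +1, which matches the four sets searched in the bit-wise phase.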
In order to increase the speed of the bit-wise sphere decoding after obtaining the maximum likelihood distance metrics $d_{ML}^2$, the initial search radius can be bounded using the ML solution, for example by $\rho^2 = 2\sigma^2 L_{MAX} + d_{ML}^2$, where $L_{MAX}$ is preset to a particular value, e.g. 50.
Broadly speaking, the above described max-log-MAP procedure provides a fast method of determining the LLR for a bit, $x$, approximated by:

$$L(x) \propto \min_{s \in (x=1)} \|Hs - r\|^2 - \min_{s \in (x=0)} \|Hs - r\|^2$$

We next outline the sphere decoding process. Referring to Figure 4a, broadly speaking, the hierarchy of dimensions in the signal space is represented by a tree. For example, in a MIMO system with four transmit antennas using 16QAM modulation the tree can be structured to divide 16 ways at each node (16 constellation points per transmitter) and has four layers (one per transmitter) ending in $16^4$ (> 65000) terminal nodes representing the "super-constellation". Alternatively, as shown in Figure 4a, if the I and Q components at each transmitter are separated, the tree can be restructured with four branches (four distinct I or Q values for each transmitter constellation point) at each of eight layers (two layers per transmitter).
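The size of the search tree in the two structurings described above can be confirmed with a few lines of C. The helper name `leaf_count` is an assumption for illustration; it simply multiplies out the branching factor over the layers.

```c
/* Number of terminal nodes of a tree that divides `branches` ways at each of
   `layers` layers. */
unsigned long leaf_count(unsigned branches, unsigned layers)
{
    unsigned long leaves = 1;
    for (unsigned i = 0; i < layers; i++)
        leaves *= branches;
    return leaves;
}
```

Both `leaf_count(16, 4)` and `leaf_count(4, 8)` describe the same super-constellation: separating I and Q changes the shape of the tree, not the number of terminal nodes.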
Given a received vector in a multidimensional space, it is relatively easy to compute the "perpendicular" distance d to the nearest planes (or, in general, hyperplanes). Having relocated the received vector to the closest point on a plane, the perpendicular distance to the next plane in a lower dimension can be calculated and the overall distance from the starting point can be accumulated. A sphere decoder algorithm is generally a recursive, depth-first search of the tree representing the hierarchy of hyperplanes. Once the distance to a point in the constellation has been established by descending to the terminal layer, an exhaustive search of the tree is avoided because the descent along other branches can be abandoned whenever the accumulated distance exceeds the current best (the sphere radius). Each successful descent to the terminal layer results in a candidate for closest constellation point and a corresponding reduction in the sphere radius ($\sqrt{C}$). An example of a sphere decoding procedure is shown below:

    SphereDecoder(H, r)
        D <- 0
        e <- zero-forcing estimate
        closest <- e
        k <- dimension of search space
        descend(e, k, D)    // descend into subtree rooted at estimate e on level k

    descend(e, k, D)
        s <- lookup(e, k)   // returns a list of the planes at level k ordered closest to e
        for each in turn of the hyperplanes in the vector s
            d <- perpendicular distance from received point
            D' <- D + d*d   // accumulate distance metric
            if D' < C       // less than current sphere radius?
                if not at terminal layer
                    adjust received vector to the closest point on the relevant hyperplane
                    and compute corresponding point e' in transmitter space
                    descend(e', k-1, D')
                else (at terminal layer)
                    C <- D'         // update sphere radius
                    closest <- e'
                    break           // there cannot be a better leaf node in this subtree
                endif
            else
                break               // subsequent branches cannot lead to a closer solution
            endif
        endfor

Example sphere decoding procedure

Figure 4b shows an example of a tree search in three dimensions.
The tree illustrates a search of a regular three dimensional lattice of points in which the layers are separated by 10, 9 and 8 units. The branches are ordered with the closest layer in each dimension in the left-most branch. For example, in the full lattice, the most widely spaced layers are 10 units apart so, unless it lies well outside the lattice, the received point cannot be more than 5 units from any layer. In this illustration the diagram shows the closest layer at a distance of 4 units. The figures inside the boxes show the order in which nodes are visited and the figures on the branches show the partial distances to the nearest planes with a lower dimension. The accumulated squared distance is indicated at the end of each branch for which a calculation has been made. The shortened branches indicate where no calculation of distance or estimate needs to be made because the current sphere radius has already been exceeded in a "closer" branch.
The distance calculations in the tree search are considerably simplified by a coordinate transformation of the channel matrix H, as the aim is to measure the perpendicular distance from a point in the received signal space to the nearest hyperplane. A rotation can be applied to the channel matrix H so that it takes an upper triangular form R with the most widely spaced planes perpendicular to a principal axis. This is illustrated diagrammatically in Figures 4c and 4d. The distance can then be calculated directly using an element from the principal diagonal of R. The rotation can be achieved by a sorted QR decomposition of the channel matrix using a modified form of the Gram-Schmidt algorithm (see, for example, G.H. Golub and C.F. van Loan, Matrix Computations, Johns Hopkins University Press, 1983 and Wubben et al, "Efficient algorithm for decoding layered space-time codes", Electronics Letters, Vol 37, No 22, pp 1348-1350, Oct 2001).
Returning again to a (simplified) mathematical notation, for a wireless communication system which has M transmit antennas and N receive antennas the received signal r may be written as a vector of complex elements $r = Hs + v$, where H is an N x M matrix of complex coefficients, s is a vector of complex elements representing the transmitted symbol and v is a vector of complex elements representing additive white Gaussian noise. Using real elements, the complex matrix equation can be rewritten as:

$$\begin{bmatrix} \mathrm{Re}(r) \\ \mathrm{Im}(r) \end{bmatrix} = \begin{bmatrix} \mathrm{Re}(H) & -\mathrm{Im}(H) \\ \mathrm{Im}(H) & \mathrm{Re}(H) \end{bmatrix} \begin{bmatrix} \mathrm{Re}(s) \\ \mathrm{Im}(s) \end{bmatrix} + \begin{bmatrix} \mathrm{Re}(v) \\ \mathrm{Im}(v) \end{bmatrix}$$

or, writing r, H, s and v from here on for the corresponding arrays of real elements, $r = Hs + v$, and optimal detection of the transmitted symbol comprises finding the most likely symbol $\hat{s}$ such that $\hat{s} = \arg\min_s \|r - Hs\|^2$. We can define the unconstrained zero-forcing estimate $e_{ZF}$ such that $r = He_{ZF}$ (or $e_{ZF} = H^{+}r$ where $H^{+} = (H^H H)^{-1}H^H$).

An upper triangular matrix R can be found (e.g. from the QR decomposition of H) such that $R^T R = H^T H$, and then $\hat{s} = \arg\min_s \|H(e_{ZF} - s)\|^2$ for all s in the transmitted constellation, that is $\hat{s} = \arg\min_s \|R(e_{ZF} - s)\|^2$. By substituting $H = QR$ it can be seen that $QRe_{ZF} = r$, $Re_{ZF} = Q^T r = b$ and thus the initial estimate (the zero-forcing solution) can be calculated from $e_{ZF} = R^{-1}b = Fb$, where F is the inverse of R. The tree search which constitutes the greater part of the sphere decoder therefore has a preparatory phase in which R, its inverse F, and b (a transformed version of the received vector) are calculated. R and F are updated relatively infrequently and b is computed each FFT symbol period (4 µs for an 802.11 OFDM system), so the computation of the input parameters is relatively straightforward.
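The preparatory phase can be sketched in C as follows, assuming R has already been obtained from a QR decomposition and that b is the rotated received vector. The function names `invert_upper` and `zero_forcing` and the fixed size `DIM` are hypothetical; the inversion uses ordinary back substitution, which is one convenient choice and is not stated in the text.

```c
#define DIM 3  /* illustrative problem size */

/* F = inverse of the upper triangular matrix R (non-zero diagonal assumed),
   found by back substitution one column at a time. */
void invert_upper(double R[DIM][DIM], double F[DIM][DIM])
{
    for (int j = 0; j < DIM; j++) {
        for (int i = DIM - 1; i >= 0; i--) {
            if (i > j) {
                F[i][j] = 0.0;        /* the inverse is also upper triangular */
                continue;
            }
            double acc = (i == j) ? 1.0 : 0.0;
            for (int m = i + 1; m <= j; m++)
                acc -= R[i][m] * F[m][j];
            F[i][j] = acc / R[i][i];
        }
    }
}

/* Zero-forcing estimate e_ZF = F b, where b is the rotated received vector. */
void zero_forcing(double F[DIM][DIM], const double *b, double *e)
{
    for (int i = 0; i < DIM; i++) {
        e[i] = 0.0;
        for (int j = i; j < DIM; j++)   /* F is upper triangular */
            e[i] += F[i][j] * b[j];
    }
}
```

Because R and F change only when the channel estimate changes, `invert_upper` would run rarely, while `zero_forcing` would run once per received vector.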
As previously mentioned, in the real-valued representation each symbol s in the transmitted constellation can be represented as a vector with dimension 2M. Each component in this vector takes a value q corresponding to one of the quantisation levels in either the in-phase (I) or quadrature (Q) components of the constellation. So, for example, in a 16QAM constellation each component of s is assigned one of four possible q values. A transmitted symbol can therefore be represented as a terminal node of a tree in which the branching at each level represents a real component of the transmitted symbol. Thus, for a system with four transmit antennas (M=4) the tree has eight levels and in a 16QAM symbol constellation, for example, the tree branches four ways at each node. Computation of the distance metric $D = \|R(e - s)\|^2$ can be performed recursively as previously indicated. An example procedure (in C code) to descend the tree is shown below:

    // descend into subtree rooted at estimate e at layer k
    // e is an estimate of the transmitted vector
    // N transmitters
    // P planes per transmitter
    // R[N][N] is the rotated channel matrix in upper triangular form
    // F is the inverse of R
    // C holds the current squared sphere radius
    void descend(float *e, int k, float D)
    {
        int i, j, s[P];
        float d, DD, ee[N];
        void lookup(float x, int k, int *s);

        lookup(e[k], k, s);                 // find closest planes and
        for (i = 0; i < P; i++) {           // for each of these planes
            d = (e[k] - s[i]) * R[k][k];
            DD = D + d*d;                   // update distance metric
            if (DD < C) {                   // if still inside sphere
                for (j = 0; j < N; j++)
                    ee[j] = e[j] - d*F[j][k];   // update estimate and
                if (k > 0)                  // if not at leaf plane
                    descend(ee, k-1, DD);   // explore lower dimensions
                else {
                    C = DD;                 // inside sphere at terminal node
                    break;                  // look no further
                }
            } else
                break;
        }
    }

Example sphere decoder tree-descend function

With the symbolic names used above the majority of descents in the tree involve:
1. s = lookup(e,k): comparison and table lookup
2. δ = e[k] - s[i]: a subtraction to find a separation distance in the transmitter space
3. d = δ * R[k][k]: multiplication to find perpendicular distance in the receiver space
4. dd = d*d: squaring by multiplication
5. D' = D + dd: accumulate by addition
6. δe[i] = d*F[i][k]: adjust vector e (multiplications in parallel for each component)
7. e'[i] = e[i] - δe[i]: parallel subtraction to calculate new estimate in transmitter space

The computation in stages 6 and 7 can be performed concurrently with that in stages 4 and 5. Part of the hardware we describe involves a single processing element used to implement iterative execution of the recursive structure of the algorithm.
Using notation based on the above, the sphere decoding procedure, or more particularly computation of the distance metric $D = \|R(e - s)\|^2$, can be expressed (performed recursively) as:

    k = 2*M - 1
    D[k] = 0
    e[k] = zero-forcing estimate
    δe[k] = e[k][k] - s[k][i]
        where s[k][i] is the ith closest quantisation level in the symbol
        constellation selected as the kth component of s
    e[k-1] = e[k] - δe[k]*Fn[:][k]
        where F is the inverse of R and Fn[:][k] is a vector formed by column k
        of the normalised matrix in which Fn[i][j] = F[i][j]/F[j][j]
    D[k-1] = D[k] + (R[k][k] * δe[k])^2

Since all the elements on the main diagonal of Fn have a known value (= 1) these are replaced in the hardware implementation by the elements $(R[k][k])^2$ and the composite matrix is labelled A. This is a convenience for a hardware implementation of the computation, rather than a mathematical representation, with the aim of making more efficient use of data storage.
The overall distance metric D is accumulated at each stage in the descent of the tree as a sum of partial distance metrics. A "sphere radius" can be specified as a constraint so that, in a depth-first traversal of the tree, the search is abandoned if the accumulated value of D exceeds C (the square of the sphere radius). A minimum overall distance metric can be arrived at most readily by preferentially selecting for computation the branches with the smallest partial distance metric.
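The depth-first search with a shrinking sphere radius can be put together as a small self-contained C sketch. It follows the descend procedure described earlier, but the `SphereDecoder` structure, the `order_levels` helper (which sorts the four levels by proximity instead of using a lookup table), the fixed sizes `DIM` and `P`, and the choice of `DBL_MAX` as an initial unbounded radius are all assumptions made for illustration.

```c
#include <float.h>

#define DIM 3   /* real dimensions of the search (illustrative) */
#define P 4     /* quantisation levels per dimension */

static const double LEVELS[P] = {-3.0, -1.0, 1.0, 3.0};

typedef struct {
    double R[DIM][DIM];   /* upper triangular channel factor */
    double F[DIM][DIM];   /* inverse of R, supplied by the caller */
    double C;             /* current squared sphere radius */
    double path[DIM];     /* levels chosen on the current descent */
    double best[DIM];     /* closest constellation point found so far */
} SphereDecoder;

/* Sort the levels by increasing distance from x (insertion sort on P items). */
static void order_levels(double x, double *out)
{
    for (int i = 0; i < P; i++) {
        double di = (LEVELS[i] - x) * (LEVELS[i] - x);
        int j = i;
        while (j > 0 && (out[j - 1] - x) * (out[j - 1] - x) > di) {
            out[j] = out[j - 1];
            j--;
        }
        out[j] = LEVELS[i];
    }
}

static void descend(SphereDecoder *sd, const double *e, int k, double D)
{
    double s[P], ee[DIM];
    order_levels(e[k], s);
    for (int i = 0; i < P; i++) {
        double d = (e[k] - s[i]) * sd->R[k][k];   /* receiver-space distance */
        double DD = D + d * d;                    /* partial metrics add up */
        if (DD >= sd->C)
            break;              /* later branches are further away still */
        sd->path[k] = s[i];
        if (k > 0) {
            for (int j = 0; j < DIM; j++)
                ee[j] = e[j] - d * sd->F[j][k];   /* refined estimate */
            descend(sd, ee, k - 1, DD);
        } else {
            sd->C = DD;         /* new, smaller sphere */
            for (int j = 0; j < DIM; j++)
                sd->best[j] = sd->path[j];
            break;              /* no better leaf below this node */
        }
    }
}

/* Returns the minimum of ||R(e_zf - s)||^2; the minimiser is left in sd->best. */
double sphere_decode(SphereDecoder *sd, const double *e_zf)
{
    sd->C = DBL_MAX;
    descend(sd, e_zf, DIM - 1, 0.0);
    return sd->C;
}
```

Because the branches at each level are visited in order of increasing partial distance, the first branch that falls outside the sphere allows all remaining branches at that node to be skipped, which is where the saving over an exhaustive search comes from.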
Apart from the constraint on the sphere radius, it is useful to be able to limit the search to members of the constellation in which a particular bit has a known value, in particular for implementing a max-log-MAP procedure. In terms of the tree representation, this comprises limiting the branching at a particular level in the tree to (say) half the full range, for example in a 16QAM constellation to two branches rather than four.
Consider an 802.11 OFDM system in which the 48 sub-carriers are processed concurrently over a symbol period of 4 µs. With the timing suggested above a single processing element can perform approximately 160 descents during a symbol period.
Simulations confirm that the tree search typically requires several tens of descents but that there are rare occasions when much longer searches are required. Nevertheless, it is relatively straightforward, for example, to limit the search time and still obtain good results. Hence (for example) 48 processing elements operating independently on the subcarriers can provide a viable hard-decision decoder for a 4x4 MIMO configuration giving a raw data rate of 192 Mbps.
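The raw data rate quoted above follows from simple arithmetic, sketched here in C. The function name and parameter names are hypothetical; the calculation assumes one 16QAM symbol (4 bits) per transmit antenna per sub-carrier in each 4 µs symbol period.

```c
/* Raw data rate in Mbit/s: in every symbol period each sub-carrier carries one
   constellation symbol per transmit antenna. Bits per microsecond equal Mbit/s. */
double raw_rate_mbps(int subcarriers, int tx_antennas, int bits_per_symbol,
                     double symbol_period_us)
{
    return subcarriers * tx_antennas * bits_per_symbol / symbol_period_us;
}
```

With 48 sub-carriers, 4 transmit antennas, 4 bits per symbol and a 4 µs period this gives 768 bits per period, that is 192 Mbps.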
In a soft-decision decoder each bit in the transmitted symbol would, in principle, require two search engines. One would search for the closest point in the constellation with a '1' in a particular position and its companion would search for the closest point with a '0' in the same position. These searches would therefore be conducted over a subconstellation with only half the points of the full version; the branching in the quadtree representing the search space would be restricted at one of its levels to just two ways by manipulation of the "lookup" phase of the descend function for a particular value of k.
A full soft-decision decoder (in the above example) would therefore require 32 processing elements per sub-carrier symbol. However, a reduction can be made in the hardware budget at the expense of a reduction in parallelism (if desired) by performing the search in two phases. In phase #1 a search is conducted over the entire constellation for the closest point, providing a hard decision and an associated distance metric for all the bits in the ML solution. In phase #2 a set of processing elements, each operating on a "half-constellation", returns distance metrics for the inverse of each of the bits in the ML solution. This approximately halves the number of processing elements required for a soft decision.
Figure 5a shows a receiver 500 including a sphere decoder incorporating a hardware accelerator embodying an aspect of the present invention. Receiver 500 comprises one or more receive antennas 502a, b (of which two are shown in the illustrated embodiment) each coupled to a respective RF front end 504a, b, and thence to a respective analogue-to-digital converter 506a, b and to digital signal processor (DSP) 508. DSP 508 performs control and data handling functions as well as providing an interface to a sphere decoder hardware accelerator 522, described further below, via bi-directional data bus 512.
A sphere decoder system 514, shown in Figure 5b, is provided by a combination of functions performed by DSP 508 and hardware accelerator 522. The sphere decoder system comprises a symbol pre-processing block 516 to calculate an initial zero-forcing estimate, a channel matrix preprocessing block 518 to perform the QR decomposition, and a bit LLR computation block 518 to perform bit LLR computations as described above, all preferably implemented on DSP 508, for example by code in permanent program memory. Hardware accelerator 522 receives as inputs (in the illustrated embodiment) matrix A, initial estimate $e_{ZF}$, sphere radius information, and mode (constrained or unconstrained search) selection information m, performs symbol processing, and provides (in the illustrated embodiment) decision data comprising distance data (specifying a distance of an evaluated tree node/string of symbols from the received signal) and optionally decision data (specifying, for example, an ML symbol string).
Figure 5c shows subunits of symbol processing unit 522, in particular a set of one or more registers 524 for storage of the variables used in the computation, closely coupled to a set of one or more arithmetic units 526. For example, for the above mentioned case of a 4 transmit antenna MIMO system with 16QAM modulation (16 bits per transmitted vector) and a desired raw data throughput of 100 Mbps the arithmetic units may comprise two integer multipliers, an adder/subtracter and a small bank of subtracters.
Generally there will also be a local controller (not shown in Figure 5c) to provide timing and other control functions. Symbol processing unit 522 also comprises a branch selection unit 528 and a decision stack storage module 530.
Figure 5d shows details of the branch selection unit 528 of Figure 5c. Branch selection unit 528 comprises one or more magnitude comparators 532 and a lookup table 534 and, broadly speaking, is used to order the selection of symbol components at each stage in the evaluation of the distance metric.
We next describe a preferred embodiment of the branch selection unit 528 in more detail.
The branch selection unit orders the selection of component (symbol) values for inclusion in the computation of the distance metric, preferably following the Schnorr-Euchner strategy (ibid) in which the branches are ordered according to their proximity to a current estimate of the relevant component. A comparison is made with a number of constant values (quantisation levels of real and imaginary parts of a constellation symbol) and the outputs of the comparators 532 are used as an input to lookup table 534 which, together with an index, produces a quantised component value and a branch decision.
Figure 6 shows an example of a 16QAM constellation, Gray-coded so that only a single bit changes at a time (for example, along the real axis values -3, -1, +1, +3 are represented by 00, 10, 11, 01). This form of coding is preferred as it helps to separate adjacent symbols, although applications of embodiments of the invention are not limited to Gray coded signals. In the tree representation there are therefore four branches at each level (real and imaginary components being separated in this example). In a constrained search, where one of the two bits has a defined value, there are two branches per level.
As previously mentioned, beginning with the initial, zero-forcing estimate, e[7] in the above notation with M = 4, this estimate is progressively refined as symbols of the string are selected (components of s are quantised), descending through the tree, giving new estimates e[6], e[5] and so forth. At each stage of the descent (at level k), the kth component of the kth estimate (e[k][k]) is compared with the constellation values and one of them is selected. All symbol comparisons are made in the transmitted symbol domain as comparisons in the received symbol domain would be complicated because the constellation itself is transformed by the channel. Distance calculations and comparisons, on the other hand, are made in the received signal domain (hence the use of matrix elements to scale the δe distance before it is accumulated into D).
Referring again to Figure 5d, the kth component of the kth estimate (e[k][k]) provides an input to the branch selection unit 528, more specifically to comparators 532. In one embodiment lookup table 534 stores branch ordering information for an unconstrained search. An example of such branch ordering information, for a 16QAM search, is given in Table 1 below, where a, b, and c comprise outputs of comparators 532 for |e| > 2, |e| > 1, and sign(e) respectively.
    |e| > 2   |e| > 1   sign(e)        order
       a         b         c       0   1   2   3
       1        (1)        0       3   2   1   0
       0         1         0       2   3   1   0
       0         0         0       2   1   3   0
       0         0         1       1   2   0   3
       0         1         1       1   0   2   3
       1        (1)        1       0   1   2   3

Table 1: Branch ordering in an unconstrained 16QAM search

In Table 1 the columns labelled order correspond to successive steps in searching branches of the tree so that, for example, for a=1, b=1, c=0 step 0 indexes branch 3, step 1 indexes branch 2, step 2 indexes branch 1, and step 3 indexes branch 0 of the tree.
The step value is incremented, for example, by a symbol processing unit controller. The branch ordering in an unconstrained search can be determined by comparison with three constant values to produce the ordering shown in Table 1 (and also Table 2, described later). For example, an input value of +0.5 would result in a branching order [2, 1, 3, 0] corresponding to the ordered signal levels [+1, -1, +3, -3].
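The comparator and lookup arrangement for an unconstrained search can be sketched in C as below. The array and function names are hypothetical, the table is indexed by the comparator outputs packed as a three-bit number, and the convention that sign(e) is 1 for a negative input is an assumption inferred from the +0.5 example above. The two index values that the comparators can never produce (|e| > 2 without |e| > 1) are marked with -1.

```c
/* Branch index -> signal level: 0 -> -3, 1 -> -1, 2 -> +1, 3 -> +3. */
static const int BRANCH_LEVEL[4] = {-3, -1, 1, 3};

/* Branch order for each comparator state, indexed by (a << 2) | (b << 1) | c. */
static const int ORDER[8][4] = {
    { 2,  1,  3,  0},   /* a=0 b=0 c=0 */
    { 1,  2,  0,  3},   /* a=0 b=0 c=1 */
    { 2,  3,  1,  0},   /* a=0 b=1 c=0 */
    { 1,  0,  2,  3},   /* a=0 b=1 c=1 */
    {-1, -1, -1, -1},   /* a=1 b=0 c=0: cannot occur */
    {-1, -1, -1, -1},   /* a=1 b=0 c=1: cannot occur */
    { 3,  2,  1,  0},   /* a=1 b=1 c=0 */
    { 0,  1,  2,  3},   /* a=1 b=1 c=1 */
};

/* Branch to explore at the given step (0..3) for estimate component e. */
int unconstrained_branch(double e, int step)
{
    double mag = (e < 0.0) ? -e : e;
    int a = mag > 2.0;     /* |e| > 2 */
    int b = mag > 1.0;     /* |e| > 1 */
    int c = e < 0.0;       /* sign(e): 1 when negative (assumed convention) */
    return ORDER[(a << 2) | (b << 1) | c][step];
}
```

Stepping through `unconstrained_branch(0.5, step)` for steps 0 to 3 visits branches 2, 1, 3, 0, that is the levels +1, -1, +3, -3 in order of increasing distance from the input.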
In preferred embodiments, however, lookup table 534 stores combined branch ordering information for both unconstrained and constrained searches. It can be seen that in Table 1 the values indicated by (1) are logically redundant. Broadly the aim is to re-use the ordering information present in Table 1, exploiting the redundancy in a, b, and c.
In a Gray coded constellation only a single bit changes value between adjacent pairs of branches. In the aforementioned (16QAM) example of Figure 6 branch 0 corresponds to 00 (-3), branch 1 corresponds to 10 (-1), branch 2 corresponds to 11 (+1), and branch 3 corresponds to 01 (+3). In a constrained search, having fixed a particular bit decision, the branching decision at one of the levels in the tree will be confined, here to just two choices. In a Gray-coded 16QAM constellation the choice will be between branches {0,1}, {1,2}, {2,3}, or {0,3}. If the four quantisation levels in a 16QAM constellation have the bit assignment (b1,b0) shown in Figure 6 then, given b0=0, the choice is between q0 and q1. Alternatively, given b1=0, the choice is between q0 and q3.
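The pairs of branches left by fixing one bit can be derived mechanically from the Gray labels, as this C sketch shows. The array and function names are hypothetical; the labels are those given above for branches 0 to 3, with b1 taken as the left-hand bit and b0 as the right-hand bit.

```c
/* Gray labels (b1,b0) of branches 0..3, i.e. levels -3, -1, +1, +3:
   00, 10, 11, 01. */
static const int GRAY_B1[4] = {0, 1, 1, 0};
static const int GRAY_B0[4] = {0, 0, 1, 1};

/* Collect the branches whose bit `bit` (0 for b0, 1 for b1) equals `value`.
   Returns the number of branches written to `pair` (two for these labels). */
int constrained_pair(int bit, int value, int *pair)
{
    int count = 0;
    for (int branch = 0; branch < 4; branch++) {
        int label_bit = bit ? GRAY_B1[branch] : GRAY_B0[branch];
        if (label_bit == value)
            pair[count++] = branch;
    }
    return count;
}
```

Fixing b0=0 leaves branches {0,1} and fixing b1=0 leaves {0,3}, in agreement with the text; the remaining two constraints give {2,3} and {1,2}.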
Inspection of Table 1 reveals that the choices {0,1}, {1,2}, {2,3} already exist as the decisions ordered in columns (0,1); the choice {0,3} can be added by exploiting the redundancy in the table. Table 2, below, shows how the original scheme can be adapted to achieve an efficient encoding for a constrained search.
         |e| > 2   |e| > 1   sign(e)        order
    m       a         b         c       0   1   2   3
    0       1         1         0       3   2   1   0
    1       1         1         0       3   2   --  --
    0       0         1         0       2   3   1   0
    1       0         1         0       2   3   --  --
    0       0         0         0       2   1   3   0
    1       0         0         0       2   1   --  --
    0       0         0         1       1   2   0   3
    1       0         0         1       1   2   --  --
    0       0         1         1       1   0   2   3
    1       0         1         1       1   0   --  --
    0       1         1         1       0   1   2   3
    1       1         1         1       0   1   --  --
    1       1         0         1       0   3   --  --
    1       1         0         0       3   0   --  --

Table 2: Branch ordering lookup table for unconstrained and constrained 16QAM searches.
In Table 2 the entries with m = 0 are a repeat of Table 1. The extra bit (m) specifies whether a modified (constrained) search is required (m = 1), in which case some of the bits {a,b,c} depend on a command word and not just the comparator outputs (indicated by bold italic type in the table). It will be recognised that for a constrained search the branches amongst which the search is to be constrained are specified but that their order (i.e. which is first) still needs to be determined. Broadly speaking the command word provides a code (in the example below, two bits m1,m0) specifying what (pair of) branches to search and the comparator outputs specify the order of the branches to search. This enables a single lookup table to be employed for both unconstrained and constrained searches. However as will be seen below, we describe a system which comprises more than simply using the comparators and then applying the constraint code.
In one preferred embodiment we define a 5-bit command word (m,k2,kl,kO, rnl,mO) as follows: (nil,mO) specifies the pair of branches to be searched in a constrained search (k2,kl,kO) specify the level at which the branching is constrained (3 bits for up to k7, using the previous notation).
If m=0, the values of a, b, c are always obtained from the comparators as shown in the Tables.
If m=1, the values of a, b, c are obtained as follows:
  if level != k, the values of a, b, c are obtained from the comparators
  if level = k, b := m1, then
    if m1=1, a := (|e|>2) AND (sign(e) XNOR m0), c := m0
    if m1=0, a := m0, c := sign(e)
For example, the command word (1 100 11) specifies a constrained search in which a binary decision is made at level 4 in the tree between branches 0 and 1.
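The rules above for deriving a, b, c from the command word can be sketched as follows. This is an illustrative model only: the function names parse_command and constrained_bits, the string encoding of the command word, and the convention that sign(e) = 1 for a non-negative estimate are assumptions, and the dictionary holds just the m = 1 rows of Table 2.

```python
# m = 1 rows of Table 2, keyed by (a, b, c); values are the two branches in search order.
CONSTRAINED_ORDER = {
    (1, 1, 0): (3, 2), (0, 1, 0): (2, 3),
    (0, 0, 0): (2, 1), (0, 0, 1): (1, 2),
    (0, 1, 1): (1, 0), (1, 1, 1): (0, 1),
    (1, 0, 1): (0, 3), (1, 0, 0): (3, 0),
}


def parse_command(bits):
    """Split a command word such as '110011' into (m, k, m1, m0)."""
    m = int(bits[0])
    k = int(bits[1:4], 2)  # level at which the search is constrained
    m1, m0 = int(bits[4]), int(bits[5])
    return m, k, m1, m0


def constrained_bits(e, level, m, k, m1, m0):
    """Return the table index bits (a, b, c) for estimate e at a tree level."""
    gt2 = int(abs(e) > 2)
    gt1 = int(abs(e) > 1)
    sign = int(e >= 0)  # assumption: 1 for non-negative e
    if m == 0 or level != k:
        return gt2, gt1, sign  # comparator outputs used directly
    b = m1
    if m1 == 1:
        xnor = 1 - (sign ^ m0)
        a = gt2 & xnor
        c = m0
    else:
        a = m0
        c = sign
    return a, b, c


# The example command word (1 100 11): constrain level 4 to branches 0 and 1.
m, k, m1, m0 = parse_command("110011")
first_choice = CONSTRAINED_ORDER[constrained_bits(2.5, 4, m, k, m1, m0)]
```

With this command word the lookup always yields branches 0 and 1; the comparator outputs only decide which of the two is searched first.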
The skilled person will recognise that the method may be applied to other constellations, although in general these will require a different number of comparators and a different lookup table.
Figure 5e shows details of an example hardware architecture including a branch selection unit similar to that described above (in Figure 5e labelled "cmp + lookup").
This is configured to implement a sequence along the lines of the above described sphere decoder tree descend function rather than the slightly different recursive sphere decoding procedure (implemented as an iterative procedure in Figure 7) described shortly after the tree descend function. Referring back to the above described sphere decoder tree-descend function, registers R1, R2, R3, R4 are used to hold variables C, δ/D', dd, d respectively. After loading the matrix elements R, F and initial values of k, step, D, e the engine computes on successive cycles: δ; d; dd and δe; D' and e'; and sign(C-D') and, based on the result of the latter calculation, stores values of k, step, D, e, C for the next iteration. Details such as how variables are stored may depend upon detailed implementation of the system.
Figure 7 shows a sphere decoder symbol processing procedure which may be implemented using embodiments of the above described hardware. The notation in Figure 7 is that used above; Q defines the number of quantisation levels of a symbol in the I or Q direction (for example Q = 4, that is 4 levels, for 16QAM); "branch" corresponds to the data in Tables 1 and 2 above; the step variable corresponds to "order" (0, 1, 2, 3). The procedure performs a depth-first tree search with k defining the level in the tree and D as the accumulated distance.
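A depth-first search of this general kind can be sketched in software as below. This is not a transcription of Figure 7, which is not reproduced here; it is a generic iterative sphere decoder written under stated assumptions: a real-valued model with an upper triangular matrix R, four amplitude levels per dimension, and a sort by distance standing in for the comparator and lookup-table branch ordering. The names sphere_decode, ordered_levels, dist and radius_sq are hypothetical; k, step and the accumulated distance follow the roles described above.

```python
LEVELS = (3, 1, -1, -3)  # assumed amplitudes for Q = 4 levels per dimension


def ordered_levels(R, y, s, k):
    """Order the candidate levels at tree level k, nearest to the estimate first."""
    n = len(y)
    e = (y[k] - sum(R[k][j] * s[j] for j in range(k + 1, n))) / R[k][k]
    return sorted(LEVELS, key=lambda v: abs(e - v))


def sphere_decode(R, y, radius_sq=float("inf")):
    """Depth-first search for s minimising |y - R s|^2, R upper triangular."""
    n = len(y)
    best_d, best_s = radius_sq, None
    s = [0] * n
    dist = [0.0] * (n + 1)  # dist[k + 1]: distance accumulated above level k
    order = [None] * n
    step = [0] * n          # index into the ordered branches at each level
    k = n - 1
    order[k] = ordered_levels(R, y, s, k)
    while k < n:
        if step[k] == len(LEVELS):
            k += 1          # every branch at this level tried: back up
            continue
        s[k] = order[k][step[k]]
        step[k] += 1
        r = y[k] - sum(R[k][j] * s[j] for j in range(k, n))
        d = dist[k + 1] + r * r
        if d >= best_d:
            # Branches are tried nearest first, so the rest are no better.
            step[k] = len(LEVELS)
            continue
        if k == 0:
            best_d, best_s = d, s[:]  # new closest point: shrink the radius
        else:
            dist[k] = d
            k -= 1
            order[k] = ordered_levels(R, y, s, k)
            step[k] = 0
    return best_s, best_d


# Small usage example with made-up numbers.
R = [[2.0, 1.0], [0.0, 1.5]]
y = [-0.8, -4.1]
s_hat, d_hat = sphere_decode(R, y)  # s_hat is [1, -3]
```

Because the branches at each level are visited in order of increasing distance, a level can be abandoned as soon as one branch exceeds the current best distance, which is the benefit the ordered "step" sequence provides.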
We have described a modular hardware architecture which provides a flexible solution to the problem of MIMO detection, in particular using one or more sphere decoders.
Broadly speaking, the hardware architecture may be used with any variant of sphere decoder and potentially has other applications in other lattice/tree search systems.
The skilled person will appreciate that the above described techniques may be employed for example in base stations, access points, for example for wireless computer networks, and/or mobile terminals, for example for mobile phone systems. Broadly speaking embodiments of the invention facilitate cheaper receivers without a loss of performance, or equivalently increased data rates without correspondingly increased complexity and cost. Embodiments of the invention may also potentially find application in non-radio systems, for example a disk drive with multiple read heads and multiple data recording layers in effect acting as multiple transmitters.
No doubt many other effective alternatives will occur to the skilled person. It will be understood that the invention is not limited to the described embodiments and encompasses modifications apparent to those skilled in the art lying within the spirit and scope of the claims appended hereto.

Claims (19)

  1. A hardware accelerator system for a received signal decoder,
    said received signal decoder being configured to decode a string of transmitted symbols sent over a channel by searching a tree for one or more candidate strings of symbols, a candidate string of symbols comprising a string of candidate symbols, and selecting one or more of said candidate strings of symbols, said searching comprising searching a multidimensional lattice determined by a response of said channel and represented by said tree, each level of the tree corresponding to a symbol of said transmitted string of symbols, a level of said tree having at least two branches each representing a next possible said candidate symbol, and wherein said hardware accelerator system comprises a branch selection module having an estimated symbol input to receive estimated symbol data and an output to provide data for controlling an order of searching of said tree branches responsive to said estimated symbol data.
  2. A hardware accelerator system as claimed in claim 1 wherein said branch selection module has a mode input for selection of a constrained search mode, and wherein said branch selection module is configured to select said order controlling data responsive to said mode input.
  3. A hardware accelerator system as claimed in claim 2 wherein said branch selection module includes a constraint data input to receive data determining, for a level of said tree, a constrained set of one or more branches for searching, and wherein said order controlling data controls an order of searching amongst said constrained set of branches.
  4. A hardware accelerator system as claimed in claim 1, 2 or 3 wherein said branch selection module comprises a look-up table coupled to said module output to provide said order controlling data.
  5. A hardware accelerator system as claimed in claim 4 wherein a said symbol is selected from a constellation employed to modulate data bits into said transmitted string of symbols, and wherein said branch selection module comprises at least one comparator coupled to said estimated symbol input and configured to compare a signal value derived from said estimated symbol input with a signal value derived from said constellation and provide a comparison data output to said look-up table for indexing said look-up table using said at least one comparator output.
  6. A hardware accelerator system as claimed in claim 5 comprising a plurality of said comparators, one for each of a set of real or imaginary values of said symbol constellation.
  7. A hardware accelerator system as claimed in claim 5 or 6 when dependent upon claim 3 wherein said branch selection module further comprises hardware to generate comparison data for indexing said look-up table from said constraint data.
  8. A hardware accelerator system as claimed in any preceding claim further comprising a symbol estimate computation module, said symbol estimate computation module having an input to receive an initial estimate for said transmitted string of symbols, a computation system to determine a revised said estimate of said transmitted string of symbols for a level of said tree from a difference between a signal value for a symbol determined by a tree branch previously taken to reach said tree level and a received signal value for said symbol, and a computation module output to provide said an estimated symbol for said level of said tree from said revised estimate of said transmitted string for said level of said tree to said estimated symbol input of said branch selection module.
  9. A hardware accelerator system as claimed in claim 8 wherein said computation module input includes: an input to receive data defining an estimate of a response of said channel, and an input coupled to said branch selection module output to receive said order controlling data, wherein said order controlling data defines a next said candidate symbol to be searched; a distance calculation module to determine a distance between said next candidate transmitted symbol and received signal data for a corresponding symbol using said estimated channel response; and a data store for storing an accumulated said distance for a candidate string of symbols including said next candidate transmitted symbol.
  10. A hardware accelerator system as claimed in any preceding claim wherein said order controlling data comprises data identifying a said tree branch or candidate symbol to search, wherein said branch selection module includes a step index data input, and wherein successive steps of said step index data map to an ordered sequence of said tree branches or candidate symbols to search.
  11. A hardware accelerator system as claimed in claim 10 further comprising a controller to control said successive steps of said step index data to perform an ordered search of said tree.
  12. A sphere decoder including the hardware accelerator system of any preceding claim.
  13. A hardware tree branch selection unit for searching a lattice defined by a matrix for a closest lattice point to a target point defined by a vector, each said lattice point defining a set of symbols, each symbol being defined by a value of a constellation of possible symbol values, said vector comprising values for a set of said symbols, a level of said tree having nodes each corresponding to a partial said set of symbols, branches of said tree from a said node corresponding to possible values of an additional symbol of said partial set, the tree branch selection unit comprising: a first input to receive an estimated symbol value derived from said vector; at least one second input to receive a said constellation value; at least one comparator coupled to said first input and to said at least one second input and having an output responsive to a comparison between said estimated symbol value and said constellation value; and a data store coupled to comparator output to provide a stored data output responsive to said comparison; and wherein said stored data output comprises data for determining a branch of said tree to search.
  14. A hardware tree branch selection unit as claimed in claim 13 wherein said data store comprises a look-up table storing a plurality of comparator values each defining a result of a said comparison and for each said comparator value first data defining an order of said tree branches to search.
  15. A hardware tree branch selection unit as claimed in claim 14 further comprising a mode selection input for selecting a constrained selection mode, and wherein said look-up table stores for each of said comparator values second data defining an order of said tree branches to search in said constrained selection mode.
  16. A symbol processing unit including the hardware tree branch selection unit of claim 13, 14 or 15, a controller to control searching of said tree responsive to said branch selection unit, and hardware to determine a distance between said target point and said closest lattice point.
  17. A method of accelerating processing of a sphere decoder search tree, the method comprising providing first hardware to determine successively refined symbol string estimates during a descent of said tree; and providing second hardware to determine, responsive to a said successively refined symbol string estimate, an order of branches of said tree to search during said tree descent, for said symbol string estimate refining.
  18. A hardware accelerated sphere decoder including the first and second hardware of claim 17.
  19. A receiver incorporating the hardware of any one of claims 1 to 16 and 18.
GB0510127A 2005-05-18 2005-05-18 Signal processing systems Expired - Fee Related GB2426419B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB0510127A GB2426419B (en) 2005-05-18 2005-05-18 Signal processing systems

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB0510127A GB2426419B (en) 2005-05-18 2005-05-18 Signal processing systems

Publications (3)

Publication Number Publication Date
GB0510127D0 GB0510127D0 (en) 2005-06-22
GB2426419A true GB2426419A (en) 2006-11-22
GB2426419B GB2426419B (en) 2007-04-04

Family

ID=34708367

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0510127A Expired - Fee Related GB2426419B (en) 2005-05-18 2005-05-18 Signal processing systems

Country Status (1)

Country Link
GB (1) GB2426419B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020037059A1 (en) * 2000-08-18 2002-03-28 Texas Instruments Incorporated Joint equalization and decoding using a search-based decoding algorithm
EP1376921A1 (en) * 2002-06-24 2004-01-02 Mitsubishi Electric Information Technology Centre Europe B.V. MIMO telecommunication system with accelerated sphere decoding
EP1460813A1 (en) * 2003-03-15 2004-09-22 Lucent Technologies Inc. Spherical decoder for wireless communications
EP1492241A1 (en) * 2003-06-26 2004-12-29 Mitsubishi Electric Information Technology Centre Europe B.V. Improved sphere decoding of symbols transmitted in a telecommunication system
US20050050072A1 (en) * 2003-09-03 2005-03-03 Lucent Technologies, Inc. Highly parallel tree search architecture for multi-user detection

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
http://tx.technion.ac.il/ïamiw/papers/spawc2003.pdf, "Efficient implementation of sphere demodulation", Wiesel, A. et al, Dept. of signal theory and communications, Universitat Politecnica de Catalunya, 31/03/2003 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1912371A2 (en) * 2006-10-10 2008-04-16 Kabushiki Kaisha Toshiba Wireless communications apparatus
WO2008047737A2 (en) * 2006-10-10 2008-04-24 Kabushiki Kaisha Toshiba Wireless communications apparatus
EP1912371A3 (en) * 2006-10-10 2008-06-25 Kabushiki Kaisha Toshiba Wireless communications apparatus
WO2008047737A3 (en) * 2006-10-10 2008-07-24 Toshiba Kk Wireless communications apparatus
WO2008062329A3 (en) * 2006-11-24 2008-08-21 Nxp Bv Method and arrangement for generating soft bit information in a receiver of a multiple antenna system
US8379768B2 (en) 2006-11-24 2013-02-19 Nxp B.V. Method and arrangement for generating soft bit information in a receiver of a multiple antenna system
WO2010015989A2 (en) * 2008-08-05 2010-02-11 Nxp B.V. Method and arrangement for generating soft bit information in a receiver of a multiple antenna system
WO2010015989A3 (en) * 2008-08-05 2010-07-29 Nxp B.V. Method and arrangement for generating soft bit information in a receiver of a multiple antenna system

Also Published As

Publication number Publication date
GB2426419B (en) 2007-04-04
GB0510127D0 (en) 2005-06-22

Similar Documents

Publication Publication Date Title
KR101124863B1 (en) Apparatus and method for processing communications from multiple sources
EP1521414B1 (en) Method and apparatus for sphere decoding
EP1545082A2 (en) Signal decoding methods and apparatus
US20070121753A1 (en) Wireless communications apparatus
JP5243411B2 (en) Method, system and computer program for determining signal vectors
EP1521375A2 (en) Signal decoding methods and apparatus
US20080123764A1 (en) Wireless communications apparatus
WO2008029819A2 (en) Soft decision generation in a lattice reduction mimo system
US20080013444A1 (en) Wireless communications apparatus
US20080084948A1 (en) Wireless communication apparatus
Shin et al. An improved LLR computation for QRM-MLD in coded MIMO systems
GB2406760A (en) Max-log MAP decoder used with maximum likelihood decoders and determining bit likelihoods
GB2426419A (en) A hardware accelerator for a signal decoder
GB2409386A (en) Sphere decoding system, particularly for MIMO applications, which has a different set of candidate points for each symbol of a received string
Izadinasab et al. Bridging the gap between MMSE-DFE and optimal detection of MIMO systems
GB2406761A (en) Sphere decoding in a space-time diversity (e.g. MIMO) communication system and other multi-user systems e.g. CDMA.
Siti et al. Layered orthogonal lattice detector for two transmit antenna communications
GB2427106A (en) Sphere decoder for MIMO applications with reduced computational complexity decomposition of the channel estimate matrix
Izadinasab et al. Near-optimal MIMO detectors based on MMSE-GDFE and conditional detection
Chen et al. Markov chain Monte Carlo: Applications to MIMO detection and channel equalization
Soma et al. Performance Analysis of K-Best Sphere Decoder Algorithm for Spatial Multiplexing MIMO Systems
Hou et al. Efficient quantization scheme for lattice-reduction aided MIMO detection
Ramanathan et al. Low complexity compressive sensing greedy detection of generalized quadrature spatial modulation
Xie et al. A novel low complexity detector for MIMO system
Li et al. Complex sphere decoding with a modified tree pruning and successive interference cancellation

Legal Events

Date Code Title Description
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20140518