WO2022243471A1 - A user interface system and method - Google Patents

A user interface system and method

Info

Publication number
WO2022243471A1
Authority
WO
WIPO (PCT)
Prior art keywords
user interface
interface system
customer
interaction region
user
Prior art date
Application number
PCT/EP2022/063637
Other languages
French (fr)
Inventor
Joe Allen
Raymond HEGARTY
Razvan-Dorel Cioarga
Adrian FIT
Ana Cristina TODORAN
Nicholas John Peter WIRTH
Original Assignee
Everseen Limited
Priority date
Filing date
Publication date
Application filed by Everseen Limited filed Critical Everseen Limited
Publication of WO2022243471A1 publication Critical patent/WO2022243471A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 - Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/14 - Digital output to display device; Cooperation and interconnection of the display device with other functional units
    • G06F3/147 - Digital output to display device; Cooperation and interconnection of the display device with other functional units using display panels
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 - Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/14 - Digital output to display device; Cooperation and interconnection of the display device with other functional units
    • G06F3/1423 - Digital output to display device; Cooperation and interconnection of the display device with other functional units controlling a plurality of local displays, e.g. CRT and flat panel display
    • G06F3/1446 - Digital output to display device; Cooperation and interconnection of the display device with other functional units controlling a plurality of local displays, e.g. CRT and flat panel display, the display being composed of modules, e.g. video walls
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G - ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2340/00 - Aspects of display data processing
    • G09G2340/04 - Changes in size, position or resolution of an image
    • G09G2340/0464 - Positioning
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G - ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2354/00 - Aspects of interface with display user
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G - ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/36 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
    • G09G5/38 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory, with means for controlling the display position

Definitions

  • the present disclosure relates to a user interface system and method and particularly, but not exclusively, to a user interface system and method which is responsive to a location of a user.
  • a user interface system comprising: one or more output devices extending or spaced along, upon or within an interaction region; a sensor configured to provide a signal from which a position of a user within the interaction region can be determined; and a processor configured to receive the signal from the sensor, identify a target location within the interaction region based on the determined position of the user, and provide an output from the one or more output devices at the target location.
  • the one or more output devices may comprise a display screen extending along within or upon the interaction region.
  • the target location may be a location on the display screen at which a visual user interface is displayed.
  • the display screen may be formed by a plurality of display screen units spaced along, within or upon the interaction region
  • the one or more output devices may comprise a plurality of loudspeakers spaced along, within or upon at least part of the interaction region.
  • the processor may be configured to provide a sound output from a selected one of the plurality of loudspeakers which is closest to the target location.
  • the processor may be configured to operate the plurality of loudspeakers as a phased array so as to provide a directional sound output from the plurality of loudspeakers which is directed towards the target location.
  • the user interface system may further comprise one or more input devices which are communicatively coupled to the processor.
  • the one or more input devices may comprise a plurality of microphones.
  • the processor may be configured to operate the plurality of microphones as a phased array.
  • the one or more input devices may comprise one or more touch panels.
  • the one or more input devices may comprise one or more payment devices.
  • the processor may be configured to track the user as they move within the interaction region and to continually or periodically update the target location so as to provide an output from the one or more output devices which moves with the user within the interaction region.
  • the interaction region may be a drive-through facility, an escalator, an inclined walkway, a travelator or waiting or queuing area.
  • a user interface method comprising: determining a position of a user within an interaction region using a sensor; identifying a target location within the interaction region based on the determined position of the user; and controlling one or more output devices extending or spaced along, within or upon the interaction region so as to provide an output to the user at the target location.
  • Figure 1 is a block diagram of the hardware components of a user interface system according to an embodiment of the invention;
  • Figure 2 is a block diagram of a phased speaker array system of the user interface system;
  • Figure 3 is a block diagram of a Near Field Communications (NFC) antenna-based payment unit of the user interface system;
  • Figure 4 is a block diagram of the software components of the user interface system;
  • Figure 5 is a perspective view of an embodiment of a screen system of the user interface system;
  • Figure 6 is a perspective view of the user interface system in a drive-through restaurant use case scenario;
  • Figure 7 is a front elevation view of the user interface system in the drive-through restaurant use case scenario;
  • Figure 8 is a schematic plan view of the user interface system in a drive-through restaurant use case scenario with an arcuate arrangement of a railing system and display screens mounted thereon;
  • Figure 9 is a perspective view of the user interface system in the drive-through restaurant use case scenario of Figure 6, with certain features removed for ease of viewing;
  • Figure 10 is a perspective view of the user interface system in a moving walkway use case scenario;
  • Figure 11 is a perspective view of the user interface system in an inclined moving walkway or escalator use case scenario;
  • Figure 12 is a schematic perspective view of a first wall-based use case scenario;
  • Figure 13 is a schematic perspective view of a second wall-based use case scenario;
  • Figure 14 is a flowchart of a user interface method in a drive-through restaurant use case scenario;
  • Figure 15 is a block diagram of a user interface system according to an embodiment of the invention; and
  • Figure 16 is a flowchart of a user interface method according to an embodiment of the invention.
  • Figure 1 shows a block diagram of the hardware components of a user interface system 2 according to an embodiment of the invention.
  • the system is operable in other environments of substantially linear travel, for example to present information to individuals on a moving walkway (“travelator”), inclined moving walkway or beside an escalator.
  • the user interface system may be configured differently for such use cases. For example, for safety, it may not be advisable for the user interface system to act as a payment receiving system for passengers on an escalator, which might require the passengers to remove their hands from the escalator’s handrail. Thus, payment receiving aspects of the user interface system described below may be removed therefrom in the escalator use case scenario.
  • the user interface system 2 comprises a screen system 4 communicatively coupled with one or more sensors 6 and a communications unit 8.
  • the communicative coupling may be implemented through either or both of a wired or a wireless communication network.
  • the screen system 4 comprises one or more computer monitors or TV screens, including flat or curved LCD, LED, OLED or projection TV screens, etc.
  • the screen system 4 may further comprise one or more touch screens comprising one or more touch sensor elements adapted to detect the touching of the or each touch screen.
  • the sensor(s) 6 may comprise one or more video sensors (e.g. video camera), one or more audio sensors (e.g. a microphone) and one or more proximity sensors.
  • the audio sensor(s) may comprise a phased array of microphones operable by a Digital Signal Processing (DSP) unit (not shown) to measure a sound field at different microphone positions.
  • the proximity sensors may comprise one or more of an infrared sensor, EMF induction sensor, capacitance sensor or ultrasonic sensor. The skilled person will understand that these sensors are mentioned for illustration purposes only. In particular, the skilled person will understand that the user interface system 2 of the present invention is not limited to the use of the above-mentioned sensors.
  • the user interface system 2 of the present invention is operable with any combination of sensors which permit detection of the presence of a nearby object and the location thereof; permit detection of features of the object sufficient to enable classification thereof; and permit detection of utterances or gestures made by a user of the user interface system 2.
  • the sensor(s) 6 may be mountable on the screen system 4. Alternatively or additionally, the sensor(s) 6 may be mountable elsewhere in the environment in which the user interface system 2 is used, within a pre-defined distance from the screen system 4.
  • the communications unit 8 may comprise an antenna unit (not shown) to permit communication with a remotely located cell phone or another wireless device.
  • the communications unit 8 may also comprise one or more speakers (not shown) adapted to issue an audible message to a customer.
  • the speaker(s) (not shown) may comprise a phased speaker array system.
  • the phased speaker array system 14 may comprise a plurality of speakers 16, each of which is fed from a single audio source 18.
  • Each speaker 16 transmits sound delayed by an amount which is related to the distance between the speaker 16 and a selected point or region in space.
  • the wave front from each speaker 16 arrives at the target at the same time. Soundwaves from different speakers 16 combine or cancel depending on the relative phase relationship between the soundwaves.
  • the amplitude of the sound output by each speaker 16 may also be proportional to the distance between the speaker 16 and the selected point or region in space.
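  • By way of a non-limiting illustrative sketch, the per-speaker delay and amplitude scaling described above may be expressed as follows; the speaker positions, target point and speed-of-sound value are illustrative assumptions only:

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, approximate value in air at 20 degrees C

def phased_array_parameters(speaker_positions, target, reference_gain=1.0):
    """For each speaker, compute a delay (seconds) and a gain so that wavefronts
    from all speakers arrive at the target point at the same time, with amplitude
    proportional to the speaker-to-target distance."""
    distances = [math.dist(p, target) for p in speaker_positions]
    max_distance = max(distances)
    params = []
    for d in distances:
        # The farthest speaker fires immediately; nearer speakers are delayed by
        # the difference in propagation time so all wavefronts coincide at the target.
        delay = (max_distance - d) / SPEED_OF_SOUND
        # Amplitude proportional to distance, as described above.
        gain = reference_gain * (d / max_distance)
        params.append({"distance_m": round(d, 3), "delay_s": delay, "gain": round(gain, 3)})
    return params

if __name__ == "__main__":
    speakers = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]  # illustrative 1 m spacing
    target = (1.5, 2.0)  # e.g. an estimated vehicle-window position, in metres
    for i, p in enumerate(phased_array_parameters(speakers, target)):
        print(f"speaker {i}: {p}")
```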
  • the audio source 18 for the phased speaker array system 14 may comprise an analogue signal or a digital signal (for example, from a digital sound recording). In the event the audio source 18 comprises an analogue signal, it may first be transmitted to a pre-amplifier 20 to be re-biased and amplified. The output from the pre-amplifier 20 may be transmitted to an analogue to digital converter (ADC) 22 to be converted into a digital signal. In the event the audio source 18 comprises a digital signal, the pre-amplifier 20 and the ADC 22 are not needed.
  • the digital signal (from the audio source 18 or the ADC 22) may be transmitted to a Speaker Microcontroller Unit 24 which may comprise a Read Only Memory (ROM) unit (not shown) that contains firmware configured to control the operation of the Speaker Microcontroller Unit 24.
  • the Speaker Microcontroller Unit 24 may be adapted to convert a received digital signal into a plurality of phase-shifted digital signals each of which corresponds with one of the speakers 16.
  • the phases of each of the plurality of digital signals may be calculated by the firmware according to a required sound profile at a target location and within a predefined distance thereof.
  • the plurality of phase-shifted digital signals from the Speaker Microcontroller Unit 24 may be transmitted to a digital to analogue converter (DAC) unit 26 to be converted into a corresponding plurality of phase-shifted analogue signals.
  • the plurality of analogue signals from the DAC unit 26 may be passed to one or more speaker amplifiers 28 which may perform both or either of applying a low-pass filter to the or each received analogue signal and buffering the same, before outputting the or each of the resulting analogue signals to a corresponding speaker 16.
  • the communications unit 8 may also comprise a transmitter unit 10 communicatively coupled with a corresponding receiver unit (not shown) in a backend operational unit (not shown) to transmit an order received from the customer to the backend operational unit (not shown).
  • the hardware components of the user interface system 2 may further comprise a payment unit 12 which may include one or more payment devices comprising any of a contact-based card reader unit (for example, a magnetic stripe card reader), a contactless card reader unit (for example, a smart card reader) or a combination of both.
  • the payment unit 12 may comprise a Near Field Communications (NFC) antenna.
  • the payment unit 12 may comprise an ISO/IEC 14443 reader antenna comprising an antenna coil 30, which is matched to a Reader Integrated Circuit Unit 32 by a matching circuit 34.
  • the matching circuit 34 comprises resistance and capacitance elements configured to deliver a target impedance value (that balances the requirements of RF output power against power supply requirements) and Q factor (that balances the requirements of current in the antenna coil 30 and corresponding power transmission, with the transmission bandwidth of the antenna coil 30). Together the antenna coil 30 and the matching circuit 34 form a resonance circuit.
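  • By way of a non-limiting illustrative sketch, the resonance relationship balanced by the matching circuit 34 and antenna coil 30 may be expressed as follows; the coil inductance and Q factor values are illustrative assumptions, while the 13.56 MHz carrier frequency follows from ISO/IEC 14443:

```python
import math

def tuning_capacitance(inductance_h, frequency_hz):
    """Capacitance needed for the antenna coil to resonate at the carrier frequency:
    f = 1 / (2*pi*sqrt(L*C))  =>  C = 1 / ((2*pi*f)**2 * L)."""
    return 1.0 / ((2.0 * math.pi * frequency_hz) ** 2 * inductance_h)

def bandwidth(frequency_hz, q_factor):
    """-3 dB bandwidth of the resonant circuit; a higher Q gives more coil current
    (and corresponding power transmission) but a narrower transmission bandwidth."""
    return frequency_hz / q_factor

if __name__ == "__main__":
    f_carrier = 13.56e6          # Hz, ISO/IEC 14443 carrier frequency
    coil_inductance = 1.5e-6     # H, illustrative antenna coil value
    q = 20.0                     # illustrative Q factor after damping by the matching circuit
    c = tuning_capacitance(coil_inductance, f_carrier)
    print(f"tuning capacitance ~ {c * 1e12:.1f} pF")
    print(f"-3 dB bandwidth ~ {bandwidth(f_carrier, q) / 1e3:.0f} kHz")
```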
  • a series of NFC reader antennas may be arranged along a longitudinal axis of the screen(s).
  • the Reader Integrated Circuit Unit 32 may be communicatively coupled to a Reader Microcontroller Unit 38.
  • the Reader Microcontroller Unit 38 may comprise a Read Only Memory (ROM) containing firmware for controlling the Reader Integrated Circuit Unit 32 and the Secure Access Module 36.
  • the Secure Access Module 36 may implement a secure protocol for communications between the payment unit 12 and a payment card 40.
  • the Secure Access Module 36 may compute ephemeral challenges, responses, session keys, etc. for authentication and encryption that are integrated into a communication protocol by a reader application, which may comprise the firmware in the Reader Microcontroller Unit 38.
  • the antenna coil 30 uses inductive coupling to communicate with a payment card 40, also known as a proximity integrated circuit card ("PICC").
  • the payment devices of the payment unit 12 may comprise a radio frequency tag reader or a near field communication enabled reader device (e.g. a radio frequency tag reader or a near field tag reader capable of reading smart fobs, smart cards, cell phones or other wireless devices to receive payment therefrom).
  • the payment device(s) may be communicatively coupled with a multiplexer unit (not shown) which is adapted to multiplex the signals received from the payment device(s).
  • the software components 50 of the user interface system of the present invention comprise a Transaction Co-ordination Engine 52 communicatively coupled with a Detector Unit 54, an Interface Controller 56, a Client Engagement Module 58, a Client Communications Receiver 60, a Billing/Payment Module 62 and a Backend Co-ordination Engine 64.
  • the Transaction Co-ordination Engine 52 is adapted to co-ordinate the activities of the Detector Unit 54, Interface Controller 56, Client Engagement Module 58, Client Communications Receiver 60, Billing/Payment Module 62 and Backend Co-ordination Engine 64 to detect the presence and location of a customer, present a visual user interface to the customer, play audio information/messages to the customer, receive an order or other request from the customer, receive payment from the customer and communicate the order or request to an operational unit (not shown) for fulfilment thereof.
  • the Detector Unit 54 comprises a Video Monitor 66, a Sound Monitor 68 and a Touch Monitor 70.
  • the Video Monitor 66 is adapted to receive and fuse video footage (vid(t)) captured by one or more video camera members of the sensor(s) 6.
  • the Detector Unit 54 is adapted to receive and fuse video footage (vid(t)) captured by the video camera(s) with output signals from the proximity sensor(s) to establish a multi-spectral view of an observed region.
  • the Sound Monitor 68 is adapted to receive audio signals captured by members of a phased array of microphones of the sensor(s) 6 to measure a sound field in the observed region.
  • the Touch Monitor 70 is adapted to receive signals from touch screen member(s) of the screen system 4, wherein the signals indicate the detected touching of one or more of the touch screen member(s).
  • the Detector Unit 54 may further optionally comprise a PICC Monitor 72.
  • the PICC Monitor 72 is adapted to operate the NFC antenna-based payment unit 12, as shown in Figure 3, to detect the presence of a payment card/PICC within a predefined distance of the payment unit 12.
  • the Video Monitor 66 comprises an Object Locater Module 74 which is adapted to use an object detector algorithm to detect from received video footage the presence of an object (not shown) within a pre-defined distance of the screen system 4; and to determine the location of the detected object relative to the screen system 4.
  • the Object Locater Module 74 may be adapted to determine a vertical and/or horizontal distance between a centroid of the detected object and the centroid of the screen system 4.
  • the location of a detected object is represented by the co-ordinates of a bounding box which is configured to enclose the object. The co-ordinates of the bounding box are established with respect to the co-ordinate system of a video frame received from a video camera member of the sensor(s) 6.
  • the Object Locater Module 74 is configured to use pre-configured information regarding the physical dimensions and layout of the video camera(s) relative to the screen system 4 to translate the co-ordinates of the bounding box enclosing a detected object into horizontal and/or vertical distance measurements from the screen system 4.
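  • A non-limiting illustrative sketch of the co-ordinate translation performed by the Object Locater Module 74 is given below, under the simplifying assumption of a pinhole camera model; the focal length, camera offsets and subject depth are illustrative assumptions only:

```python
from dataclasses import dataclass

@dataclass
class CameraSetup:
    """Pre-configured physical information about a camera relative to the screen system."""
    focal_length_px: float      # focal length expressed in pixels
    camera_offset_x_m: float    # horizontal offset of the camera from the screen centroid, metres
    camera_offset_y_m: float    # vertical offset of the camera from the screen centroid, metres
    subject_depth_m: float      # assumed distance from the camera to the lane/walkway plane

def bbox_to_screen_offsets(bbox, frame_size, cam: CameraSetup):
    """Translate a bounding box (x_min, y_min, x_max, y_max) in frame pixel co-ordinates
    into horizontal/vertical offsets (metres) of the object's centroid from the screen
    system centroid, using a pinhole projection."""
    x_min, y_min, x_max, y_max = bbox
    frame_w, frame_h = frame_size
    # Centroid of the bounding box, relative to the optical centre of the frame.
    cx_px = (x_min + x_max) / 2.0 - frame_w / 2.0
    cy_px = (y_min + y_max) / 2.0 - frame_h / 2.0
    # Back-project pixel offsets to metres at the assumed subject depth.
    dx_m = cx_px * cam.subject_depth_m / cam.focal_length_px
    dy_m = cy_px * cam.subject_depth_m / cam.focal_length_px
    # Express relative to the screen system rather than the camera.
    return dx_m + cam.camera_offset_x_m, dy_m + cam.camera_offset_y_m

if __name__ == "__main__":
    cam = CameraSetup(focal_length_px=1400.0, camera_offset_x_m=2.5,
                      camera_offset_y_m=1.2, subject_depth_m=3.0)
    dx, dy = bbox_to_screen_offsets((820, 300, 1180, 760), (1920, 1080), cam)
    print(f"object centroid is {dx:+.2f} m horizontally and {dy:+.2f} m vertically "
          "from the screen centroid")
```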
  • the object detector algorithm in the Object Locater Module 74 may include the RetinaNet algorithm (as described in T.-Y. Lin, P. Goyal, R. Girshick, K. He and P. Dollár, Focal Loss for Dense Object Detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, (2020) 42, 318-327) or YOLOv4 (as described in A. Bochkovskiy, C.-Y. Wang and H.-Y. M. Liao, YOLOv4: Optimal Speed and Accuracy of Object Detection, arXiv:2004.10934, 2020).
  • the Video Monitor 66 may also comprise an Object Classifier Module 76 which is adapted to use an object classification algorithm to recognise the detected object and classify it as being one of the following: car, truck, motorcycle, bicycle, van or person.
  • the object classification algorithm employed in the Object Classifier Module 76 may include a ResNet-101 convolutional neural network (as described in K. He, X. Zhang, S. Ren and J. Sun, "Deep Residual Learning for Image Recognition", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016, pp. 770-778) or a DenseNet (as described in G. Huang, Z. Liu, L. van der Maaten and K. Q. Weinberger, "Densely Connected Convolutional Networks", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 4700-4708).
  • the classification of the detected object may be used by the Detector Unit 54 to establish whether the detected object is an object of interest, for example, whether the object is a customer, as opposed to another type of movable object, for example, a cat, that might appear in the Field of View of a video camera.
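  • A non-limiting illustrative sketch of how the Detector Unit 54 might retain only objects of interest based on the classification is given below; the class labels follow the list above, while the confidence threshold is an illustrative assumption:

```python
# Illustrative sketch: keep only detections whose class suggests a potential customer.
CLASSES_OF_INTEREST = {"car", "truck", "motorcycle", "bicycle", "van", "person"}

def objects_of_interest(detections, min_confidence=0.5):
    """detections: iterable of dicts like {"label": str, "confidence": float, "bbox": tuple}.
    Returns only those detections worth presenting a visual user interface for."""
    return [d for d in detections
            if d["label"] in CLASSES_OF_INTEREST and d["confidence"] >= min_confidence]

if __name__ == "__main__":
    sample = [
        {"label": "car", "confidence": 0.93, "bbox": (120, 40, 600, 420)},
        {"label": "cat", "confidence": 0.88, "bbox": (700, 380, 760, 430)},   # not of interest
        {"label": "person", "confidence": 0.42, "bbox": (50, 60, 120, 300)},  # below threshold
    ]
    print(objects_of_interest(sample))
```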
  • the Transaction Co-ordination Engine 52 may be adapted to selectively determine instances in which to display a visual user interface on the screen system 4.
  • the Transaction Co-ordination Engine 52 may be adapted to display a visual user interface on the screen system 4 only in the event a customer or a vehicle is detected within a pre-defined distance of the screen system 4, thereby preventing unnecessary disturbances or changes in a current display on the screen system 4.
  • the Detector Unit 54 may include a Vehicle Dimensions Database (not shown) comprising a plurality of vehicle class records each of which details the overall dimensions of a given one of a pre-defined group of classes of customer vehicles.
  • the Object Classifier Module 76 may be adapted to provide a more granular classification of the vehicle type and the Detector Unit 54 may be adapted to use the classification to retrieve a substantially matching record from the Vehicle Dimensions Database (not shown) to determine the location of the driver’s window or the front passenger’s window of the detected vehicle.
  • the Object Classifier Module 76 is replaced with a Driver Detection Module (not shown) which is adapted to detect the driver (or other occupant) of a detected vehicle. The location of the driver is then used to establish a more refined estimate of the location of the occupant of the vehicle relative to the screen system 4.
  • the object classification algorithm of the Object Classifier Module 76 may also be adapted to recognise a part of a detected vehicle, for example, a wing mirror of the vehicle.
  • the Detector Unit 54 may include a Vehicle Parts Dimensions Database (not shown) which details the geometric relationships between different parts of a vehicle, to enable a horizontal and/or vertical distance to be calculated between the detected part of a vehicle and the driver’s window or the front-passenger’s window thereof. This distance may be treated as an offset factor to be applied to the detected distance between the detected part of the vehicle and the screen system 4, to permit positioning of a visual user interface on the screen system 4 to enhance its viewability by the user.
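  • A non-limiting illustrative sketch of applying such an offset factor from a Vehicle Parts Dimensions Database entry is given below; the database contents and offset values are illustrative assumptions only:

```python
# Illustrative geometric offsets (metres) from a detected vehicle part to the driver's
# window, keyed by (vehicle_class, part); the values are assumptions for illustration.
VEHICLE_PARTS_OFFSETS = {
    ("car", "wing_mirror"):   {"dx": -0.35, "dy": 0.10},
    ("van", "wing_mirror"):   {"dx": -0.45, "dy": 0.25},
    ("truck", "wing_mirror"): {"dx": -0.55, "dy": 0.60},
}

def driver_window_position(part_position, vehicle_class, part_name):
    """Apply the stored offset to the detected part position (metres, relative to the
    screen system) to estimate the position of the driver's window."""
    offset = VEHICLE_PARTS_OFFSETS[(vehicle_class, part_name)]
    x, y = part_position
    return x + offset["dx"], y + offset["dy"]

if __name__ == "__main__":
    # A wing mirror detected 4.2 m along and 1.1 m above the screen system origin.
    print(driver_window_position((4.2, 1.1), "van", "wing_mirror"))
```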
  • the Object Locater Module 74 may be coupled with a face detection algorithm, such as the RetinaFace architecture (as described in J. Deng, J. Guo, E. Ververas, I. Kotsia and S. Zafeiriou, RetinaFace: Single-stage Dense Face Localisation in the Wild, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 5203-5212), which is adapted to detect the presence and determine the location of a face in a video frame.
  • the location of a detected object, the location of the windows of a detected vehicle or the location of a detected face will be referred to henceforth as the Target Location.
  • the Target Location is defined with reference to the peripheries (or other landmark point/attribute) of the screen system 4.
  • the Detector Unit 54 is adapted to transmit the Target Location to the Transaction Co-ordination Engine 52.
  • the Detector Unit 54 may further optionally be adapted to transmit to the Transaction Co-ordination Engine 52 an indicator as to whether the detected object is an object of interest.
  • the Transaction Co-ordination Engine 52 is adapted to transmit the Target Location to the Interface Controller 56.
  • the Interface Controller 56 is adapted to receive the Target Location and to establish therefrom control signals for the screen system 4, the speaker(s) and, where present, the antenna unit, as described below.
  • the Interface Controller 56 comprises a Location Translator Module 78 communicatively coupled with a Screen Controller Module 80.
  • the Location Translator Module 78 comprises one or more geometric translation algorithms (not shown) adapted to translate the Target Location into a set of display co-ordinates.
  • the set of display co-ordinates describe the co-ordinates of a position on the screen system 4 wherein the position is within a pre-defined distance of the Target Location. For brevity, this set of display co-ordinates will be referred to as the Target Display Co-ordinates.
  • the Target Display Co-ordinates includes an identifier of the display screen disposed closest to the Target Location and the co-ordinates of a position on the corresponding display screen, the position on the corresponding display screen being within a pre-defined distance of the Target Location.
  • the Location Translator Module 78 is adapted to transmit the Target Display Co-ordinates to the Screen Controller Module 80 to cause the screen system 4 or a member thereof to become activated at the specified Target Display Co-ordinates.
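  • A non-limiting illustrative sketch of the translation performed by the Location Translator Module 78 is given below, assuming a single row of equal-width display screens; the screen dimensions and pixel resolution are illustrative assumptions only:

```python
def target_display_coordinates(target_x_m, screen_width_m=1.2, screens=5,
                               screen_width_px=1920, screen_height_px=1080):
    """Map a horizontal Target Location (metres along the railing, measured from the
    left edge of the first screen) to an identifier of the nearest display screen and
    pixel co-ordinates on that screen. Returns (screen_index, (x_px, y_px))."""
    total_width_m = screen_width_m * screens
    # Clamp to the physical extent of the collective assembly of display screens.
    x = min(max(target_x_m, 0.0), total_width_m - 1e-9)
    screen_index = int(x // screen_width_m)
    # Position within the selected screen, converted to pixels.
    within_screen_m = x - screen_index * screen_width_m
    x_px = int(within_screen_m / screen_width_m * screen_width_px)
    y_px = screen_height_px // 2  # centre the visual user interface vertically, for simplicity
    return screen_index, (x_px, y_px)

if __name__ == "__main__":
    # A vehicle window detected 4.7 m along a railing of five 1.2 m wide screens.
    print(target_display_coordinates(4.7))
```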
  • the Interface Controller 56 further comprises a Speaker Controller 84 which comprises one or more beam-steering algorithms (not shown) operable by the firmware in the Speaker Microcontroller Unit 24 of Figure 2, to establish one or more parameters for controlling the magnitude and phase of each of the analogue signals generated by the speakers of the phased speaker array system of Figure 2. In use, the magnitude and phase parameters are calculated to produce sound which is contained within a beam aimed at the Target Location.
  • the Interface Controller 56 may further comprise an Antenna Controller (not shown) to permit communication with a customer through the customer’s own cell phone or other wireless device.
  • the Client Engagement Module 58 comprises a communicatively coupled Messaging Module 86 and Interface Configuration Module 88.
  • the Interface Configuration Module 88 may include visual interface configuration rules which prescribe appearance attributes of a visual user interface to be displayed to a user.
  • the visual interface configuration rules may include the logo or a colour palette of a given vendor.
  • the visual interface configuration rules may include appearance attributes of a human persona, henceforth known as an avatar, for communication with the customer.
  • the Messaging Module 86 is adapted to retrieve from the Interface Configuration Module 88 the visual interface configuration rules and to transmit them to the Transaction Co-ordination Engine 52.
  • the Transaction Co-ordination Engine 52 is configured to transmit the visual interface configuration rules to the Interface Controller 56 to cause the Screen Controller 80 to display a visual user interface on the screen system 4 at the Target Display Co-ordinates.
  • the appearance of the said visual user interface is established according to the visual interface configuration rules.
  • the positioning of the displayed visual interface within a pre-defined distance of the Target Location may be beneficial where the dimensions of the visual user interface are small compared to those of the screen system 4, to optimise the readability of the visual user interface for the user corresponding with the Target Location.
  • the Interface Configuration Module 88 may also include narrative rules (not shown) pre-configured by the system operators, wherein the or each narrative rule (not shown) establishes a narrative framework for communications with a customer.
  • the relevant narrative framework depends on the specific use of the user interface system 2. For example, in a drive-through restaurant scenario, the narrative framework may include a greeting, presentation of a menu, discussion of special offers, receiving an order, advising on waiting time for the order, advising of cost and requesting payment etc.
  • the Messaging Module 86 is adapted to retrieve from the Interface Configuration Module 88 the or each narrative rule (not shown) and using the same, the Messaging Module 86 is adapted to co-ordinate a dialogue with a customer. To this end, for example, the Messaging Module 86 may be adapted to transmit to the Transaction Co-ordination Engine 52 an audio triggering signal. The audio triggering signal may be forwarded to the Speaker Controller 84 to cause one or more speakers (not shown) at a location within a pre-defined distance of the Target Location to play an audio message to the corresponding customer.
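  • A non-limiting illustrative sketch of a narrative framework of the kind described above, expressed as an ordered sequence of dialogue steps for a drive-through restaurant scenario, is given below; the step names and message texts are illustrative assumptions only:

```python
# Illustrative narrative rule for a drive-through restaurant: an ordered list of
# dialogue steps that the Messaging Module could walk through with a customer.
DRIVE_THROUGH_NARRATIVE = [
    ("greeting",        "Welcome! What can we get for you today?"),
    ("present_menu",    "Here is our menu and today's special offers."),
    ("take_order",      "Please say or touch the items you would like."),
    ("confirm_order",   "Your order comes to {total}. Shall I confirm it?"),
    ("request_payment", "Please present your card to the highlighted reader."),
    ("advise_waiting",  "Thank you. Your order will be ready in about {minutes} minutes."),
]

class Dialogue:
    """Walks a customer through the narrative rule one step at a time."""
    def __init__(self, narrative):
        self.narrative = narrative
        self.step = 0

    def next_message(self, **context):
        if self.step >= len(self.narrative):
            return None  # narrative complete
        name, template = self.narrative[self.step]
        self.step += 1
        return name, template.format(**context)

if __name__ == "__main__":
    d = Dialogue(DRIVE_THROUGH_NARRATIVE)
    print(d.next_message())                 # greeting
    print(d.next_message())                 # present_menu
    print(d.next_message())                 # take_order
    print(d.next_message(total="14.50"))    # confirm_order
    print(d.next_message())                 # request_payment
    print(d.next_message(minutes=4))        # advise_waiting
```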
  • the Transaction Co-ordination Engine 52 may be adapted to transmit the audio triggering signal to an Antenna Controller (not shown) of the Interface Controller 56, to cause a customer’s own cell phone or other wireless device to play the audio message to the customer.
  • the content of the audio message and subsequent audio communications with the customer is specified by one or more of the narrative rules from the Interface Configuration Module 88.
  • the Client Engagement Module 58 may comprise a Menu Module 89 which details all the food products available from the restaurant.
  • the Menu Module 89 may be communicatively coupled with the Messaging Module 86 to cause the contents of the menu to be displayed to a customer (not shown) on the screen system 4 or recited to the customer (not shown) through the speaker(s) (not shown) or through the customer’s own cell phone or other wireless device in accordance with the narrative rule(s) of the Interface Configuration Module 88.
  • the Transaction Co-ordination Engine 52 may be adapted to activate the Client Communications Receiver 60 and the Video Monitor 66, to operate the or each of the communications unit 8 and the sensor(s) to receive an order from the customer (not shown).
  • the Client Communications Receiver 60 comprises a Touch Comms Receiver 90 and a Voice Comms Receiver 92.
  • the Touch Comms Receiver 90 is coupled through the Transaction Co-ordination Engine 52 to the Touch Monitor 70 to detect the selection of a displayed item from a visual user interface by the detected touching of one or more of the touch screen member(s) at a location within a pre-defined distance of the displayed item.
  • the Voice Comms Receiver 92 is coupled through the Transaction Co-ordination Engine 52 to the Sound Monitor 68.
  • the Sound Monitor 68 comprises a User Utterance Detection Unit 96 and a Background Noise Detection Unit 98.
  • the Sound Monitor 68 is also coupled through the Transaction Co-ordination Engine 52 to the microphone controller 82.
  • the User Utterance Detection Unit 96 is adapted to operate the microphone controller 82 and to use the resulting listening beam of the phased array of microphones of the sensor(s) 6 to detect utterances or sounds made by the user (not shown) at the Target Location. For brevity, these utterances or sounds will be referred to henceforth as User Sounds.
  • the Background Noise Detection Unit 98 is adapted to activate the microphone controller 82 to operate the phased array of microphones of the sensor(s) 6 to sample the sound field at locations other than those encompassed by the listening beam.
  • the sound detected in these samples represents background noise and will, for brevity, be referred to henceforth as Background Noise.
  • the Sound Monitor 68 is adapted to process the User Sounds to extract or otherwise compensate for the Background Noise contamination thereof.
  • the processing may comprise signal subtraction or deconvolution operations. The skilled person will understand that the above-mentioned processing operations are provided for illustration purposes only. Specifically, the user interface system of the preferred embodiment is in no way limited to these processing operations.
  • the user interface system of the preferred embodiment is operable with any signal processing operations suitable for the reduction of ambient or background noise from a measured audio signal.
  • the result of these processing operations will be referred to henceforth as Filtered User Sounds.
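  • A non-limiting illustrative sketch of one possible compensation step of the kind described above, using simple magnitude spectral subtraction, is given below; the use of numpy and the chosen parameters are illustrative assumptions, and the system is in no way limited to this operation:

```python
import numpy as np

def spectral_subtraction(user_sounds, background_noise, floor=0.05):
    """Illustrative noise reduction: subtract the magnitude spectrum of the Background
    Noise samples from the spectrum of the User Sounds, keeping the original phase.

    user_sounds: 1-D array sampled from the listening beam
    background_noise: 1-D array sampled outside the listening beam
    Returns the Filtered User Sounds as a 1-D array of the same length."""
    n = len(user_sounds)
    user_spectrum = np.fft.rfft(user_sounds)
    noise_magnitude = np.abs(np.fft.rfft(background_noise, n=n))
    magnitude = np.abs(user_spectrum)
    # Subtract the noise magnitude, never letting the result drop below a small
    # fraction of the original magnitude (to limit "musical noise" artefacts).
    cleaned = np.maximum(magnitude - noise_magnitude, floor * magnitude)
    return np.fft.irfft(cleaned * np.exp(1j * np.angle(user_spectrum)), n=n)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    speech_like = np.sin(2 * np.pi * 300 * t)       # stand-in for a customer utterance
    noise = 0.3 * rng.standard_normal(t.size)       # stand-in for ambient noise
    filtered = spectral_subtraction(speech_like + noise, noise)
    print("residual noise power:", float(np.mean((filtered - speech_like) ** 2)))
```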
  • the Sound Monitor 68 is configured to transmit the Filtered User Sounds to the Transaction Co-ordination Engine 52 which in turn transmits the Filtered User Sounds to the Voice Comms Receiver 92.
  • the Voice Comms Receiver 92 is also coupled through the Transaction Co-ordination Engine 52 to the Antenna Controller (not shown) to receive audio signals from the customer’s own cell phone or other wireless device (by way of the antenna unit in the communications unit 8). For brevity, these audio signals will be referred to henceforth as User Device Sounds.
  • the Voice Comms Receiver 92 comprises speech recognition and language processing algorithms adapted to recognize and comprehend utterances and instructions from the customer (not shown) in either or both of the Filtered User Sounds and the User Device Sounds. Examples of suitable speech recognition algorithms include hidden Markov modelling, dynamic time warping (DTW) based speech recognition methods and deep neural networks and denoising autoencoders. The skilled person will understand that the preferred embodiment is not limited to these speech recognition algorithms.
  • the Client Communications Receiver 60 may also comprise a Gesture Comms Receiver 94.
  • the Gesture Comms Receiver 94 is coupled through the Transaction Co-ordination Engine 52 to the Video Monitor 66.
  • the Gesture Comms Receiver 94 comprises gesture recognition algorithms adapted to recognize and comprehend gestures from the customer in the video footage captured by video camera members of the sensor(s) 6. Examples of suitable gesture recognition algorithms include skeletal-based algorithms and appearance-based algorithms.
  • the preferred embodiment is not limited to these gesture recognition algorithms. On the contrary, these examples of algorithms are provided for illustration purposes only. In particular, the skilled person will understand that the preferred embodiment is operable with any gesture recognition algorithm which permits the Gesture Comms Receiver 94 to recognize gestural instructions from the customer (not shown).
  • the Client Communications Receiver 60 is adapted to receive an order (e.g. food order in the drive-through restaurant example) from the customer (not shown).
  • the Client Communications Receiver 60 is further adapted to communicate information regarding the customer’s order to the Transaction Co-ordination Engine 52.
  • the information regarding the customer’s order will be referred to henceforth as Customer Order Information.
  • the Transaction Co-ordination Engine 52 is adapted, on receipt of Customer Order Information, to issue a payment triggering signal to the Billing/Payment Module 62.
  • the Billing/Payment Module 62 is activated to calculate the total bill for the ordered items.
  • the Billing/Payment Module 62 is further activated together with the Messaging Module 86 to operate:
  • the communications unit 8 to advise the customer (not shown) of the total bill and to request the customer (not shown) to present their payment card (or one or more radio frequency or near field communication enabled payment devices (e.g. smart fobs, smart cards, cell phones or other wireless devices)) to the payment unit 12; and
  • the payment unit 12 to receive payment from the customer (not shown) through their payment card or other radio-frequency or near field communication enabled payment device.
  • the Transaction Co-ordination Engine 52 may be adapted to activate the Messaging Module 86 and the Screen Controller Module 80 to display on the screen system 4 at the Target Display Co-ordinates, an instruction to the customer regarding where to place or otherwise position their payment card (or other NFC enabled payment medium such as smart fobs, smart cards, cell phones or other wireless devices) to enable communication with the payment unit 12.
  • the payment unit 12 comprises a series of NFC reader antennas arranged along a longitudinal axis of the screen system 4
  • the Transaction Co-ordination Engine 52 may be adapted to receive from the Video Monitor 66 the location relative to the Screen System at which the customer placed or otherwise positioned their payment card (or other NFC enabled payment medium).
  • the Transaction Co-ordination Engine 52 is configured to identify to which of the NFC reader antennas the customer presented their payment card (or smart fobs, smart cards, cell phones or other wireless devices).
  • the PICC Monitor 72 is adapted to operate the identified NFC antenna-based payment unit 12, as shown in Figure 3, to detect the presence of the customer’s payment card (or other NFC enabled payment medium)/PICC within a predefined distance of the payment unit 12.
  • the PICC Monitor 72 is adapted to implement a secure communications protocol with the PICC to receive payment for the ordered items therefrom.
  • the PICC Monitor 72 is adapted to transmit a payment confirmation message through the multiplexer unit (not shown) to the Billing/Payment Module 62.
  • the payment confirmation message may include an identifier of the PICC Monitor 72 from which the payment confirmation message originated.
  • the Transaction Co-ordination Engine 52 may be adapted to activate the Messaging Module 86 and the Screen Controller Module 80 to display on the screen system 4 an instruction to a first customer regarding where to place or otherwise position their payment card together with instructions to the or each of the other customers to wait until receipt of a further message advising the or each of the other customers where to place their payment cards. In this way, the Transaction Co-ordination Engine 52 ensures that payment-related communications are conducted with only one payment card at a time by the Billing/Payment Module 62, such that the co-ordination of received payments with received orders is simplified.
  • on receipt of confirmation of payment from the Billing/Payment Module 62 for a given customer order, the Transaction Co-ordination Engine 52 is adapted to transmit the Customer Order Information to the Backend Co-ordination Engine 64, which co-ordinates communications regarding received customer orders with a backend operational function (not shown).
  • the Backend Co-ordination Engine 64 may be adapted to receive corresponding customer communications from the backend operational function (not shown); the said customer communications may include updates on waiting times or details of a pick-up zone from which the customer (not shown) may retrieve their order.
  • the Backend Co-ordination Engine 64 is adapted to transmit these client communications to the Transaction Co-ordination Engine 52 for forwarding to the Messaging Module 86.
  • the Client Engagement Module 58 may further comprise a customer recognition module (not shown) which is adapted to employ one or more optical character recognition (OCR) algorithms to detect and read the characters of a registration number plate of a customer’s vehicle from video footage captured by one or more video camera members of the sensor(s) 6.
  • the Client Engagement Module 58 may further comprise a database (not shown) of registration number details of customer vehicles that were previously detected by the invention.
  • the database may also include details of the customer (not shown) who previously ordered items from the corresponding vehicle.
  • the customer recognition module (not shown) may be adapted to interrogate the database to compare the detected registration number details of an incoming customer vehicle with those in the database.
  • the Client Engagement Module 58 may be configured to adapt the narrative rules employed by the Messaging Module 86 to include the name of the customer, so that the customer is greeted by name by the Client Engagement Module 58.
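  • A non-limiting illustrative sketch of the returning-customer flow described above is given below; the plate reading step is abstracted away, and the stored records and names are illustrative assumptions only:

```python
# Illustrative database of previously detected registration plates; in practice this
# would be populated by the customer recognition module using OCR on video footage.
KNOWN_VEHICLES = {
    "191-C-12345": {"customer_name": "Alex", "last_order": "cheeseburger meal"},
    "202-D-67890": {"customer_name": "Sam",  "last_order": "veggie wrap"},
}

def greeting_for_plate(plate_text):
    """Return a personalised greeting if the plate has been seen before,
    otherwise a generic greeting."""
    record = KNOWN_VEHICLES.get(plate_text.strip().upper())
    if record is None:
        return "Welcome! What can we get for you today?"
    return (f"Welcome back, {record['customer_name']}! "
            f"Would you like your usual {record['last_order']}?")

if __name__ == "__main__":
    print(greeting_for_plate("191-C-12345"))   # previously detected vehicle
    print(greeting_for_plate("999-X-00000"))   # new vehicle
```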
  • the screen system 4 of the user interface system 2 comprises a railing system 100 with one or more display screens 102 mounted thereon.
  • the railing system 100 comprises a plurality of upright members 104 of substantially the same length, wherein the upright members 104 are disposed in a substantially co-linear spaced apart arrangement. A bottom end of at least some of the upright members 104 is mountable on a mounting plate 106.
  • the railing system 100 further comprises an elongate crossbar member (not shown) for mounting on a top end of at least some of the upright members 104 to straddle the distance between a first peripheral upright member 104 and its opposing peripheral upright member 104.
  • the crossbar member (not shown) comprises at least one central channel (not shown) formed therein, extending along the longitudinal axis of the crossbar member (not shown).
  • the channel (not shown) is adapted to house a corresponding support member (not shown) protruding from a reverse face of a display screen 102.
  • the mounting plates 106 are fixed to the ground and the display screens 102 are mounted in series on the railing system 100 at substantially the same elevation relative to the ground, by the sliding of the support member (not shown) of each display screen 102 in turn into the channel (not shown) of the crossbar member (not shown).
  • while the present railing system 100 is shown as comprising a single crossbar member (not shown) adapted to support a single linear arrangement of display screens 102, the skilled person will understand that this configuration of display screens 102 is provided for example purposes only. In particular, the skilled person will understand that the present invention is not limited to this configuration of display screens 102. On the contrary, the present invention is operable with any configuration of display screens 102 sufficient to display visual user interfaces at a variety of elevations and positions to accommodate the diversity of vehicle dimensions that may be encountered in a drive-through facility and/or to accommodate the display requirements of the operators of the drive-through facility.
  • the present railing system 100 may comprise a plurality of vertically spaced apart crossbar members (not shown) arranged to support a two-dimensional grid-like arrangement of display screens 102, to enable the display of larger advertisements to passers-by, together with visual user interfaces to the occupants of customer vehicles in the drive-through restaurant facility.
  • the user interface system may further comprise a plurality of elongate camera support members 108.
  • the camera support members 108 may be of greater length than the upright members 104 of the railing system 100.
  • a bottom end of each camera support member 108 may be mounted on a mounting plate 110.
  • the mounting plates 110 are fixed to the ground at pre-defined distances from the outer edges of the collective assembly of display screens 102; and the camera support members 108 are arranged as uprights from the mounting plates 110.
  • a video camera (not shown) is mountable on the top end of a camera support member 108 to establish a good vantage point over at least some of the user interface system and the users thereof. Further video cameras (not shown) may be mounted in spaced apart arrangements on the top of the uppermost display screens 102.
  • the further video cameras may be disposed in a front facing arrangement extending from the front of the display screens 102, to thereby deliver a Field of View which embraces objects (or parts thereof) facing the front of the display screens 102.
  • the number and spacing of such further video cameras (not shown) may be configured according to the requirements of the operator.
  • the user interface system is provided with sufficient further video cameras (not shown) arranged to provide a substantially seamless view of the area in front of the display screens 102.
  • each such further video camera may be mounted in a substantially central position on an upper end of each display screen 102 and arranged so that their Field of View extends forward from a front of the display screen 102.
  • a plurality of card reader devices 112 are mounted in a spaced apart arrangement on a bottom end of a front face of the display screens 102.
  • the card reader devices 112 may be arranged substantially equidistantly along the length of the railing system 100 or the length embraced by the collective assembly of display screens 102.
  • the card reader devices 112 may be mounted at a central position along the length of a bottom end of a front face of each of the display screens 102 or the bottom display screens 102 in the event of a grid formation of the display screens 102.
  • the card reader devices 112 may be arranged to substantially hang from the bottom ends of the display screens 102 or the bottom display screens 102 in the event of a grid formation of the display screens 102.
  • the card reader device 112 may comprise any one of a magnetic stripe payment card reader unit, a contactless card reader unit, a combined magnetic stripe payment card reader and contactless card reader unit and any radio-frequency or near field communication enabled reader devices (e.g. radio frequency tag reader or a near field tag reader capable of reading smart fobs, smart cards, cell phones or other wireless devices to receive payment therefrom).
  • the plurality of card reader devices 112 comprise a plurality of NFC reader antennas (as shown in Figure 3) 113 arranged in series along a longitudinal axis of the display screens 102.
  • the NFC reader antennas 113 may be fixed to the front face or the reverse face of the display screens 102 at a position between the top and the bottom of the display screens 102.
  • a plurality of microphones 114 are mounted in a spaced apart arrangement on either or both of a top end and a bottom end of the display screens 102.
  • each microphone 114 is mounted at a central position along the length of a bottom end of a front face of each of the display screens 102 or the bottom display screens 102 in the event of a grid formation of the display screens 102.
  • each microphone 114 is mounted at a central position along the length of a top end of a front face of each of the display screens 102 or the top display screens 102 in the event of a grid formation of the display screens 102.
  • the microphones 114 are mounted on alternating top and bottom ends of adjacent display screens 102.
  • the microphones 114 are arranged in a spaced apart arrangement with a longitudinal distance of no more than 1 metre between successive microphones 114.
  • the microphones 114 may be condenser microphones or shotgun microphones.
  • the microphones 114 are provided with a windshield to protect against noise from wind sources.
  • the microphones have a hypercardioid polar pattern wherein the microphone is most sensitive to on-axis sounds (i.e. where the microphone is pointed), with null points at 110° and 250° and a rear lobe of sensitivity.
  • the microphones 114 preferably have an X Connector, Locking Connector, Rubber Boot (XLR) output rather than a USB output.
  • the XLR microphones 114 are connected to an audio interface (not shown) capable of providing at least 48V of phantom power (necessary for the amplification of the microphone signal).
  • the output of the audio interface is transmitted to a Digital Signal Processing (DSP) unit (not shown) which is adapted to implement at least some of a plurality of signal enhancement techniques to enhance the quality of a received audio signal corresponding with a detected utterance of a customer (not shown).
  • the signal enhancement techniques include noise filtering, noise suppression, noise gating, dynamic gain control, compression, equalisation and limiting.
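  • A non-limiting illustrative sketch of one of the listed enhancement techniques (a simple frame-based noise gate) is given below; the frame length, threshold and attenuation values are illustrative assumptions only:

```python
import numpy as np

def noise_gate(signal, frame_len=256, threshold=0.02, attenuation=0.1):
    """Illustrative noise gate: frames whose RMS level falls below the threshold are
    attenuated, while louder (speech-like) frames pass through unchanged."""
    gated = signal.astype(float)
    for start in range(0, len(gated), frame_len):
        frame = gated[start:start + frame_len]
        rms = float(np.sqrt(np.mean(frame ** 2))) if frame.size else 0.0
        if rms < threshold:
            gated[start:start + frame_len] = frame * attenuation
    return gated

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    quiet = 0.005 * rng.standard_normal(1024)                           # background-only section
    loud = 0.2 * np.sin(2 * np.pi * np.arange(1024) * 440 / 16000.0)    # utterance-like section
    out = noise_gate(np.concatenate([quiet, loud]))
    print("quiet section max:", float(np.max(np.abs(out[:1024]))))
    print("loud section max: ", float(np.max(np.abs(out[1024:]))))
```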
  • the plurality of microphones 114 comprises a phased microphone array.
  • a plurality of speakers are mounted in a spaced apart arrangement on either or both of a top end and a bottom end of the display screens 102.
  • each speaker is mounted at a central position along the length of a bottom end of a front face of each of the display screens 102 or the bottom display screens 102 in the event of a grid formation of the display screens 102.
  • each speaker is mounted at a central position along the length of a top end of a front face of each of the display screens 102 or the top display screens 102 in the event of a grid formation of the display screens 102.
  • the speakers (not shown) are mounted on alternating top and bottom ends of adjacent display screens 102.
  • the speakers are arranged in a spaced apart arrangement with a longitudinal distance of no more than 1 metre between successive speakers (not shown).
  • each speaker is spaced apart from each of the microphones 114.
  • the plurality of speakers comprises a phased speaker array wherein the distance between the speakers and the number of speakers is configured to balance the requirements for control over higher and lower frequencies of emitted sound waves.
  • the railing system 200 of the user interface system is mounted on or alongside the external walls of a kiosk 220 of a drive-through restaurant (or other) facility. Specifically, the railing system 200 is mounted on or alongside the external walls that face a road 222 which at least partly encircles the kiosk 220.
  • One or more display screens 202 are mounted on the railing system 200 and arranged so that the front face of the display screen(s) 202 faces out towards the road 222.
  • the skilled person will understand that the configuration of the road 222 described above and depicted in Figure 6 is provided for example purposes only. In particular, the road 222 may only extend along one external wall of the kiosk 220.
  • the or each external wall and the road 222 may be curved.
  • the railing system 200 may be arcuate or serpentine in shape and configured to follow the curved external wall. To achieve this, a balance must be achieved between the requirement for the radius of curvature of an arcuate/serpentine railing system 200 to be sufficiently small to accommodate the smallest radius of curvature of the external wall; and the physical limitations imposed by the dimensions and curvature, if any, of the display screens 202 mounted on the railing system 200.
  • one or more customer vehicles 224a-224c are driven along the road 222 from an entry point 226 to an exit point 228 of the drive-through restaurant facility. Between the entry point 226 and the exit point 228, the road 222 may be provided with a plurality of spaced apart side-exit routes 230a, 230b, to one or more order pickup points (not shown), from which customers may retrieve their ordered items.
  • the Location Translator 78 translates the detected location of the customer vehicle 224a-224c into an identifier of the display screen 202 disposed closest to the customer vehicle 224a-224c.
  • the Location Translator 78 further translates the detected location of the customer vehicle 224a-224c into Target Display Co-ordinates.
  • the Messaging Module 86 and the Screen Controller 80 cause a visual user interface 242a-242c to be displayed at the Target Display Co-ordinates.
  • the Messaging Module 86 and the Screen Controller 80 may cause a visual user interface 242a-242c to be moved to the Target Display Co-ordinates from a starting position on a display screen 202 mounted at either end of the railing system 200. Accordingly, a visual user interface 242a-242c is displayed to the occupants of each customer vehicle 224a-224c being driven on the road 222 and/or queued on the road 222.
  • a visual user interface 242a-242c may include a menu from which the occupants of the closest customer vehicle 224a-224c may select food items.
  • the visual user interface 242a-242c may detail special offers which the occupants of the closest customer vehicle 224a-224c are invited to select.
  • the visual user interface 242a-242c may be customised according to a known transaction history associated with a recognised customer vehicle or a recognised customer in a given customer vehicle 224a-224c.
  • the Messaging Module 86 and the Speaker Controller 84 may also activate one or more speakers (not shown) located within a pre-defined distance of the detected location of the customer vehicle 224a-224c to facilitate an audio message to the occupants of the corresponding customer vehicle 224a-224c.
  • the Messaging Module 86 and the Speaker Controller 84 may activate and establish one or more parameters for controlling the magnitude and phase of each of the analogue signals generated by the speakers of the phased speaker array system of Figure 2.
  • the magnitude and phase parameters are calculated to produce sound which is contained within a beam aimed at a given customer or a given customer vehicle. In this way, individual customers may receive different, and possibly customised, audio messages.
  • the occupants of a given customer vehicle 224a-224c may be invited to select items from the visual user interface 242a-242c by pressing a location on the closest display screen 202, the said location being within a predefined distance of a desired item displayed on the display screen 202.
  • the touching of the display screen by the customer is detected by the Touch Monitor 70 in communication with touch sensor elements in the display screen 202.
  • the Touch Monitor 70 is further configured to convert the detected location of the touching into a selection of an item from the visual user interface 242a-242c.
  • the occupants of the customer vehicle 224a-224c may be invited to select an item from the visual user interface 242a-242c by speaking aloud a name or other identifier of the item.
  • the microphones of the user interface system comprise a phased array of microphones.
  • the Microphone Controller 82 employs one or more beam-forming algorithms (not shown) to shape a listening beam from the phased array of microphones, wherein the listening beam is centred at a location within a pre-defined distance of the customer vehicle 224a-224c. Accordingly, the listening beam is optimally configured to detect sounds made by the occupants of the customer vehicle 224a-224c.
  • the Background Noise Detection Unit 98 operates the phased array of microphones to sample the sound field at locations other than those encompassed by the listening beam. The sounds detected at these locations are referred to as Background Noise.
  • the Sound Monitor 68 processes the User Sounds to extract or otherwise compensate for the Background Noise contamination thereof. The result of these processing operations is Filtered User Sounds.
  • a microphone 214 located closest to the detected location of the customer vehicle 224a-224c is activated to receive audio signals corresponding with the customer’s utterances.
  • the microphone 214 located closest to the detected location of the customer vehicle 224a-224c will be referred to as the Target Microphone Position.
  • microphones 214 located on either side of the Target Microphone Position and within a pre-defined distance of the Target Microphone Position are deactivated.
  • one or more features based on the time and frequency domain characteristics of the detected audio signals are examined to enhance the quality of the signal(s) received from the microphone at the Target Microphone Position.
  • the examined features are independent of the signal amplitude. Thus, the examined features are unaffected by variations in the diction and loudness of individual customers, and by differences in individual microphone setup.
  • the audio signals may be received from the customer’s own cell phone or other wireless device (by way of the antenna unit in the communications unit 8). Whether received from one or more microphones 214 mounted on the display screen(s) 202 or from the customer’s own cell phone or other wireless device, the received audio signals are processed by speech recognition and language processing algorithms 46 to recognize selections made by the customer and/or other instructions from the customer.
  • the occupants of the customer vehicle 224a-224c may be invited to select an item from the visual user interface 242a-242c by making appropriate gestures towards the displayed required items.
  • video footage captured by video camera(s) mounted on the upright camera support members 208 or on the top edges of the display screens 202 is processed using gesture recognition algorithms to detect and recognize gestures from the customer (not shown) and thereby receive the customer’s order.
  • the Billing/Payment Module 62 calculates the bill for the selected food items; and requests the customer (not shown) for payment of the bill.
  • the Messaging Module 86 and the Screen Controller Module 80 are activated to display on the display screens 202 the bill total together with an instruction to the customer regarding where to place or otherwise position their payment card (or other NFC enabled payment medium such as smart fobs, smart cards, cell phones or other wireless devices) to make payment of the bill.
  • the user interface system comprises a series of NFC reader antennas arranged along a longitudinal axis of the display screens 202.
  • the Transaction Co-ordination Engine 52 receives from the Video Monitor 66 the location at which the customer placed or otherwise positioned their payment card (or other NFC enabled payment medium).
  • the Transaction Co-ordination Engine 52 identifies to which of the NFC reader antennas the customer presented their payment card (or other NFC enabled payment medium).
  • the PICC Monitor 72 operates the identified NFC antenna-based payment unit, as shown in Figure 3, to detect the presence of the customer’s payment card (or other NFC enabled payment medium)/PICC within a predefined distance of the payment unit.
  • the PICC Monitor 72 implements a secure communications protocol with the PICC to receive payment for the ordered items.
  • the PICC Monitor 72 transmits a payment confirmation message to the Billing/Payment Module 62.
  • the payment confirmation message may include an identifier of the PICC Monitor 72 from which the payment confirmation message originated.
  • the Billing/Payment Module 62 displays a QR code to a customer on the visual user interface 242a-242c.
  • the QR code comprises an embedded link to the Billing/Payment Module 62 which may be accessed by the customer’s wireless device to enable payment to be made from the wireless device.
  • the Backend Co-ordination Engine 64 transmits the received customer order to a back-end operation (not shown) with instructions to fulfil the customer order.
  • the visual user interface 242a-242c may also be activated to provide directions to an order pickup point (not shown) at which the customer may collect their order. For example, the driver may be directed to veer the customer vehicle 224a-224c away from the road (222 in Figure 6) onto the nearest appropriate side road (230a, 230b in Figure 6) to reach the order pickup point (not shown).
  • the visual user interface 242a-242c may also be activated to display advertisements, promotion messages or entertaining videos to the customers as they wait for their order to be completed.
  • the visual user interface 242a-242c may be used for market research and/or product testing purposes.
  • the visual user interface 242a-242c may be used to demonstrate product samples to a waiting customer and thereafter conduct a brief survey on the product sample just demonstrated. This enables the real-time collection and analysis of customer survey results, supporting assessment of a trial product’s likelihood of future success across detailed demographic and geographic variables.
  • the user interface system enables a retailer to work with its consumer product goods partners to modify how, when and why promotional and other marketing tactics and strategies are deployed.
  • on departure of a customer vehicle 224a-224c, the visual user interface 242a-242c associated with that customer vehicle 224a-224c will be deactivated. Furthermore, the visual user interfaces 242a-242c associated with the customer vehicles 224a-224c behind the departing customer vehicle 224a-224c in the queue will follow these customer vehicles 224a-224c as they move to close the gap formed by the departing customer vehicle 224a-224c. Additionally, a new visual user interface 242a-242c will be activated for a detected newly incoming customer vehicle 224a-224c.
  • the user interface system 300 comprises one or more display screens 302 mounted on a side wall located within a pre-defined distance of a side of a moving walkway (“travelator”) 304. In one embodiment, the display screens 302 may be mounted on either side of the travelator 304.
  • on detecting the presence of a person 306 on the travelator 304, the user interface system 300 detects the location of the person 306 relative to the display screen(s) 302. The user interface system 300 translates the detected location of the person into co-ordinates on the display screen 302 closest to the person 306.
  • the co-ordinates represent a position closest to the person 306 on the display screen 302.
  • the user interface system 300 displays a visual user interface 308 centered at the co-ordinates on the display screen 302.
  • the visual user interface 308 may include information for visitors, advertisements or entertainment videos.
  • the user interface system 300 tracks the movement of the person 306 on the travelator 304 and causes the position of the visual user interface 308 to be moved to follow the movements of the person 306. In this way, the visual user interface 308 moves along the longitudinal axis of the display screen 302 at a speed that substantially matches the speed of movement of the person 306. For example, in the event the person 306 stands still on the walkway 304, the visual user interface 308 moves along the longitudinal axis of the display screen 302 at substantially the same speed as the travelator 304 moves.
  • the user interface system 400 comprises one or more display screens 402 mounted on a side wall located within a pre-defined distance of a side of the escalator/moving walkway 404. In one embodiment, the display screens 402 may be mounted on either side of the escalator/moving walkway 404.
  • on detecting the presence of a person 406 on the escalator/moving walkway 404, the user interface system 400 detects the location of the person 406 relative to the display screen(s) 402. The user interface system 400 translates the detected location of the person into co-ordinates on the display screen 402 closest to the person 406. The co-ordinates represent a position closest to the person 406 on the display screen 402. The user interface system 400 displays a visual user interface 408 centered at the co-ordinates on the display screen 402. The visual user interface 408 may include an entertaining video or a peaceful image to distract the person 406 from the annoyance of their constrained movement on the escalator/moving walkway 404.
  • the visual user interface 408 may include advertising promotions or informational videos.
  • the user interface system 400 tracks the movement of the person 406 on the escalator/moving walkway 404 and causes the position of the visual user interface 408 to be moved to follow the movements of the person 406. In this way, the visual user interface 408 moves along the longitudinal axis of the display screen(s) 402 at a speed that substantially matches the speed of movement of the escalator/moving walkway 404. For example, in the event the person 406 stands still on the escalator/moving walkway 404, the visual user interface 408 moves along the longitudinal axis of the display screen 402 at substantially the same speed as the escalator/moving walkway 404 moves.
  • the user interface system 450 comprises one or more display screens 452 mounted in sequence on a wall (not shown) parallel to which one or more people are allowed to queue, wherein an interaction region of the user interface system 450 is formed in an area in front of the wall and display screens 452 where people are allowed to queue.
  • the user interface system 450 detects the location of the person 454 relative to the or each of the display screen(s) 452. The user interface system 450 translates the detected location of the person 454 into co-ordinates on the display screen 452 closest to the person 454.
  • the co-ordinates represent a position closest to the person 454 on the display screen 452.
  • the user interface system 450 displays a visual user interface 456 centered at the co-ordinates on the display screen 452.
  • the user interface system 450 may also be used in areas where people are allowed to wait, but not necessarily in an ordered queue.
  • the visual user interface 456 may include information, advertisements or an interface to a booking facility.
  • the visual user interface 456 may include an interface to a concert ticket booking system, a movie ticket booking system for use in a cinema, a seat ticket booking system in a sports venue, or a flight or a hotel room booking system or the like.
  • the user interface system 450 tracks the movement of the person 454 as they progress in the queue.
  • the user interface system 450 causes the position of the visual user interface 456 to be moved to follow the movements of the person 454. In this way, the visual user interface 456 moves along the longitudinal axis of the display screens 452 at a speed that substantially matches the speed of movement of the person 454 in the queue.
  • the person may purchase a ticket for a concert, movie, sports event, or a flight or a hotel room while standing in a queue.
  • the user interface system 460 comprises one or more display screens 462 mounted in sequence on a wall (not shown).
  • One or more queues 464a, 464b, 464c of one or more people 466 are allowed to form in a substantially perpendicular arrangement relative to the wall (not shown).
  • An interaction region of the user interface system 460 is formed in an area in front of the wall and display screens 462 where people are allowed to queue.
  • on detecting the approach of a person 466 to one or more of the display screen(s) 462, the user interface system 460 detects the location of the person 466 relative to the or each of the display screen(s) 462. More specifically, bearing in mind that people are of different heights and may adopt different postures (for example, stooping, stretching or bending), the user interface system 460 detects the location of the person’s face relative to the display screen(s) 462.
  • the user interface system 460 translates the detected location of the person’s face into co-ordinates on the display screen 462 closest to the person 466.
  • the co-ordinates represent a position closest to the person’s face on the display screen 462.
  • the user interface system 460 displays a visual user interface 468 centered at the co-ordinates on the display screen 462.
  • the displayed location of the visual user interface 468 is adaptable to either or both of the height and posture of a person 466 using the user interface system 460.
  • the user interface system 460 tracks the movement of the person 466 in the event they change posture (e.g. bending down to pick something up or to look in a bag, etc.).
  • the user interface system 460 causes the position of the visual user interface 468 to be moved to follow the movements of the person 466. In this way, the visual user interface 468 moves along either or both of the vertical axis and the longitudinal axis of the display screens 462 at a speed that substantially matches the speed of movement of the person 466.
  • the visual user interface 468 may include information, advertisements or an interface to a booking facility.
  • the visual user interface 468 may include an interface to a concert ticket booking system, a movie ticket booking system for use in a cinema, a seat ticket booking system in a sports venue, or a flight or a hotel room booking system or the like.
  • the person 466 may purchase a ticket for a concert, movie, sports event, or a flight or a hotel room.
  • the person 466 may purchase a ticket for a concert, movie, sports event, or a flight or a hotel room while changing their posture and moving in front of the display screens 462.
  • the user interface system 460 detects the next person 466 in the queue 464a, 464b, 464c as they approach the display screen(s) 462 and adjusts the displayed location of the visual user interface 468 according to either or both of the height and posture of the said next person 466.
  • a visual interface method 500 as implemented in a drive-through restaurant use case scenario comprises the steps of:
  • the user interface system 702 may comprise one or more output devices 704 extending or spaced along, upon or within an interaction region, a sensor 706 configured to provide a signal from which a position of a user within the interaction region can be determined, and a processor 708 configured to receive the signal from the sensor, identify a target location within the interaction region based on the determined position of the user, and provide an output from the one or more output devices at the target location.
  • the sensor 706 may comprise one or more video cameras which have the interaction region in their field of view, as described previously.
  • other types of sensors may be used to provide an output which is indicative of a position of a user within the interaction region or from which this can be determined.
  • the user may be located within a vehicle and the position of the vehicle may be used to determine the target location.
  • the processor 708 may use suitable detection algorithms, as described previously, to identify the location of the user from the sensor signal.
  • the one or more output devices 704 may comprise a display screen extending along, within or upon the interaction region, such as the display screen 102 described previously. As described previously, the display screen may be formed by a plurality of display screen units spaced along, within or upon the interaction region.
  • the target location may be a location on the display screen at which a visual output, such as a graphical user interface, is displayed. In particular, the target location may be expressed as coordinates on the display screen and/or a reference address of a specific display screen unit.
  • the one or more output devices 704 may comprise a plurality of loudspeakers spaced along, within or upon at least part of the interaction region.
  • the processor 708 may be configured to provide a sound output from a selected one of the plurality of loudspeakers which is closest to the target location.
  • the processor may be configured to operate the plurality of loudspeakers as a phased array so as to provide a directional sound output (via beam-steering) from the plurality of loudspeakers which is directed towards the target location. In either arrangement, it is intended that the sound output is received predominantly at the target location and not by other users at different locations. Accordingly, this allows multiple users to interact with the user interface system 702 at the same time without confusion.
  • the loudspeakers may be communicatively coupled to a microphone which allows an operator to speak directly to the user at the target location.
  • the loudspeakers may project pre-recorded or computer-generated sounds to the user at the target location.
  • the output devices 704 allow an output, such as a visual and/or audio output, to be directed to the specific location of the user as determined by the sensor 706. Accordingly, the user does not need to be located at a predefined position within the interaction region, such as at a service window or intercom point.
  • the user interface system 702 may also comprise one or more input devices 710 which are communicatively coupled to the processor 708.
  • the one or more input devices may comprise a plurality of microphones.
  • the user may provide an input to the user interface system 702 via one or more of the microphones.
  • utterances from the user may be captured by the microphone located closest to the user’s position.
  • the plurality of microphones may be operated as a phased array.
  • the processor may use beam-steering techniques to effectively “point” the microphones towards the target location, thereby improving the signal-to-noise ratio of the audio signal from that user.
  • the one or more input devices may comprise one or more touch panels.
  • the touch panels may be integrated with the display screen described previously.
  • the touch panels may allow a user to provide inputs for an order, such as selecting menu items displayed on the display screen.
  • the one or more input devices may comprise one or more payment devices, such as card reader devices 112 described previously, which allow the user to effect payment for their order.
  • the processor may be configured to track the user as they move within the interaction region and to continually or periodically update the target location so as to provide an output from the one or more output devices which moves with the user within the interaction region.
  • audio outputs from the loudspeakers and visual outputs on the display screen may be coordinated to follow the current position of the user. In this way, an interaction between the user and the user interface system can be maintained regardless of the movement of the user.
  • the target location may only be refreshed and the output from the one or more output devices moved to a new position when the user is stationary.
  • the output may be paused while the vehicle is moving in order to avoid distracting the user and may recommence when the vehicle comes to a stop.
  • Figure 16 shows a flowchart of a user interface method 802 according to an embodiment of the invention which may be implemented using the user interface system 702 described previously.
  • the method determines a position of a user within an interaction region using a sensor, such as sensor 706.
  • the sensor may provide a raw signal which is used by a processor to determine the position of the user using, for example, suitable detection algorithms.
  • a target location within the interaction region is identified based on the determined position of the user using the processor 708, for example.
  • one or more output devices which extend or are spaced along, within or upon the interaction region are controlled so as to provide an output to the user at the target location (an illustrative sketch of these method steps is provided after this list).
  • the user interface system of the present invention may be adapted to display visual information and/or play audio information to (and potentially to receive information/requests and/or payment from) each of one or more individuals arranged at potentially different locations in a fixed or moving queued arrangement.
  • the interface is adapted to visually present the information at the location (or moving location) of an individual on a visual display medium such as a display screen disposed within a pre-defined distance of the individual.
  • the interface is adapted to play the audio information or messages at the location (or moving location) of an individual from one or more members of a speaker array disposed within a pre-defined distance of the individual.
  • the interface may be further adapted to present information to an individual at a point ahead of their current location(s) along their direction of travel, to lead the individual to a next position. Accordingly, the interface is adapted to deliver real-time vertical and horizontal movements of the presented information to accommodate variations in the heights and movement trajectories of different users.
  • the content provided to the user may comprise personalised and/or interactive elements.
  • the user interface system of the present invention may be used to support automatic order taking and payment receipt from one or more customers in a drive-through environment.
  • the user interface system is adaptive to the movements and dimensions of customer vehicles to provide a more physically effective engagement with the occupants thereof. Further, the user interface system supports substantially simultaneous order-taking from a plurality of customers in a sequential linear queuing system, using a plurality of substantially independently-movable and simultaneously-operable visual user interfaces and audio user interfaces.
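Purely by way of illustration of the method steps summarised above (and not forming part of the disclosed subject matter), the following Python sketch shows one possible mapping from a sensed user position to a target location and the selection of the nearest output device. The class names, device spacing and region length are invented for the example.
```python
from dataclasses import dataclass

@dataclass
class OutputDevice:
    device_id: str
    position_m: float  # position of the device along the interaction region, in metres

def identify_target_location(user_position_m: float, region_length_m: float) -> float:
    """Clamp the sensed user position to the extent of the interaction region."""
    return max(0.0, min(user_position_m, region_length_m))

def select_output_device(devices, target_location_m):
    """Select the output device closest to the target location."""
    return min(devices, key=lambda d: abs(d.position_m - target_location_m))

def user_interface_method(sensed_positions_m, devices, region_length_m):
    """Steps of the method: determine position, identify target location, drive outputs."""
    for position in sensed_positions_m:          # positions reported by the sensor over time
        target = identify_target_location(position, region_length_m)
        device = select_output_device(devices, target)
        print(f"user at {position:4.1f} m -> output on {device.device_id}")

if __name__ == "__main__":
    screens = [OutputDevice(f"screen-{i}", i * 2.0) for i in range(5)]  # screens every 2 m
    user_interface_method([0.4, 2.7, 5.1, 7.9], screens, region_length_m=8.0)
```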

Abstract

There is described a user interface system (702) comprising: one or more output devices (704) extending or spaced along, within or upon an interaction region; a sensor (706) configured to provide a signal from which a position of a user within the interaction region can be determined; and a processor (708) configured to receive the signal from the sensor, identify a target location within the interaction region based on the determined position of the user, and provide an output from the one or more output devices at the target location. A user interface method is also described.

Description

A USER INTERFACE SYSTEM AND METHOD
The present disclosure relates to a user interface system and method and particularly, but not exclusively, to a user interface system and method which is responsive to a location of a user.
BACKGROUND
In the wake of COVID-19, social distancing has become an essential component in the armoury to stop the spread of the disease. In customer-facing services, the isolation of customers from other customers and staff members is especially important. For example, while drive-through restaurant lanes have been used for decades as a driver of sales at fast food chains, demand for such facilities has recently increased as pandemic restriction measures have forced the closure of indoor dining restaurants. A drive-through restaurant uses customer vehicles and their ordered progression along a road to effectively isolate customers from each other. The advantages of the drive-through model have seen its adoption by many other sectors over the years, including drive-through parcel centres, grocery stores, etc.
Slow service and long queues are a significant customer deterrent in a drive-through facility. However, the throughput of any sequential linear service system is inherently limited by the speed of the slowest service access operation. One way of solving this problem is to provide parallel access channels, spreading access request traffic across the multiple access channels so that a particularly slow access request in one channel does not impact the throughput of the other channels. In the case of a drive-through facility, multiple access channels are represented by multiple service lanes into which individual vehicles can queue, so that the occupants of several vehicles can be served at the same time. However, parallel access channel systems have a significantly larger footprint than sequential linear systems. Where space is plentiful, the larger footprint of parallel access channel systems is not problematic and such systems are a useful way of increasing throughput. However, in environments where space is less available (and/or real estate is costly), alternative approaches must be adopted to increase throughput.
The present disclosure describes a user interface system and method which seeks to address problems and shortcomings associated with conventional arrangements, such as those described above.
STATEMENTS OF INVENTION
In accordance with an aspect of the invention, there is provided a user interface system comprising: one or more output devices extending or spaced along, upon or within an interaction region; a sensor configured to provide a signal from which a position of a user within the interaction region can be determined; and a processor configured to receive the signal from the sensor, identify a target location within the interaction region based on the determined position of the user, and provide an output from the one or more output devices at the target location.
The one or more output devices may comprise a display screen extending along within or upon the interaction region.
The target location may be a location on the display screen at which a visual user interface is displayed.
The display screen may be formed by a plurality of display screen units spaced along, within or upon the interaction region.
The one or more output devices may comprise a plurality of loudspeakers spaced along, within or upon at least part of the interaction region.
The processor may be configured to provide a sound output from a selected one of the plurality of loudspeakers which is closest to the target location.
The processor may be configured to operate the plurality of loudspeakers as a phased array so as to provide a directional sound output from the plurality of loudspeakers which is directed towards the target location.
The user interface system may further comprise one or more input devices which are communicatively coupled to the processor.
The one or more input devices may comprise a plurality of microphones.
The processor may be configured to operate the plurality of microphones as a phased array.
The one or more input devices may comprise one or more touch panels.
The one or more input devices may comprise one or more payment devices.
The processor may be configured to track the user as they move within the interaction region and to continually or periodically update the target location so as to provide an output from the one or more output devices which moves with the user within the interaction region.
The interaction region may be a drive-through facility, an escalator, an inclined walkway, a travelator, or a waiting or queuing area.
In accordance with another aspect, there is provided a user interface method comprising: determining a position of a user within an interaction region using a sensor; identifying a target location within the interaction region based on the determined position of the user; and controlling one or more output devices extending or spaced along, within or upon the interaction region so as to provide an output to the user at the target location.
BRIEF DESCRIPTION OF THE DRAWINGS
For a better understanding of the invention, and to show more clearly how it may be carried into effect, reference will now be made, by way of example, to the accompanying drawings, in which:
Figure 1 is a block diagram of the hardware components of a user interface system according to an embodiment of the invention;
Figure 2 is a block diagram of a phased speaker array system of the user interface system;
Figure 3 is a block diagram of a Near Field Communications (NFC) antenna-based payment unit of the user interface system;
Figure 4 is a block diagram of the software components of the user interface system;
Figure 5 is a perspective view of an embodiment of a screen system of the user interface system;
Figure 6 is a perspective view of the user interface system in a drive-through restaurant use case scenario;
Figure 7 is a front elevation view of the user interface system in the drive-through restaurant use case scenario;
Figure 8 is a schematic plan view of the user interface system in a drive-through restaurant use case scenario with an arcuate arrangement of a railing system and display screens mounted thereon;
Figure 9 is a perspective view of the user interface system in the drive-through restaurant use case scenario of Figure 6, with certain features removed for ease of viewing;
Figure 10 is a perspective view of the user interface system in a moving walkway use case scenario;
Figure 11 is a perspective view of the user interface system in an inclined moving walkway or escalator use case scenario;
Figure 12 is a schematic perspective view of a first wall-based use case scenario;
Figure 13 is a schematic perspective view of a second wall-based use case scenario;
Figure 14 is a flowchart of a user interface method in a drive-through restaurant use case scenario;
Figure 15 is a block diagram of a user interface system according to an embodiment of the invention; and
Figure 16 is a flowchart of a user interface method according to an embodiment of the invention.
DETAILED DESCRIPTION
Figure 1 shows a block diagram of the hardware components of a user interface system 2 according to an embodiment of the invention.
The following description of the user interface system of the present invention will focus on a drive-through facility (e.g. restaurant) use case scenario. In another use case, the system is operable in other environments of substantially linear travel, for example to present information to individuals on a moving walkway (“travelator”), inclined moving walkway or beside an escalator. It will be understood that the user interface system may be configured differently for such use cases. For example, for safety, it may not be advisable for the user interface system to act as a payment receiving system for passengers on an escalator, which might require the passengers to remove their hands from the escalator’s handrail. Thus, payment receiving aspects of the user interface system described below may be removed therefrom in the escalator use case scenario.
The user interface system 2 comprises a screen system 4 communicatively coupled with one or more sensors 6 and a communications unit 8. The communicative coupling may be implemented through either or both of a wired or a wireless communication network. The screen system 4 comprises one or more computer monitors or TV screens, including flat screen or curved LCD, LED, OLED, projection TV screen, etc. The screen system 4 may further comprise one or more touch screens comprising one or more touch sensor elements adapted to detect the touching of the or each touch screen.
The sensor(s) 6 may comprise one or more video sensors (e.g. video camera), one or more audio sensors (e.g. a microphone) and one or more proximity sensors. The audio sensor(s) may comprise a phased array of microphones operable by a Digital Signal Processing (DSP) unit (not shown) to measure a sound field at different microphone positions. The proximity sensors may comprise one or more of an infrared sensor, EMF induction sensor, capacitance sensor or ultrasonic sensor. The skilled person will understand that these sensors are mentioned for illustration purposes only. In particular, the skilled person will understand that the user interface system 2 of the present invention is not limited to the use of the above-mentioned sensors. On the contrary, the user interface system 2 of the present invention is operable with any combination of sensors which permit detection of the presence of a nearby object and the location thereof; permit detection of features of the object sufficient to enable classification thereof; and permit detection of utterances or gestures made by a user of the user interface system 2. The sensor(s) 6 may be mountable on the screen system 4. Alternatively or additionally, the sensor(s) 6 may be mountable elsewhere in the environment in which the user interface system 2 is used; and within a pre-defined distance from the screen system 4 thereof.
The communications unit 8 may comprise an antenna unit (not shown) to permit communication with a remotely located cell phone or another wireless device. The communications unit 8 may also comprise one or more speakers (not shown) adapted to issue an audible message to a customer. The speaker(s) (not shown) may comprise a phased speaker array system. Referring to Figure 2, the phased speaker array system 14 may comprise a plurality of speakers 16 each of which is fed from a single audio source 18. Each speaker 16 transmits sound delayed by an amount which is related to the distance between the speaker 16 and a selected point or region in space. In this way, the wave front from each speaker 16 arrives at the target at the same time. Soundwaves from different speakers 16 combine or cancel depending on the relative phase relationship between the soundwaves. In addition, the amplitude of the sound output by each speaker 16 may also be proportional to the distance between the speaker 16 and the selected point or region in space.
The audio source 18 for the phased speaker array system 14 may comprise an analogue signal or a digital signal (for example, from a digital sound recording). In the event the audio source 18 comprises an analogue signal, it may first be transmitted to a pre-amplifier 20 to be re-biased and amplified. The output from the pre-amplifier 20 may be transmitted to an analogue to digital converter (ADC) 22 to be converted into a digital signal. In the event the audio source 18 comprises a digital signal, the pre-amplifier 20 and the ADC 22 are not needed.
The digital signal (from the audio source 18 or the ADC 22) may be transmitted to a Speaker Microcontroller Unit 24 which may comprise a Read Only Memory (ROM) unit (not shown) that contains firmware configured to control the operation of the Speaker Microcontroller Unit 24. Specifically, the Speaker Microcontroller Unit 24 may be adapted to convert a received digital signal into a plurality of phase-shifted digital signals each of which corresponds with one of the speakers 16. The phases of each of the plurality of digital signals may be calculated by the firmware according to a required sound profile at a target location and within a predefined distance thereof. The plurality of phase-shifted digital signals from the Speaker Microcontroller Unit 24 may be transmitted to a digital to analogue converter (DAC) unit 26 to be converted into a corresponding plurality of phase-shifted analogue signals. The plurality of analogue signals from the DAC unit 26 may be passed to one or more speaker amplifiers 28 which may perform both or either of applying a low-pass filter to the or each received analogue signal and buffering the same, before outputting the or each of the resulting analogue signals to a corresponding speaker 16.
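As a minimal illustrative sketch of how the per-speaker delays and amplitudes described above might be derived (the speaker spacing, focal point and speed of sound are assumed example values, not values taken from the disclosure):
```python
import numpy as np

SPEED_OF_SOUND = 343.0  # metres per second, approximately, in air at room temperature

def focusing_delays(speaker_positions, target, speed_of_sound=SPEED_OF_SOUND):
    """Return per-speaker delays (s) and gains so that emitted wavefronts coincide at the target.

    The speaker farthest from the target is driven first (zero delay); nearer speakers
    are delayed by the difference in propagation time.
    """
    positions = np.asarray(speaker_positions, dtype=float)          # shape (N, 2): x, y in metres
    distances = np.linalg.norm(positions - np.asarray(target, dtype=float), axis=1)
    travel_times = distances / speed_of_sound
    delays = travel_times.max() - travel_times                      # nearer speakers wait longer
    gains = distances / distances.max()                             # amplitude scaled with distance, per the description above
    return delays, gains

if __name__ == "__main__":
    # Eight speakers spaced 0.5 m apart along a rail, focusing on a point 2 m in front of the rail.
    speakers = [(0.5 * i, 0.0) for i in range(8)]
    delays, gains = focusing_delays(speakers, target=(1.75, 2.0))
    for i, (d, g) in enumerate(zip(delays, gains)):
        print(f"speaker {i}: delay {d * 1e3:5.2f} ms, gain {g:.2f}")
```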
Returning to Figure 1, the communications unit 8 may also comprise a transmitter unit 10 communicatively coupled with a corresponding receiver unit (not shown) in a backend operational unit (not shown) to transmit an order received from the customer to the backend operational unit (not shown). The hardware components of the user interface system 2 may further comprise a payment unit 12 which may include one or more payment devices comprising any of a contact-based card reader unit (for example, a magnetic stripe card reader), a contactless card reader unit (for example, a smart card reader) or a combination of both.
Referring to Figure 3, in one embodiment the payment unit 12 may comprise a Near Field Communications (NFC) antenna. The payment unit 12 may comprise an ISO/IEC 14443 reader antenna comprising an antenna coil 30, which is matched to a Reader Integrated Circuit Unit 32 by a matching circuit 34. The matching circuit 34 comprises resistance and capacitance elements configured to deliver a target impedance value (that balances the requirements of RF output power against power supply requirements) and Q factor (that balances the requirements of current in the antenna coil 30 and corresponding power transmission, with the transmission bandwidth of the antenna coil 30). Together the antenna coil 30 and the matching circuit 34 form a resonance circuit. In some examples, a series of NFC reader antennas may be arranged along a longitudinal axis of the screen(s).
Together with a Secure Access Module 36, the Reader Integrated Circuit Unit 32 may be communicatively coupled to a Reader Microcontroller Unit 38. The Reader Microcontroller Unit 38 may comprise a Read Only Memory (ROM) containing firmware for controlling the Reader Integrated Circuit Unit 32 and the Secure Access Module 36. The Secure Access Module 36 may implement a secure protocol for communications between the payment unit 12 and a payment card 40. For example, the Secure Access Module 36 may compute ephemeral challenges, responses, session keys, etc. for authentication and encryption that are integrated into a communication protocol by a reader application, which may comprise the firmware in the Reader Microcontroller Unit 38.
Using inductive coupling, the antenna coil 30 generates an electromagnetic field with a frequency of 13.56 MHz (RFID HF) to:
• provide power to operate a payment card 40 (also known as a proximity integrated circuit card (“PICC”));
• transmit secure communications protocol data to the PICC 40, from either or both of the Reader Integrated Circuit Unit 32 and the Secure Access Module 36; and
• receive corresponding payment confirmation data from the PICC 40.
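As an illustrative aside only (the coil inductance used below is an assumed value, not one given in the disclosure), the resonance condition for the antenna coil and its matching capacitance follows the standard LC formula f = 1/(2π√(LC)), from which the required tuning capacitance can be computed:
```python
import math

TARGET_FREQUENCY_HZ = 13.56e6   # RFID HF carrier used by ISO/IEC 14443 readers

def tuning_capacitance(coil_inductance_h, frequency_hz=TARGET_FREQUENCY_HZ):
    """Capacitance that resonates with the given coil inductance: C = 1 / ((2*pi*f)^2 * L)."""
    omega = 2 * math.pi * frequency_hz
    return 1.0 / (omega ** 2 * coil_inductance_h)

if __name__ == "__main__":
    coil_l = 1.0e-6  # an assumed 1.0 microhenry antenna coil
    c = tuning_capacitance(coil_l)
    print(f"tuning capacitance for {coil_l * 1e6:.1f} uH at 13.56 MHz: {c * 1e12:.0f} pF")
```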
Returning to Figure 1, alternatively, in further embodiments of the payment unit 12, its payment devices may comprise a radio frequency tag reader or a near field communication enabled reader device (e.g. radio frequency tag reader or a near field tag reader capable of reading smart fobs, smart cards, cell phones or other wireless devices to receive payment therefrom). The payment device(s) may be communicatively coupled with a multiplexer unit (not shown) which is adapted to multiplex the signals received from the payment device(s).
Referring to Figure 4 together with Figure 1, the software components 50 of the user interface system of the present invention comprise a Transaction Co-ordination Engine 52 communicatively coupled with a Detector Unit 54, an Interface Controller 56, a Client Engagement Module 58, a Client Communications Receiver 60, a Billing/Payment Module 62 and a Backend Co-ordination Engine 64. The Transaction Co-ordination Engine 52 is adapted to co-ordinate the activities of the Detector Unit 54, Interface Controller 56, Client Engagement Module 58, Client Communications Receiver 60, Billing/Payment Module 62 and Backend Co-ordination Engine 64 to detect the presence and location of a customer, present a visual user interface to the customer, play audio information/messages to the customer, receive an order or other request from the customer, receive payment from the customer and communicate the order or request to an operational unit (not shown) for fulfilment thereof.
The Detector Unit 54 comprises a Video Monitor 66, a Sound Monitor 68 and a Touch Monitor 70. The Video Monitor 66 is adapted to receive and fuse video footage (vid(t)) captured by one or more video camera members of the sensor(s) 6. In another embodiment, the Detector Unit 54 is adapted to receive and fuse video footage (vid(t)) captured by the video camera(s) with output signals from proximity sensors(s) to establish a multi-spectral view of an observed region. The Sound Monitor 68 is adapted to receive audio signals captured by members of a phased array of microphones of the sensor(s) 6 to measure a sound field in the observed region. The Touch Monitor 70 is adapted to receive signals from touch screen member(s) of the screen system 4, wherein the signals indicate the detected touching of one or more of the touch screen member(s). The Detector Unit 54 may further optionally comprise a PICC Monitor 72. The PICC Monitor 72 is adapted to operate the NFC antenna-based payment unit 12, as shown in Figure 3, to detect the presence of a payment card/PICC within a predefined distance of the payment unit 12.
The Video Monitor 66 comprises an Object Locater Module 74 which is adapted to use an object detector algorithm to detect from received video footage the presence of an object (not shown) within a pre-defined distance of the screen system 4; and to determine the location of the detected object relative to the screen system 4. For example, the Object Locater Module 74 may be adapted to determine a vertical and/or horizontal distance between a centroid of the detected object and the centroid of the screen system 4. In one embodiment, the location of a detected object is represented by the co-ordinates of a bounding box which is configured to enclose the object. The co-ordinates of the bounding box are established with respect to the co-ordinate system of a video frame received from a video camera member of the sensor(s) 6.
In this case, the Object Locater Module 74 is configured to use pre-configured information regarding the physical dimensions and layout of the video camera(s) relative to the screen system 4 to translate the co-ordinates of the bounding box enclosing a detected object into horizontal and/or vertical distance measurements from the screen system 4.
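A minimal sketch of this coordinate translation might look as follows. The calibration values (the pixel position of the screen system's centroid in the camera frame and a pixels-per-metre factor) are assumed placeholders; in practice they would come from the pre-configured layout information referred to above.
```python
from dataclasses import dataclass

@dataclass
class BoundingBox:
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def centroid(self):
        return ((self.x_min + self.x_max) / 2.0, (self.y_min + self.y_max) / 2.0)

def offsets_from_screen(box: BoundingBox,
                        screen_centroid_px=(960.0, 540.0),   # assumed screen centroid in the video frame
                        pixels_per_metre=250.0):             # assumed calibration factor
    """Translate a detected object's bounding-box centroid into horizontal and vertical
    distances (metres) from the screen system's centroid."""
    cx, cy = box.centroid
    sx, sy = screen_centroid_px
    return (cx - sx) / pixels_per_metre, (cy - sy) / pixels_per_metre

if __name__ == "__main__":
    detection = BoundingBox(1200, 420, 1500, 720)   # example detection in frame coordinates
    dx, dy = offsets_from_screen(detection)
    print(f"object is {dx:+.2f} m horizontally and {dy:+.2f} m vertically from the screen centroid")
```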
The object detector algorithm in the Object Locater Module 74 may include the RetinaNet algorithm (as described in T.-Y. Lin, P. Goyal, R. Girshick, K. He and P. Dollár, Focal Loss for Dense Object Detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, (2020) 42, 318-327), YOLOv4 (as described in A. Bochkovskiy, C.-Y. Wang and H.-Y. M. Liao, 2020, arXiv:2004.10934) or EfficientDet (as described in M. Tan, R. Pang and Q. V. Le, EfficientDet: Scalable and Efficient Object Detection, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pp. 10778-10787). The skilled person will understand that the above-mentioned object detector algorithms are provided for example purposes only. In particular, the skilled person will understand that the present invention is not limited to the above-mentioned object-detection algorithms. Instead, the present invention is operable with any object detection algorithm capable of detecting the presence of an object and determining its location in a video frame.
The Video Monitor 66 may also comprise an Object Classifier Module 76 which is adapted to use an object classification algorithm to recognise the detected object and classify it as being one of the following: car, truck, motorcycle, bicycle, van or person. The skilled person will understand that the above-mentioned object classes are provided for example purposes only.
In particular, the skilled person will understand that the present invention is not limited to the detection of objects of the above-mentioned classes. Instead, the present invention is adaptable to determine the classification of any movable object that is detectable in a video frame. The object classification algorithm employed in the Object Classifier Module 76 may include a ResNet-101 convolutional neural network (as described in He K., Zhang X., Ren S. and Sun J., “Deep Residual Learning for Image Recognition”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016, pp. 770-778) or a DenseNet (as described in G. Huang, Z. Liu, L. van der Maaten and K. Q. Weinberger, Densely Connected Convolutional Networks, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 2017, pp. 2261-2269). The skilled person will understand that the above-mentioned object classification algorithms are provided for example purposes only. In particular, the skilled person will understand that the present invention is not limited to the above-mentioned object classification algorithms. Instead, the present invention is operable with any object classification algorithm capable of classifying an object detected within the Field of View/sensing range of a video camera and/or other sensor types including proximity sensor members of the sensor(s) 6.
The classification of the detected object may be used by the Detector Unit 54 to establish whether the detected object is an object of interest, for example, whether the object is a customer, as opposed to another type of movable object, for example, a cat, that might appear in the Field of View of a video camera. Using this information, the Transaction Co-ordination Engine 52 may be adapted to selectively determine instances in which to display a visual user interface on the screen system 4. For example, the Transaction Co-ordination Engine 52 may be adapted to display a visual user interface on the screen system 4 only in the event a customer or a vehicle is detected within a pre-defined distance of the screen system 4, thereby preventing unnecessary disturbances or changes in a current display on the screen system 4.
Alternatively, or additionally, the Detector Unit 54 may include a Vehicle Dimensions Database (not shown) comprising a plurality of vehicle class records each of which details the overall dimensions of a given one of a pre-defined group of classes of customer vehicles. In the event the detected object is classified by the Object Classifier Module 76 as being a vehicle, the Object Classifier Module 76 may be adapted to provide a more granular classification of the vehicle type and the Detector Unit 54 may be adapted to use the classification to retrieve a substantially matching record from the Vehicle Dimensions Database (not shown) to determine the location of the driver’s window or the front passenger’s window of the detected vehicle. This will provide a more refined estimate of the location of the occupant of the vehicle relative to the screen system 4, to permit more useful positioning of a visual user interface on the screen system 4 to improve its accessibility to the occupant of the vehicle. In another embodiment, the Object Classifier Module 76 is replaced with a Driver Detection Module (not shown) which is adapted to detect the driver (or other occupant) of a detected vehicle. The location of the driver is then used to establish a more refined estimate of the location of the occupant of the vehicle relative to the screen system 4.
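The refinement described above could, for example, be implemented as a simple lookup keyed on the granular vehicle class. The classes and offsets below are invented placeholder values provided purely for illustration.
```python
# Hypothetical vehicle-class records: offset (metres) from the detected front of the vehicle
# to the centre of the driver's window, and the approximate window height above ground.
VEHICLE_DIMENSIONS_DB = {
    "hatchback": {"driver_window_offset_m": 1.6, "window_height_m": 1.1},
    "suv":       {"driver_window_offset_m": 1.8, "window_height_m": 1.3},
    "van":       {"driver_window_offset_m": 1.2, "window_height_m": 1.5},
}

def refine_occupant_location(vehicle_front_x_m, vehicle_class, default_offset_m=1.5):
    """Estimate the along-rail position and height of the driver's window from the
    detected position of the vehicle front and its classified vehicle type."""
    record = VEHICLE_DIMENSIONS_DB.get(vehicle_class)
    if record is None:
        # Fall back to a generic offset when the class is not in the database.
        return vehicle_front_x_m + default_offset_m, 1.2
    return (vehicle_front_x_m + record["driver_window_offset_m"],
            record["window_height_m"])

if __name__ == "__main__":
    x, h = refine_occupant_location(vehicle_front_x_m=4.2, vehicle_class="suv")
    print(f"estimated driver window at {x:.1f} m along the rail, {h:.1f} m above ground")
```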
Recognising that an object may be at least partly occluded in a video frame, the object classification algorithm of the Object Classifier Module 76 may also be adapted to recognise a part of a detected vehicle, for example, a wing mirror of the vehicle. In this case, the Detector Unit 54 may include a Vehicle Parts Dimensions Database (not shown) which details the geometric relationships between different parts of a vehicle, to enable a horizontal and/or vertical distance to be calculated between the detected part of a vehicle and the driver’s window or the front-passenger’s window thereof. This distance may be treated as an offset factor to be applied to the detected distance between the detected part of the vehicle and the screen system 4, to permit positioning of a visual user interface on the screen system 4 to enhance its viewability by the user.
Alternatively or additionally, the Object Locater Module 74 may be coupled with a face detection algorithm, such as the RetinaFace architecture (as described in J. Deng, J. Guo, E. Ververas, I. Kotsia and S. Zafeiriou, RetinaFace: Single-stage Dense Face Localisation in the Wild, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 5203-5212), which is adapted to detect the presence and determine the location of a face in a video frame.
For brevity, the location of a detected object, or the location of the windows of a detected vehicle or the location of a detected face will be referred to henceforth as a Target Location.
The Target Location is defined with reference to the peripheries (or other landmark point/attribute) of the screen system 4. The Detector Unit 54 is adapted to transmit the Target Location to the Transaction Co-ordination Engine 52. The Detector Unit 54 may further optionally be adapted to transmit to the Transaction Co-ordination Engine 52 an indicator as to whether the detected object is an object of interest. The Transaction Co-ordination Engine 52 is adapted to transmit the Target Location to the Interface Controller 56.
The Interface Controller 56 is adapted to receive the Target Location and to establish therefrom control signals for:
• the positioning of a display on the screen system 4;
• the configuration of a listening beam for the phased array of microphones of the sensor(s) 6; and
• the configuration of a steered sound beam from the phased speaker array of the communications unit 8.
To this end, the Interface Controller 56 comprises a Location Translator Module 78 communicatively coupled with a Screen Controller Module 80. The Location Translator Module 78 comprises one or more geometric translation algorithms (not shown) adapted to translate the Target Location into a set of display co-ordinates. The set of display co-ordinates describes the co-ordinates of a position on the screen system 4 wherein the position is within a pre-defined distance of the Target Location. For brevity, this set of display co-ordinates will be referred to as the Target Display Co-ordinates. In the event the screen system 4 comprises a plurality of coupled display screens (including computer monitors and/or TV screens), the Target Display Co-ordinates includes an identifier of the display screen disposed closest to the Target Location and the co-ordinates of a position on the corresponding display screen, the position on the corresponding display screen being within a pre-defined distance of the Target Location. The Location Translator Module 78 is adapted to transmit the Target Display Co-ordinates to the Screen Controller Module 80 to cause the screen system 4 or a member thereof to become activated at the specified Target Display Co-ordinates.
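One straightforward geometric translation, sketched below on the assumption of identical screen units butted end-to-end along the interaction region (the screen width, height, resolution and unit count are illustrative values only), maps a Target Location to a screen identifier and pixel coordinates:
```python
def target_display_coordinates(target_x_m, target_y_m,
                               screen_width_m=1.2,        # assumed physical width of each screen unit
                               screen_height_m=0.675,     # assumed physical height of each screen unit
                               screen_px=(1920, 1080),    # assumed resolution of each screen unit
                               num_screens=6):
    """Translate a Target Location (metres along and above the rail) into an identifier of
    the closest display screen unit and pixel coordinates on that unit."""
    screen_index = int(target_x_m // screen_width_m)
    screen_index = max(0, min(screen_index, num_screens - 1))          # clamp to available units
    local_x_m = target_x_m - screen_index * screen_width_m
    px = int(local_x_m / screen_width_m * screen_px[0])
    py = int(max(0.0, min(target_y_m, screen_height_m)) / screen_height_m * screen_px[1])
    return screen_index, (px, py)

if __name__ == "__main__":
    screen_id, (px, py) = target_display_coordinates(target_x_m=4.3, target_y_m=0.4)
    print(f"display the visual user interface on screen {screen_id} at pixel ({px}, {py})")
```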
The Interface Controller 56 further comprises a Microphone Controller 82 which comprises one or more beam-forming algorithms (not shown), such as those described in H. Adel, M. Souad, A. Alaqeeli and A. Hamid, International Journal of Digital Content Technology and Its Applications, 2012, 6(20), 659-667, to establish one or more parameters of a spatio-temporal filter to be operated by the DSP unit (not shown) on the outputs of the phased array of microphones of the sensor(s) 6, to effectively shape a listening beam therefrom. In use, the listening beam is centered at the Target Location.
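A basic delay-and-sum beamformer is one of the simplest spatio-temporal filters of the kind referred to above; the following sketch is illustrative only, and the microphone geometry, sample rate and steering point are assumed values.
```python
import numpy as np

SPEED_OF_SOUND = 343.0  # metres per second

def delay_and_sum(mic_signals, mic_positions, focus_point, sample_rate_hz):
    """Align and average the microphone channels so that sound originating at focus_point
    adds coherently while off-beam sound tends to average out.

    mic_signals   : array of shape (num_mics, num_samples)
    mic_positions : array of shape (num_mics, 2), metres
    focus_point   : (x, y) in metres, the centre of the listening beam
    """
    signals = np.asarray(mic_signals, dtype=float)
    positions = np.asarray(mic_positions, dtype=float)
    distances = np.linalg.norm(positions - np.asarray(focus_point, dtype=float), axis=1)
    # Delay each channel so that all channels are aligned to the farthest microphone.
    sample_delays = np.round((distances.max() - distances) / SPEED_OF_SOUND * sample_rate_hz).astype(int)
    aligned = np.zeros_like(signals)
    for ch, delay in enumerate(sample_delays):
        if delay > 0:
            aligned[ch, delay:] = signals[ch, :-delay]
        else:
            aligned[ch] = signals[ch]
    return aligned.mean(axis=0)   # the beamformed (listening-beam) output

if __name__ == "__main__":
    fs = 16000
    mics = [(0.2 * i, 0.0) for i in range(4)]              # four microphones 0.2 m apart
    audio = np.random.default_rng(0).normal(size=(4, fs))  # stand-in for captured audio
    beam = delay_and_sum(audio, mics, focus_point=(0.3, 2.0), sample_rate_hz=fs)
    print("beamformed output samples:", beam.shape)
```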
The Interface Controller 56 further comprises a Speaker Controller 84 which comprises one or more beam-steering algorithms (not shown) operable by the firmware in the Speaker Microcontroller Unit 24 of Figure 2, to establish one or more parameters for controlling the magnitude and phase of each of the analogue signals generated by the speakers of the phased speaker array system of Figure 2. In use, the magnitude and phase parameters are calculated to produce sound which is contained within a beam aimed at the Target Location. The Interface Controller 56 may further comprise an Antenna Controller (not shown) to permit communication with a customer through the customer’s own cell phone or other wireless device.
The Client Engagement Module 58 comprises a communicatively coupled Messaging Module 86 and Interface Configuration Module 88. The Interface Configuration Module 88 may include visual interface configuration rules which prescribe appearance attributes of a visual user interface to be displayed to a user. For example, the visual interface configuration rules may include the logo or a colour palette of a given vendor. Alternatively, the visual interface configuration rules may include appearance attributes of a human persona, henceforth known as an avatar, for communication with the customer. The Messaging Module 86 is adapted to retrieve from the Interface Configuration Module 88 the visual interface configuration rules and to transmit them to the Transaction Co-ordination Engine 52. The Transaction Co-ordination Engine 52 is configured to transmit the visual interface configuration rules to the Interface Controller 56 to cause the Screen Controller 80 to display a visual user interface on the screen system 4 at the Target Display Co-ordinates. The appearance of the said visual user interface is established according to the visual interface configuration rules. The positioning of the displayed visual interface within a pre-defined distance of the Target Location may be beneficial where the dimensions of the visual user interface are small compared to those of the screen system 4, to optimise the readability of the visual user interface for the user corresponding with the Target Location.
The Interface Configuration Module 88 may also include narrative rules (not shown) pre-configured by the system operators, wherein the or each narrative rule (not shown) establishes a narrative framework for communications with a customer. The relevant narrative framework depends on the specific use of the user interface system 2. For example, in a drive-through restaurant scenario, the narrative framework may include a greeting, presentation of a menu, discussion of special offers, receiving an order, advising on waiting time for the order, advising of cost and requesting payment, etc.
The Messaging Module 86 is adapted to retrieve from the Interface Configuration Module 88 the or each narrative rule (not shown) and using the same, the Messaging Module 86 is adapted to co-ordinate a dialogue with a customer. To this end, for example, the Messaging Module 86 may be adapted to transmit to the Transaction Co-ordination Engine 52 an audio triggering signal. The audio triggering signal may be forwarded to the Speaker Controller 84 to cause one or more speakers (not shown) at a location within a pre-defined distance of the Target Location to play an audio message to the corresponding customer. Alternatively or additionally, the Transaction Co-ordination Engine 52 may be adapted to transmit the audio triggering signal to an Antenna Controller (not shown) of the Interface Controller 56, to cause a customer’s own cell phone or other wireless device to play the audio message to the customer. In either case, the content of the audio message and subsequent audio communications with the customer is specified by one or more of the narrative rules from the Interface Configuration Module 88.
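Purely as an illustrative sketch (the rule structure, keys, prompts and asset paths below are invented for explanation and are not prescribed by the disclosure), the visual interface configuration rules and narrative rules might be represented as simple data structures:
```python
# Hypothetical visual interface configuration rules for one vendor.
VISUAL_INTERFACE_RULES = {
    "vendor": "Example Drive-Through",
    "logo_path": "assets/example_logo.png",        # placeholder asset path
    "colour_palette": {"background": "#1d3557", "accent": "#e63946", "text": "#f1faee"},
    "avatar": {"enabled": True, "persona": "friendly_server"},
}

# Hypothetical narrative rules: an ordered framework for the dialogue with a customer.
NARRATIVE_RULES = [
    {"step": "greeting",        "prompt": "Welcome! What can we get for you today?"},
    {"step": "present_menu",    "prompt": "Here is our menu."},
    {"step": "special_offers",  "prompt": "Today's special offers are shown on the left."},
    {"step": "take_order",      "prompt": "Please say or touch the items you would like."},
    {"step": "advise_wait",     "prompt": "Your order will be ready in about {wait_minutes} minutes."},
    {"step": "request_payment", "prompt": "Your total is {total}. Please present your card."},
]

def next_prompt(current_step, **context):
    """Return the prompt for the step following current_step, filling in any context values."""
    steps = [rule["step"] for rule in NARRATIVE_RULES]
    nxt = NARRATIVE_RULES[min(steps.index(current_step) + 1, len(NARRATIVE_RULES) - 1)]
    return nxt["prompt"].format(**context)

if __name__ == "__main__":
    print(next_prompt("advise_wait", total="12.40"))
```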
Using the example of a drive-through restaurant, the Client Engagement Module 58 may comprise a Menu Module 89 which details all the food products available from the restaurant. The Menu Module 89 may be communicatively coupled with the Messaging Module 86 to cause the contents of the menu to be displayed to a customer (not shown) on the screen system 4 or recited to the customer (not shown) through the speaker(s) (not shown) or through the customer’s own cell phone or other wireless device in accordance with the narrative rule(s) of the Interface Configuration Module 88. On receipt from the Client Engagement Module 58 of visual interface configuration rules or an audio triggering signal, the Transaction Co-ordination Engine 52 may be adapted to activate the Client Communications Receiver 60 and the Video Monitor 66, to operate the or each of the communications unit 8 and the sensor(s) to receive an order from the customer (not shown). The Client Communications Receiver 60 comprises a Touch Comms Receiver 90 and a Voice Comms Receiver 92. The Touch Comms Receiver 90 is coupled through the Transaction Co-ordination Engine 52 to the Touch Monitor 70 to detect the selection of a displayed item from a visual user interface by the detected touching of one or more of the touch screen member(s) at a location within a pre-defined distance of the displayed item.
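A minimal sketch of such touch-based selection is given below for illustration purposes only; it assumes the Touch Monitor 70 reports a touch as screen co-ordinates and that each displayed item has a known on-screen anchor, with the selection radius chosen arbitrarily.

```python
import math

def item_under_touch(touch_xy, displayed_items, max_distance=40.0):
    """Return the displayed item whose on-screen anchor lies within a
    pre-defined distance of the detected touch, or None.

    displayed_items: mapping of item name -> (x, y) anchor in screen pixels.
    max_distance:    assumed selection radius in pixels.
    """
    best_name, best_dist = None, max_distance
    for name, (x, y) in displayed_items.items():
        dist = math.hypot(touch_xy[0] - x, touch_xy[1] - y)
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name

# Example: a touch at (410, 630) selects the item anchored nearest to it.
menu_layout = {"cheeseburger": (400, 620), "fries": (400, 700), "cola": (620, 620)}
selected = item_under_touch((410, 630), menu_layout)  # -> "cheeseburger"
```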
The Voice Comms Receiver 92 is coupled through the Transaction Co-ordination Engine 52 to the Sound Monitor 68. The Sound Monitor 68 comprises a User Utterance Detection Unit 96 and a Background Noise Detection Unit 98. The Sound Monitor 68 is also coupled through the Transaction Co-ordination Engine 52 to the microphone controller 82. The User Utterance Detection Unit 96 is adapted to operate the microphone controller 82 and to use the resulting listening beam of the phased array of microphones of the sensor(s) 6 to detect utterances or sounds made by the user (not shown) at the Target Location. For brevity, these utterances or sounds will be referred to henceforth as User Sounds. The Background Noise Detection Unit 98 is adapted to activate the microphone controller 82 to operate the phased array of microphones of the sensor(s) 6 to sample the sound field at locations other than those encompassed by the listening beam. The sound detected in these samples represents background noise and will, for brevity, be referred to henceforth as Background Noise. The Sound Monitor 68 is adapted to process the User Sounds to extract or otherwise compensate for the Background Noise contamination thereof. The processing may comprise signal subtraction or deconvolution operations. The skilled person will understand that the above-mentioned processing operations are provided for illustration purposes only. Specifically, the user interface system of the preferred embodiment is in no way limited to these processing operations. Instead, the user interface system of the preferred embodiment is operable with any signal processing operations suitable for the reduction of ambient or background noise from a measured audio signal. For brevity, the result of these processing operations will be referred to henceforth as Filtered User Sounds. The Sound Monitor 68 is configured to transmit the Filtered User Sounds to the Transaction Co-ordination Engine 52 which in turn transmits the Filtered User Sounds to the Voice Comms Receiver 92.
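By way of illustration only, one of many suitable processing operations is magnitude spectral subtraction, sketched below; the frame length, windowing and single-frame noise estimate are simplifying assumptions and the routine is not intended as the definitive implementation of the Sound Monitor 68.

```python
import numpy as np

def spectral_subtraction(user_sounds, background_noise, frame=1024):
    """Minimal magnitude spectral subtraction: estimate the background
    spectrum from the off-beam samples and subtract it, frame by frame,
    from the on-beam User Sounds to give Filtered User Sounds.

    Assumes both inputs are 1-D float arrays at the same sample rate and
    that background_noise is at least one frame long.
    """
    window = np.hanning(frame)
    noise_spectrum = np.abs(np.fft.rfft(background_noise[:frame] * window))
    filtered = np.zeros_like(user_sounds, dtype=float)
    for start in range(0, len(user_sounds) - frame + 1, frame // 2):
        segment = user_sounds[start:start + frame] * window
        spectrum = np.fft.rfft(segment)
        # Subtract the noise magnitude, keep the original phase.
        magnitude = np.maximum(np.abs(spectrum) - noise_spectrum, 0.0)
        cleaned = np.fft.irfft(magnitude * np.exp(1j * np.angle(spectrum)), frame)
        filtered[start:start + frame] += cleaned  # 50% overlap-add
    return filtered
```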
The Voice Comms Receiver 92 is also coupled through the Transaction Co-ordination Engine 52 to the Antenna Controller (not shown) to receive audio signals from the customer’s own cell phone or other wireless device (by way of the antenna unit in the communications unit 8). For brevity, these audio signals will be referred to henceforth as User Device Sounds. The Voice Comms Receiver 92 comprises speech recognition and language processing algorithms adapted to recognize and comprehend utterances and instructions from the customer (not shown) in either or both of the Filtered User Sounds and the User Device Sounds. Examples of suitable speech recognition algorithms include hidden Markov modelling, dynamic time warping (DTW) based speech recognition methods and deep neural networks and denoising autoencoders. The skilled person will understand that the preferred embodiment is not limited to these speech recognition algorithms. On the contrary, these examples of algorithms are provided for illustration purposes only. In particular, the skilled person will understand that the preferred embodiment is operable with any speech recognition and language processing algorithm which permits the Voice Comms Receiver 92 to recognize and comprehend audible utterances and instructions from the customer.
The Client Communications Receiver 60 may also comprise a Gesture Comms Receiver 94.
The Gesture Comms Receiver 94 is coupled through the Transaction Co-ordination Engine 52 to the Video Monitor 66. The Gesture Comms Receiver 94 comprises gesture recognition algorithms adapted to recognize and comprehend gestures from the customer in the video footage captured by video camera members of the sensor(s) 6. Examples of suitable gesture recognition algorithms include skeletal-based algorithms and appearance-based algorithms.
The skilled person will understand that the preferred embodiment is not limited to these gesture recognition algorithms. On the contrary, these examples of algorithms are provided for illustration purposes only. In particular, the skilled person will understand that the preferred embodiment is operable with any gesture recognition algorithm which permits the Gesture Comms Receiver 94 to recognize gestural instructions from the customer (not shown).
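Purely for illustration, a skeletal-based approach may reduce, in its simplest form, to rules over detected body keypoints, as in the hypothetical sketch below; the keypoint format, joint names and thresholds are assumptions and presuppose a separate pose-estimation stage.

```python
def classify_gesture(keypoints):
    """Very small skeletal-rule sketch: distinguish a 'point' from a 'wave'
    gesture using wrist/elbow/shoulder keypoints of one detected person.

    keypoints: dict of joint name -> (x, y) image coordinates, as produced by
    a pose-estimation stage (format assumed for illustration).
    """
    wrist, elbow, shoulder = (keypoints[k] for k in
                              ("right_wrist", "right_elbow", "right_shoulder"))
    arm_raised = wrist[1] < shoulder[1]                 # image y grows downwards
    arm_extended = abs(wrist[0] - shoulder[0]) > 1.5 * abs(elbow[0] - shoulder[0])
    if arm_extended:
        return "point"
    if arm_raised:
        return "wave"
    return "none"
```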
Using the or each of the speech recognition and language processing algorithms and the gesture recognition algorithms, the Client Communications Receiver 60 is adapted to receive an order (e.g. food order in the drive-through restaurant example) from the customer (not shown). The Client Communications Receiver 60 is further adapted to communicate information regarding the customer’s order to the Transaction Co-ordination Engine 52. For brevity, the information regarding the customer’s order will be referred to henceforth as Customer Order Information.
In the event the user interface system of the present invention comprises a billing and payment functionality (as opposed to mere information retrieval and presentation), the Transaction Co-ordination Engine 52 is adapted, on receipt of Customer Order Information, to issue a payment triggering signal to the Billing/Payment Module 62. On receipt of the payment triggering signal, the Billing/Payment Module 62 is activated to calculate the total bill for the ordered items. The Billing/Payment Module 62 is further activated together with the Messaging Module 86 to operate:
• the communications unit 8 to advise the customer (not shown) of the total bill and to request the customer (not shown) to present their payment card (or one or more radio frequency or near field communication enabled payment devices (e.g. smart fobs, smart cards, cell phones or other wireless devices)) to the payment unit 12; and
• the payment unit 12 to receive payment from the customer (not shown) through their payment card or other radio-frequency or near field communication enabled payment device.
Specifically, on receipt of Customer Order Information by the Transaction Co-ordination Engine 52, it may be adapted to activate the Messaging Module 86 and the Screen Controller Module 80 to display on the screen system 4 at the Target Display Co-ordinates, an instruction to the customer regarding where to place or otherwise position their payment card (or other NFC enabled payment medium such as smart fobs, smart cards, cell phones or other wireless devices) to enable communication with the payment unit 12. In the event the payment unit 12 comprises a series of NFC reader antennas arranged along a longitudinal axis of the screen system 4, the Transaction Co-ordination Engine 52 may be adapted to receive from the Video Monitor 66 the location relative to the Screen System at which the customer placed or otherwise positioned their payment card (or other NFC enabled payment medium). In this way, the Transaction Co-ordination Engine 52 is configured to identify to which of the NFC reader antennas the customer presented their payment card (or smart fobs, smart cards, cell phones or other wireless devices). The PICC Monitor 72 is adapted to operate the identified NFC antenna-based payment unit 12, as shown in Figure 3, to detect the presence of the customer’s payment card (or other NFC enabled payment medium)/PICC within a predefined distance of the payment unit 12. On detection of the PICC, the PICC Monitor 72 is adapted to implement a secure communications protocol with the PICC to receive payment for the ordered items therefrom. On receipt of the payment, the PICC Monitor 72 is adapted to transmit a payment confirmation message through the multiplexer unit (not shown) to the Billing/Payment Module 62. To permit cross-referencing of the received payment with the customer order, and depending on the configuration of the PICC Monitor 72 and the multiplexer unit (not shown), the payment confirmation message may include an identifier of the PICC Monitor 72 from which the payment confirmation message originated. In the event multiple customers are simultaneously using the user interface system to pay for placed orders, the Transaction Co-ordination Engine 52 may be adapted to activate the Messaging Module 86 and the Screen Controller Module 80 to display on the screen system 4 an instruction to a first customer regarding where to place or otherwise position their payment card together with instructions to the or each of the other customers to wait until receipt of a further message advising the or each of the other customers where to place their payment cards. In this way, the Transaction Co-ordination Engine 52 ensures that payment-related communications are conducted with only one payment card at a time by the Billing/Payment Module 62, such that the co-ordination of received payments with received orders is simplified.
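For illustration purposes only, the identification of the NFC reader antenna nearest to the observed card placement may be reduced to a nearest-neighbour selection along the screen's longitudinal axis, as in the sketch below; the antenna spacing and co-ordinate convention are assumptions for the example.

```python
def nearest_nfc_antenna(card_position_x, antenna_positions_x):
    """Identify the NFC reader antenna closest to where the Video Monitor
    observed the customer present their payment card.

    card_position_x:     card position along the screen's long axis (metres).
    antenna_positions_x: antenna positions along the same axis (metres).
    Returns the index of the antenna that should be polled for the PICC.
    """
    return min(range(len(antenna_positions_x)),
               key=lambda i: abs(antenna_positions_x[i] - card_position_x))

# Example: antennas every 0.8 m; a card presented at 2.5 m maps to antenna 3.
antennas = [0.4 + 0.8 * i for i in range(6)]   # 0.4, 1.2, 2.0, 2.8, 3.6, 4.4
print(nearest_nfc_antenna(2.5, antennas))      # -> 3
```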
On receipt of confirmation of payment from the Billing/Payment Module 62 for a given customer order, the Transaction Co-ordination Engine 52 is adapted to transmit the Customer Order Information to the Backend Co-ordination Engine 64 which co-ordinates communications regarding received customer orders with a backend operational function (not shown). In addition to issuing Customer Order Information to the backend operational function (not shown), the Backend Co-ordination Engine 64 may be adapted to receive corresponding customer communications from the backend operational function (not shown); the said customer communications may include updates on waiting times or details of a pick-up zone from which the customer (not shown) may retrieve their order. The Backend Co-ordination Engine 64 is adapted to transmit these customer communications to the Transaction Co-ordination Engine 52 for forwarding to the Messaging Module 86.
In a further embodiment, the Client Engagement Module 58 may further comprise a customer recognition module (not shown) which is adapted to employ one or more optical character recognition (OCR) algorithms to detect and read the characters of a registration number plate of a customer’s vehicle from video footage captured by one or more video camera members of the sensor(s) 6. The Client Engagement Module 58 may further comprise a database (not shown) of registration number details of customer vehicles that were previously detected by the invention. The database may also include details of the customer (not shown) who previously ordered items from the corresponding vehicle. The customer recognition module (not shown) may be adapted to interrogate the database to compare the detected registration number details of an incoming customer vehicle with those in the database. In the event a match is found, indicating that the customer is a repeat customer, the Client Engagement Module 58 may be configured to adapt the narrative rules employed by the Messaging Module 86 to include the name of the customer, so that the customer is greeted by name by the Client Engagement Module 58.

Referring to Figure 5 in combination with Figure 1 and Figure 4, the screen system 4 of the user interface system 2 comprises a railing system 100 with one or more display screens 102 mounted thereon. The railing system 100 comprises a plurality of upright members 104 of substantially the same length, wherein the upright members 104 are disposed in a substantially co-linear spaced apart arrangement. A bottom end of at least some of the upright members 104 is mountable on a mounting plate 106. The railing system 100 further comprises an elongate crossbar member (not shown) for mounting on a top end of at least some of the upright members 104 to straddle the distance between a first peripheral upright member 104 and its opposing peripheral upright member 104. The crossbar member (not shown) comprises at least one central channel (not shown) formed therein, extending along the longitudinal axis of the crossbar member (not shown). The channel (not shown) is adapted to house a corresponding support member (not shown) protruding from a reverse face of a display screen 102. In use, the mounting plates 106 are fixed to the ground and the display screens 102 are mounted in series on the railing system 100 at substantially the same elevation relative to the ground, by the sliding of the support member (not shown) of each display screen 102 in turn into the channel (not shown) of the crossbar member (not shown).
While the present railing system 100 is shown as comprising a single crossbar member (not shown) adapted to support a single linear arrangement of display screens 102, the skilled person will understand that this configuration of display screens 102 is provided for example purposes only. In particular, the skilled person will understand that the present invention is not limited to this configuration of display screens 102. On the contrary, the present invention is operable with any configuration of display screens 102 sufficient to display visual user interfaces at a variety of elevations and positions to accommodate the diversity of vehicle dimensions that may be encountered in a drive-through facility and/or to accommodate the display requirements of the operators of the drive-through facility. For example, the present railing system 100 may comprise a plurality of vertically spaced apart crossbar members (not shown) arranged to support a two-dimensional grid-like arrangement of display screens 102, to enable the display of larger advertisements to passers-by, together with visual user interfaces to the occupants of customer vehicles in the drive-through restaurant facility.
The user interface system may further comprise a plurality of elongate camera support members 108. The camera support members 108 may be of greater length than the upright members 104 of the railing system 100. A bottom end of each camera support member 108 may be mounted on a mounting plate 110. In use, the mounting plates 110 are fixed to the ground at pre-defined distances from the outer edges of the collective assembly of display screens 102; and the camera support members 108 are arranged as uprights from the mounting plates 110. In use, a video camera (not shown) is mountable on the top end of a camera support member 108 to establish a good vantage point over at least some of the user interface system and the users thereof. Further video cameras (not shown) may be mounted in spaced apart arrangements on the top of the uppermost display screens 102. The further video cameras (not shown) may be disposed in a front facing arrangement extending from the front of the display screens 102, to thereby deliver a Field of View which embraces objects (or parts thereof) facing the front of the display screens 102. The number and spacing of such further video cameras (not shown) may be configured according to the requirements of the operator. However, preferably, the user interface system is provided with sufficient further video cameras (not shown) arranged to provide a substantially seamless view of the area in front of the display screens 102. For example, each such further video camera may be mounted in a substantially central position on an upper end of each display screen 102 and arranged so that their Field of View extends forward from a front of the display screen 102.
In one embodiment, a plurality of card reader devices 112 are mounted in a spaced apart arrangement on a bottom end of a front face of the display screens 102. In one embodiment, the card reader devices 112 may be arranged substantially equidistantly along the length of the railing system 100 or the length embraced by the collective assembly of display screens 102. In particular, the card reader devices 112 may be mounted at a central position along the length of a bottom end of a front face of each of the display screens 102 or the bottom display screens 102 in the event of a grid formation of the display screens 102. In another embodiment, the card reader devices 112 may be arranged to substantially hang from the bottom ends of the display screens 102 or the bottom display screens 102 in the event of a grid formation of the display screens 102. The card reader device 112 may comprise any one of a magnetic stripe payment card reader unit, a contactless card reader unit, a combined magnetic stripe payment card reader and contactless card reader unit and any radio-frequency or near field communication enabled reader devices (e.g. radio frequency tag reader or a near field tag reader capable of reading smart fobs, smart cards, cell phones or other wireless devices to receive payment therefrom). In another embodiment, the plurality of card reader devices 112 comprise a plurality of NFC reader antennas (as shown in Figure 3) 113 arranged in series along a longitudinal axis of the display screens 102. The NFC reader antennas 113 may be fixed to the front face or the reverse face of the display screens 102 at a position between the top and the bottom of the display screens 102.
In a further embodiment, a plurality of microphones 114 are mounted in a spaced apart arrangement on either or both of a top end and a bottom end of the display screens 102. In another embodiment, each microphone 114 is mounted at a central position along the length of a bottom end of a front face of each of the display screens 102 or the bottom display screens 102 in the event of a grid formation of the display screens 102. In another embodiment, each microphone 114 is mounted at a central position along the length of a top end of a front face of each of the display screens 102 or the top display screens 102 in the event of a grid formation of the display screens 102. In yet another embodiment, the microphones 114 are mounted on alternating top and bottom ends of adjacent display screens 102. In another embodiment, the microphones 114 are arranged in a spaced apart arrangement with a longitudinal distance of no more than 1 metre between successive microphones 114. The microphones 114 may be condenser microphones or shotgun microphones. The microphones 114 are provided with a windshield to protect against noise from wind sources. Preferably, the microphones have a hypercardioid polar pattern wherein the microphone is most sensitive to on-axis sounds (i.e. where the microphone is pointed), with null points at 110° and 250° and a rear lobe of sensitivity. Furthermore, in view of the long distances between successive microphones 114, the microphones 114 preferably have an X Connector, Locking Connector, Rubber Boot (XLR) output rather than a USB output. Similarly, at least some of the XLR microphones 114 are connected to an audio interface (not shown) capable of providing at least 48V of phantom power (necessary for the amplification of the microphone signal). The output of the audio interface is transmitted to a Digital Signal Processing (DSP) unit (not shown) which is adapted to implement at least some of a plurality of signal enhancement techniques to enhance the quality of a received audio signal corresponding with a detected utterance of a customer (not shown). Specifically, the signal enhancement techniques include noise filtering, noise suppression, noise gating, dynamic gain control, compression, equalisation and limitation. In a preferred embodiment, the plurality of microphones 114 comprises a phased microphone array.
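By way of illustration only, the sketch below shows a heavily simplified stand-in for part of such a DSP enhancement chain (noise gating, compression and limiting on a normalised signal); the thresholds, ratio and ceiling are arbitrary assumptions, and a real DSP unit would additionally apply smoothing, noise filtering and equalisation.

```python
import numpy as np

def noise_gate(x, threshold=0.02):
    """Zero out samples whose magnitude falls below the gate threshold."""
    y = x.copy()
    y[np.abs(y) < threshold] = 0.0
    return y

def compress(x, threshold=0.5, ratio=4.0):
    """Static compressor: reduce the level of samples above the threshold
    by the given ratio (no attack/release smoothing in this sketch)."""
    y = x.copy()
    over = np.abs(y) > threshold
    y[over] = np.sign(y[over]) * (threshold + (np.abs(y[over]) - threshold) / ratio)
    return y

def limit(x, ceiling=0.9):
    """Hard limiter so the enhanced signal never exceeds the ceiling."""
    return np.clip(x, -ceiling, ceiling)

def enhance(signal):
    """Minimal stand-in for the DSP unit's enhancement chain."""
    return limit(compress(noise_gate(np.asarray(signal, dtype=float))))
```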
In a further embodiment, a plurality of speakers (not shown) are mounted in a spaced apart arrangement on either or both of a top end and a bottom end of the display screens 102. In another embodiment, each speaker (not shown) is mounted at a central position along the length of a bottom end of a front face of each of the display screens 102 or the bottom display screens 102 in the event of a grid formation of the display screens 102. In another embodiment, each speaker (not shown) is mounted at a central position along the length of a top end of a front face of each of the display screens 102 or the top display screens 102 in the event of a grid formation of the display screens 102. In yet another embodiment, the speakers (not shown) are mounted on alternating top and bottom ends of adjacent display screens 102.
In another embodiment the speakers (not shown) are arranged in a spaced apart arrangement with a longitudinal distance of no more than 1 metre between successive speakers (not shown). In one embodiment, each speaker (not shown) is spaced apart from each of the microphones 114. In a preferred embodiment the plurality of speakers comprises a phased speaker array wherein the distance between the speakers and the number of speakers is configured to balance the requirements for control over higher and lower frequencies of emitted sound waves.
Referring to Figure 6 and Figure 7, in use, the railing system 200 of the user interface system is mounted on or alongside the external walls of a kiosk 220 of a drive-through restaurant (or other) facility. Specifically, the railing system 200 is mounted on or alongside the external walls that face a road 222 which at least partly encircles the kiosk 220. One or more display screens 202 are mounted on the railing system 200 and arranged so that the front face of the display screen(s) 202 face out towards the road 222. The skilled person will understand that the configuration of the road 222 described above and depicted in Figure 6 is provided for example purposes only. In particular, the road 222 may only extend along one external wall of the kiosk 220.
Referring to Figure 8, alternatively or additionally, the or each external wall and the road 222 may be curved. In this case, the railing system 200 may be arcuate or serpentine in shape and configured to follow the curved external wall. To achieve this, a balance must be achieved between the requirement for the radius of curvature of an arcuate/serpentine railing system 200 to be sufficiently small to accommodate the smallest radius of curvature of the external wall; and the physical limitations imposed by the dimensions and curvature, if any, of the display screens 202 mounted on the railing system 200.
In use, one or more customer vehicles 224a-224c are driven along the road 222 from an entry point 226 to an exit point 228 of the drive-through restaurant facility. Between the entry point 226 and the exit point 228, the road 222 may be provided with a plurality of spaced apart side-exit routes 230a, 230b, to one or more order pickup points (not shown), from which customers may retrieve their ordered items.
Combining Figures 6 and 7 with Figure 4, on entry of a customer vehicle 224a-224c to the road 222, the location of the customer vehicle 224a-224c relative to the or each terminus of the railing system 200 is detected by the Detector Unit 54 from video footage captured by one or more video cameras (not shown) mounted on upright camera support members 208 and/or by further video cameras installed at different locations in the drive-through restaurant facility. As the customer vehicle 224a-224c is driven along the road 222, the customer vehicle’s 224a-224c location (and corresponding movements thereof) is tracked by the Detector Unit 54 from video footage captured by one or more video cameras (not shown) mounted on the tops of the display screens 202. Referring to Figure 9 together with Figure 4, the Location Translator 78 translates the detected location of the customer vehicle 224a-224c into an identifier of the display screen 202 disposed closest to the customer vehicle 224a-224c. The Location Translator 78 further translates the detected location of the customer vehicle 224a-224c into Target Display Co-ordinates.
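A minimal sketch of such a location translation is given below for illustration purposes only; it assumes equal-width display screens laid end to end along the railing and a vehicle position expressed as a distance from the first terminus, neither of which is required by the present disclosure.

```python
def to_target_display(vehicle_x, screen_width, screens, interface_width):
    """Translate a vehicle position measured along the railing (metres from
    its first terminus) into (screen identifier, x offset on that screen).

    screen_width:    width of each display screen in metres (assumed equal).
    screens:         ordered list of display screen identifiers.
    interface_width: width of the visual user interface in metres, used to
                     keep it centred on the vehicle but inside the screen.
    """
    index = min(int(vehicle_x // screen_width), len(screens) - 1)
    local_x = vehicle_x - index * screen_width
    # Centre the interface on the vehicle, clamped to the screen edges.
    local_x = max(interface_width / 2, min(local_x, screen_width - interface_width / 2))
    return screens[index], local_x

# Example: a vehicle 7.3 m along a railing of 2 m wide screens.
screen_id, offset = to_target_display(7.3, 2.0, ["scr0", "scr1", "scr2", "scr3", "scr4"], 1.0)
# -> ("scr3", 1.3)
```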
The Messaging Module 86 and the Screen Controller 80 cause a visual user interface 242a-242c to be displayed at the Target Display Co-ordinates. Alternatively, the Messaging Module 86 and the Screen Controller 80 may cause a visual user interface 242a-242c to be moved to the Target Display Co-ordinates from a starting position on a display screen 202 mounted at either end of the railing system 200. Accordingly, a visual user interface 242a-242c is displayed to the occupants of each customer vehicle 224a-224c being driven on the road 222 and/or queued on the road 222. Furthermore, by refreshing the Target Display Co-ordinates of the visual user interface 242a-242c to match the tracked location of the customer vehicle 224a-224c as it moves along the road 222, the Screen Controller 80 and the Messaging Module 86 cause the visual user interface 242a-242c to effectively follow the movements of the customer vehicle 224a-224c. A visual user interface 242a-242c may include a menu from which the occupants of the closest customer vehicle 224a-224c may select food items. Alternatively, the visual user interface 242a-242c may detail special offers which the occupants of the closest customer vehicle 224a-224c are invited to select. In another embodiment, the visual user interface 242a-242c may be customised according to a known transaction history associated with a recognised customer vehicle or a recognised customer in a given customer vehicle 224a-224c.
The Messaging Module 86 and the Speaker Controller 84 may also activate one or more speakers (not shown) located within a pre-defined distance of the detected location of the customer vehicle 224a-224c to facilitate an audio message to the occupants of the corresponding customer vehicle 224a-224c. In a preferred embodiment, the Messaging Module 86 and the Speaker Controller 84 may activate and establish one or more parameters for controlling the magnitude and phase of each of the analogue signals generated by the speakers of the phased speaker array system of Figure 2. The magnitude and phase parameters are calculated to produce sound which is contained within a beam aimed at a given customer or a given customer vehicle. In this way, individual customers may receive different, and possibly customised, audio messages.
In the event the display screen 202 is a touchscreen, the occupants of a given customer vehicle 224a-224c may be invited to select items from the visual user interface 242a-242c by pressing a location on the closest display screen 202, the said location being within a predefined distance of a desired item displayed on the display screen 202. The touching of the display screen by the customer is detected by the Touch Monitor 70 in communication with touch sensor elements in the display screen 202. The Touch Monitor 70 is further configured to convert the detected location of the touching into a selection of an item from the visual user interface 242a-242c.
In another embodiment, the occupants of the customer vehicle 224a-224c may be invited to select an item from the visual user interface 242a-242c by speaking aloud a name or other identifier of the item. In a preferred embodiment, the microphones of the user interface system comprise a phased array of microphones. To avoid errors arising from mistaken detection of audio signals from other nearby customers/customer vehicles, the Microphone Controller 82 employs one or more beam-forming algorithms (not shown) to shape a listening beam from the phased array of microphones, wherein the listening beam is centred at a location within a pre-defined distance of the customer vehicle 224a-224c. Accordingly, the listening beam is optimally configured to detect sounds made by the occupants of the customer vehicle 224a-224c. These sounds are referred to as User Sounds. The Background Noise Detection Unit 98 operates the phased array of microphones to sample the sound field at locations other than those encompassed by the listening beam. The sounds detected at these locations are referred to as Background Noise. The Sound Monitor 68 processes the User Sounds to extract or otherwise compensate for the Background Noise contamination thereof. The result of these processing operations is Filtered User Sounds.
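For illustration purposes only, a listening beam of this kind may be formed by time-aligning the microphone channels towards the focus point and summing them (delay-and-sum beam-forming), as in the sketch below; the integer-sample delays and two-dimensional geometry are simplifying assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def delay_and_sum(mic_signals, mic_positions, focus_point, sample_rate):
    """Form a listening beam focused on a point near the customer vehicle by
    delaying each microphone channel so sound from that point adds in phase.

    mic_signals:   (N, S) array of N synchronised microphone channels.
    mic_positions: (N, 2) array of microphone positions in metres.
    focus_point:   (2,) point within a pre-defined distance of the vehicle.
    """
    positions = np.asarray(mic_positions, dtype=float)
    distances = np.linalg.norm(positions - np.asarray(focus_point, dtype=float), axis=1)
    # Integer-sample delays that time-align the focus point across channels.
    delays = np.round((distances.max() - distances) * sample_rate / SPEED_OF_SOUND).astype(int)
    n_channels, n_samples = mic_signals.shape
    aligned = np.zeros((n_channels, n_samples))
    for ch in range(n_channels):
        d = delays[ch]
        aligned[ch, d:] = mic_signals[ch, :n_samples - d] if d else mic_signals[ch]
    return aligned.mean(axis=0)  # the beamformed User Sounds estimate
```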
In another embodiment, to avoid errors arising from mistaken detection of audio signals from other nearby customers/customer vehicles, a microphone 214 located closest to the detected location of the customer vehicle 224a-224c is activated to receive audio signals corresponding with the customer’s utterances. For brevity, the microphone 214 located closest to the detected location of the customer vehicle 224a-224c will be referred to as the Target Microphone Position. To further reduce the risk of crossed communications, microphones 214 located on either side of the Target Microphone Position and within a pre-defined distance of the Target Microphone Position are deactivated. In addition, one or more features based on the time and frequency domain characteristics of the detected audio signals are examined to enhance the quality of the signal(s) received from the microphone at the Target Microphone Position. The examined features are independent of the signal amplitude. Thus, the examined features are unaffected by variations in the diction and loudness of individual customers and by differences in individual microphone setups.
Alternatively, the audio signals may be received from the customer’s own cell phone or other wireless device (by way of the antenna unit in the communications unit 8). Whether received from one or more microphones 214 mounted on the display screen(s) 202 or from the customer’s own cell phone or other wireless device, the received audio signals are processed by speech recognition and language processing algorithms 46 to recognize selections made by the customer and/or other instructions from the customer.
In yet another embodiment, the occupants of the customer vehicle 224a-224c may be invited to select an item from the visual user interface 242a-242c by making appropriate gestures towards the displayed required items. In this case, video footage captured by video camera(s) mounted on the upright camera support members 208 or on the top edges of the display screens 202 is processed using gesture recognition algorithms to detect and recognize gestures from the customer (not shown) and thereby receive the customer’s order.
On receipt of the customer’s selection, the Billing/Payment Module 62 calculates the bill for the selected food items and requests payment of the bill from the customer (not shown). In particular, the Messaging Module 86 and the Screen Controller Module 80 are activated to display on the display screens 202 the bill total together with an instruction to the customer regarding where to place or otherwise position their payment card (or other NFC enabled payment medium such as smart fobs, smart cards, cell phones or other wireless devices) to make payment of the bill. In the event the user interface system comprises a series of NFC reader antennas arranged along a longitudinal axis of the display screens 202, the Transaction Co-ordination Engine 52 receives from the Video Monitor 66 the location at which the customer placed or otherwise positioned their payment card (or other NFC enabled payment medium). In this way, the Transaction Co-ordination Engine 52 identifies to which of the NFC reader antennas the customer presented their payment card (or other NFC enabled payment medium). The PICC Monitor 72 operates the identified NFC antenna-based payment unit, as shown in Figure 3, to detect the presence of the customer’s payment card (or other NFC enabled payment medium)/PICC within a predefined distance of the payment unit. On detection of the PICC, the PICC Monitor 72 implements a secure communications protocol with the PICC to receive payment for the ordered items. On receipt of the payment, the PICC Monitor 72 transmits a payment confirmation message to the Billing/Payment Module 62. To permit cross-referencing of the received payment with the customer order, and depending on the configuration of the PICC Monitor 72 and the multiplexer unit (not shown), the payment confirmation message may include an identifier of the PICC Monitor 72 from which the payment confirmation message originated.
In another embodiment, the Billing/Payment Module 62 displays a QR code to a customer on the visual user interface 242a-242c. The QR code comprises an embedded link to the Billing/Payment Module 62 which may be accessed by the customer’s wireless device to enable payment to be made from the wireless device.
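A minimal sketch of this QR-based hand-off is given below for illustration purposes only, using the third-party Python "qrcode" package (with Pillow) to render the code; the payment URL, order identifier and amount shown are invented placeholders and do not form part of the disclosure.

```python
import qrcode

def payment_qr_image(order_id, amount_cents, base_url="https://pay.example.com/bill"):
    """Build a QR code image embedding a link to the billing endpoint.

    base_url, the query parameters and their format are assumptions for the
    example; the code merely needs to embed a link reachable by the
    customer's wireless device.
    """
    link = f"{base_url}?order={order_id}&amount={amount_cents}"
    return qrcode.make(link)  # returns a PIL image of the QR code

image = payment_qr_image("A1037", 1250)
image.save("order_A1037_payment_qr.png")  # rendered within the visual user interface
```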
On receipt of confirmed payment, the Backend Co-ordination Engine 64 transmits the received customer order to a back-end operation (not shown) with instructions to fulfil the customer order. The visual user interface 242a-242c may also be activated to provide directions to an order pickup point (not shown) at which the customer may collect their order. For example, the driver may be directed to veer the customer vehicle 224a-224c away from the road (222 in Figure 6) onto the nearest appropriate side road (230a, 230b in Figure 6) to reach the order pickup point (not shown). Alternatively, or additionally, the visual user interface 242a-242c may also be activated to display advertisements, promotion messages or entertaining videos to the customers as they wait for their order to be completed.
Alternatively, or additionally, the visual user interface 242a-242c may be used for market research and/or product testing purposes. For example, the visual user interface 242a-242c may be used to demonstrate product samples to a waiting customer and thereafter conduct a brief survey on the product sample just demonstrated. This enables the real time collection and analysis of the results of customer surveys, to support detailed demographic and geographic variables in assessing the likelihood of a trial product’s future success. Indeed, the user interface system enables a retailer to work with its consumer product goods partners to modify how, when and why promotional and other marketing tactics and strategies are deployed.
On departure of a customer vehicle 224a-224c from a queue alongside the railing system 200, the visual user interface 242a-242c associated with the customer vehicle 224a-224c will be deactivated. Furthermore, the visual user interfaces 242a-242c associated with the customer vehicles 224a-224c behind the departing customer vehicle 224a-224c in the queue will follow these customer vehicles 224a-224c as they move to close the gap formed by the departing customer vehicle 224a-224c. Additionally, a new visual user interface 242a-242c will be activated for a detected newly incoming customer vehicle 224a-224c.
Service to individual customers in a queue is no longer limited by the speed of taking the slowest or most complex order. Instead, customers with shorter/simpler orders may have their order taken quickly without a dependency on the speed of order-taking from customers further ahead of them in the queue. The user interface system of the present invention enables spatially independent and effectively cellular order-taking processes to be carried out.

Referring to Figure 10, in another use case scenario, the user interface system 300 comprises one or more display screens 302 mounted on a side wall located within a pre-defined distance of a side of a moving walkway (“travelator”) 304. In one embodiment, the display screens 302 may be mounted on either side of the travelator 304. On detecting the presence of a person 306 on the travelator 304, the user interface system 300 detects the location of the person 306 relative to the display screen(s) 302. The user interface system 300 translates the detected location of the person into co-ordinates on the display screen 302 closest to the person 306.
The co-ordinates represent a position closest to the person 306 on the display screen 302. The user interface system 300 displays a visual user interface 308 centered at the co-ordinates on the display screen 302. The visual user interface 308 may include information for visitors, advertisements or entertainment videos. The user interface system 300 tracks the movement of the person 306 on the travelator 304 and causes the position of the visual user interface 308 to be moved to follow the movements of the person 306. In this way, the visual user interface 308 moves along the longitudinal axis of the display screen 302 at a speed that substantially matches the speed of movement of the person 306. For example, in the event the person 306 stands still on the walkway 304, the visual user interface 308 moves along the longitudinal axis of the display screen 302 at substantially the same speed as the travelator 304 moves.
Referring to Figure 11, the previously discussed concept can also be used beside an inclined moving walkway or an escalator. Supermarkets and airports often have inclined walkways 404 to bring customers 406 from one floor to another. The wheels of a shopping cart 410 or suitcase lock into the walkway 404 and constrain the customer 406 from moving faster than the speed of the walkway 404. In a preferred embodiment, the user interface system 400 comprises one or more display screens 402 mounted on a side wall located within a pre-defined distance of a side of the escalator/moving walkway 404. In one embodiment, the display screens 402 may be mounted on either side of the escalator/moving walkway 404. On detecting the presence of a person 406 on the escalator/moving walkway 404, the user interface system 400 detects the location of the person 406 relative to the display screen(s) 402. The user interface system 400 translates the detected location of the person into co-ordinates on the display screen 402 closest to the person 406. The co-ordinates represent a position closest to the person 406 on the display screen 402. The user interface system 400 displays a visual user interface 408 centered at the co-ordinates on the display screen 402. The visual user interface 408 may include an entertaining video or a peaceful image to distract the person 406 from the annoyance of their constrained movement on the escalator/moving walkway 404. Alternatively, or additionally, the visual user interface 408 may include advertising promotions or informational videos. The user interface system 400 tracks the movement of the person 406 on the escalator/moving walkway 404 and causes the position of the visual user interface 408 to be moved to follow the movements of the person 406. In this way, the visual user interface 408 moves along the longitudinal axis of the display screen(s) 402 at a speed that substantially matches the speed of movement of the escalator/moving walkway 404. For example, in the event the customer 406 stands still on the escalator/moving walkway 404, the visual user interface 408 moves along the longitudinal axis of the display screen 402 at substantially the same speed as the escalator/moving walkway 404 moves.
Referring to Figure 12, in a first wall-based use case scenario, the user interface system 450 comprises one or more display screens 452 mounted in sequence on a wall (not shown) parallel to which one or more people are allowed to queue, wherein an interaction region of the user interface system 450 is formed in an area in front of the wall and display screens 452 where people are allowed to queue. In use, on detecting the approach of a person 454 in a queue to one or more of the display screen(s) 452, the user interface system 450 detects the location of the person 454 relative to the or each of the display screen(s) 452. The user interface system 450 translates the detected location of the person 454 into co-ordinates on the display screen 452 closest to the person 454. The co-ordinates represent a position closest to the person 454 on the display screen 452. The user interface system 450 displays a visual user interface 456 centered at the co-ordinates on the display screen 452. The user interface system 450 may also be used in areas where people are allowed to wait, but not necessarily in an ordered queue.
The visual user interface 456 may include information, advertisements or an interface to a booking facility. For example, the visual user interface 456 may include an interface to a concert ticket booking system, a movie ticket booking system for use in a cinema, a seat ticket booking system in a sports venue, or a flight or a hotel room booking system or the like. The user interface system 450 tracks the movement of the person 454 as they progress in the queue. The user interface system 450 causes the position of the visual user interface 456 to be moved to follow the movements of the person 454. In this way, the visual user interface 456 moves along the longitudinal axis of the display screens 452 at a speed that substantially matches the speed of movement of the person 454 in the queue. Thus, the person may purchase a ticket for a concert, movie, sports event, or a flight or a hotel room while standing in a queue.
Referring to Figure 13, in a second wall-based use case scenario, the user interface system 460 comprises one or more display screens 462 mounted in sequence on a wall (not shown). One or more queues 464a, 464b, 464c of one or more people 466 are allowed to form in a substantially perpendicular arrangement relative to the wall (not shown). An interaction region of the user interface system 460 is formed in an area in front of the wall and display screens 462 where people are allowed to queue.
On detecting the approach of a person 466 to one or more of the display screen(s) 462, the user interface system 460 detects the location of a person 466 relative to the or each of the display screen(s) 462. More specifically, bearing in mind that people are of different heights and may adopt different postures (for example, stooping, stretching or bending), the user interface system 460 detects the location of the person’s face relative to the display screen(s) 462.
The user interface system 460 translates the detected location of the person’s face into co-ordinates on the display screen 462 closest to the person 466. The co-ordinates represent a position closest to the person’s face on the display screen 462. The user interface system 460 displays a visual user interface 468 centered at the co-ordinates on the display screen 462.
In this way, the displayed location of the visual user interface 468 is adaptable to either or both of the height and posture of a person 466 using the user interface system 460. The user interface system 460 tracks the movement of the person 466 in the event they change posture (e.g. bending down to pick up something or to look in a bag, etc.). The user interface system 460 causes the position of the visual user interface 468 to be moved to follow the movements of the person 466. In this way, the visual user interface 468 moves along either or both of the vertical axis and the longitudinal axis of the display screens 462 at a speed that substantially matches the speed of movement of the person 466.
The visual user interface 468 may include information, advertisements or an interface to a booking facility. For example, the visual user interface 468 may include an interface to a concert ticket booking system, a movie ticket booking system for use in a cinema, a seat ticket booking system in a sports venue, or a flight or a hotel room booking system or the like. Thus, the person 466 may purchase a ticket for a concert, movie, sports event, or a flight or a hotel room while changing their posture and moving in front of the display screens 462.
This promotes a more natural interaction of a person 466 with the user interface system 460 in situations where the person’s attention is distracted by other or accompanying people, or the need to find something in a bag, or the need of a person to direct their gaze towards a mobile/wireless device and the like. On completion of the relevant transaction and the departure of the person 466 from the queue 464a, 464b, 464c at the display screen(s) 462, the user interface system 460 detects the next person 466 in the queue 464a, 464b, 464c as they approach the display screen(s) 462 and adjusts the displayed location of the visual user interface 468 according to either or both of the height and posture of the said next person 466.
Referring to Figure 14, a visual interface method 500 as implemented in a drive-through restaurant use case scenario comprises the steps of:
• detecting 510 from received video footage the entry of a customer vehicle into a service lane of the drive-through restaurant facility;
• detecting 520 from the received video footage the location of the customer vehicle and characterising features thereof;
• using the detected characterising features to classify the customer vehicle;
• using the detected location of the customer vehicle and the classification thereof to establish an estimated location of a customer in the customer vehicle;
• identifying 530 a display screen located closest to the customer;
• determining a location closest to the customer on the identified display screen;
• displaying 540 a visual user interface at the determined location;
• issuing 550 a greeting to the customer through the displayed visual user interface;
• detecting 560 movements of the customer vehicle in the service lane;
• moving 570 the location of the displayed visual user interface to follow the detected movements of the customer vehicle;
• displaying a menu to the customer;
• receiving 580 an order from the customer;
• requesting 590 payment from the customer for the order;
• sending 600 the order to a back-end operation for fulfilment thereof on receipt of payment from the customer;
• directing 610 the customer to a pick-up point to retrieve their completed order; and
• deactivating the visual user interface.
Although detailed embodiments of the invention have been presented above, it will be appreciated that, in its simplest form, the user interface system 702, as shown in Figure 15, may comprise one or more output devices 704 extending or spaced along, upon or within an interaction region, a sensor 706 configured to provide a signal from which a position of a user within the interaction region can be determined, and a processor 708 configured to receive the signal from the sensor, identify a target location within the interaction region based on the determined position of the user, and provide an output from the one or more output devices at the target location.
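Purely by way of illustration, this simplest form may be sketched as the following Python skeleton; the class names, the duck-typed sensor interface and the nearest-device selection rule are assumptions made for the example, not a definitive implementation of the processor 708.

```python
from dataclasses import dataclass

@dataclass
class OutputDevice:
    device_id: str
    position_x: float             # location along the interaction region (m)

    def emit(self, content):      # e.g. draw a visual user interface / play audio
        print(f"[{self.device_id} @ {self.position_x} m] {content}")

@dataclass
class UserInterfaceSystem:
    sensor: object                # must provide read() -> {"position": (x, y)} or None
    output_devices: list          # OutputDevice instances spaced along the region

    def step(self, content):
        """One pass of the simplest-form loop: sense, locate the user,
        identify a target location, and provide the output there."""
        signal = self.sensor.read()
        if signal is None:
            return None
        x, _y = signal["position"]        # detection algorithms assumed upstream
        target = min(self.output_devices, key=lambda d: abs(d.position_x - x))
        target.emit(content)
        return target.device_id
```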
For example, the sensor 706 may comprise one or more video cameras which have the interaction region in their field of view, as described previously. Alternatively, or in addition, other types of sensors may be used to provide an output which is indicative of a position of a user within the interaction region or from which this can be determined. As described previously, the user may be located within a vehicle and the position of the vehicle may be used to determine the target location. The processor 708 may use suitable detection algorithms, as described previously, to identify the location of the user from the sensor signal.
The one or more output devices 704 may comprise a display screen extending along, within or upon the interaction region, such as the display screen 102 described previously. As described previously, the display screen may be formed by a plurality of display screen units spaced along, within or upon the interaction region. The target location may be a location on the display screen at which a visual output, such as a graphical user interface, is displayed. In particular, the target location may be expressed as coordinates on the display screen and/or a reference address of a specific display screen unit.
Alternatively, or in addition, the one or more output devices 704 may comprise a plurality of loudspeakers spaced along, within or upon at least part of the interaction region. The processor 708 may be configured to provide a sound output from a selected one of the plurality of loudspeakers which is closest to the target location. Alternatively, the processor may be configured to operate the plurality of loudspeakers as a phased array so as to provide a directional sound output (via beam-steering) from the plurality of loudspeakers which is directed towards the target location. In either arrangement, it is intended that the sound output is received predominantly at the target location only and not by other users at different locations. Accordingly, this allows multiple users to interact with the user interface system 702 at the same time without confusion.
The loudspeakers may be communicatively coupled to a microphone which allows an operator to speak directly to the user at the target location. Alternatively, or in addition, the loudspeakers may project pre-recorded or computer-generated sounds to the user at the target location.
The output devices 704 allow an output, such as a visual and/or audio output, to be directed to the specific location of the user as determined by the sensor 706. Accordingly, the user does not need to be located at a predefined position within the interaction region, such as at a service window or intercom point.
The user interface system 702 may also comprise one or more input devices 710 which are communicatively coupled to the processor 708. For example, the one or more input devices may comprise a plurality of microphones. The user may provide an input to the user interface system 702 via one or more of the microphones. For example, utterances from the user may be captured by the microphone located closest to the user’s position. Alternatively, the plurality of microphones may be operated as a phased array. In this arrangement, the processor may use beam-steering techniques to effectively “point” the microphones towards the target location, thereby improving the signal-to-noise ratio of the audio signal from that user.
Alternatively, or in addition, the one or more input devices may comprise one or more touch panels. For example, the touch panels may be integrated with the display screen described previously. The touch panels may allow a user to provide inputs for an order, such as selecting menu items displayed on the display screen.
Alternatively, or in addition, the one or more input devices may comprise one or more payment devices, such as card reader devices 112 described previously, which allow the user to effect payment for their order.
It will be appreciated that other input and/or output devices may be provided, such as those described previously.
The processor may be configured to track the user as they move within the interaction region and to continually or periodically update the target location so as to provide an output from the one or more output devices which moves with the user within the interaction region. For example, audio outputs from the loudspeakers and visual outputs on the display screen may be coordinated to follow the current position of the user. In this way, an interaction between the user and the user interface system can be maintained regardless of the movement of the user.
In some examples, the target location may only be refreshed and the output from the one or more output devices moved to a new position when the user is stationary. In particular, where the user is driving a vehicle, the output may be paused while the vehicle is moving in order to avoid distracting the user and may recommence when the vehicle comes to a stop.
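For illustration purposes only, this refresh-when-stationary behaviour may be expressed as a small update rule such as the sketch below, where the speed threshold is an arbitrary assumed value.

```python
def updated_target(previous_target, tracked_position, speed, speed_threshold=0.3):
    """Refresh the target location only while the user (or their vehicle) is
    effectively stationary: the output stays put while the user is moving and
    relocates to the tracked position once they stop.

    speed:           current speed estimate from the tracker (m/s).
    speed_threshold: assumed value below which the user counts as stationary.
    """
    if speed < speed_threshold:
        return tracked_position    # recommence/relocate the output here
    return previous_target         # pause: keep the output where it was

# Example: while driving at 2 m/s the target is held; once stopped it updates.
target = updated_target(previous_target=(3.0, 1.2), tracked_position=(5.5, 1.2), speed=2.0)
```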
Figure 16 shows a flowchart of a user interface method 802 according to an embodiment of the invention which may be implemented using the user interface system 702 described previously. At 804, the method determines a position of a user within an interaction region using a sensor, such as sensor 706. The sensor may provide a raw signal which is used by a processor to determine the position of the user using, for example, suitable detection algorithms. At 806, a target location within the interaction region is identified based on the determined position of the user using the processor 708, for example. Then at 808, one or more output devices which extend or are spaced along, within or upon the interaction region are controlled so as to provide an output to the user at the target location.
As described herein, the user interface system of the present invention may be adapted to display visual information and/or play audio information to (and potentially to receive information/requests and/or payment from) each of one or more individuals arranged at potentially different locations in a fixed or moving queued arrangement. The interface is adapted to visually present the information at the location (or moving location) of an individual on a visual display medium such as a display screen disposed within a pre-defined distance of the individual. Alternatively or additionally, the interface is adapted to play the audio information or messages at the location (or moving location) of an individual from one or more members of a speaker array disposed within a pre-defined distance of the individual. The interface may be further adapted to present information to an individual at a point ahead of their current location(s) along their direction of travel, to lead the individual to a next position. Accordingly, the interface is adapted to deliver real-time vertical and horizontal movements of the presented information to accommodate variations in the heights and movement trajectories of different users. The content provided to the user may comprise personalised and/or interactive elements.
The user interface system of the present invention may be used to support automatic order taking and payment receipt from one or more customers in a drive-through environment. The user interface system is adaptive to the movements and dimensions of customer vehicles to provide a more physically effective engagement with the occupants thereof. Further, the user interface system supports substantially simultaneous order-taking from a plurality of customers in a sequential linear queuing system, using a plurality of substantially independently-movable and simultaneously-operable visual user interfaces and audio user interfaces.
To avoid unnecessary duplication of effort and repetition of text in the specification, certain features are described in relation to only one or several aspects or embodiments of the invention. However, it is to be understood that, where it is technically possible, features described in relation to any aspect or embodiment of the invention may also be used with any other aspect or embodiment of the invention. The invention is not limited to the embodiments described herein, and may be modified or adapted without departing from the scope of the present invention.

Claims

1. A user interface system comprising: one or more output devices extending or spaced along, upon or within an interaction region; a sensor configured to provide a signal from which a position of a user within the interaction region can be determined; and a processor configured to receive the signal from the sensor, identify a target location within the interaction region based on the determined position of the user, and provide an output from the one or more output devices at the target location.
2. A user interface system according to claim 1, wherein the one or more output devices comprise a display screen extending along, within or upon the interaction region.
3. A user interface system according to claim 2, wherein the target location is a location on the display screen at which a visual user interface is displayed.
4. A user interface system according to claim 2 or 3, wherein the display screen is formed by a plurality of display screen units spaced along, within or upon the interaction region.
5. A user interface system according to any one of the preceding claims, wherein the one or more output devices comprise a plurality of loudspeakers spaced along, within or upon at least part of the interaction region.
6. A user interface system according to claim 5, wherein the processor is configured to provide a sound output from a selected one of the plurality of loudspeakers which is closest to the target location.
7. A user interface system according to claim 5 or 6, wherein the processor is configured to operate the plurality of loudspeakers as a phased array so as to provide a directional sound output from the plurality of loudspeakers which is directed towards the target location.
8. A user interface system according to any one of the preceding claims, further comprising one or more input devices which are communicatively coupled to the processor.
9. A user interface system according to claim 8, wherein the one or more input devices comprise a plurality of microphones.
10. A user interface system according to claim 9, wherein the processor is configured to operate the plurality of microphones as a phased array.
11. A user interface system according to claim 9 or 10, wherein the one or more input devices comprise one or more touch panels.
12. A user interface system according to any one of claims 9 to 11, wherein the one or more input devices comprise one or more payment devices.
13. A user interface system according to any one of the preceding claims, wherein the processor is configured to track the user as they move within the interaction region and to continually or periodically update the target location so as to provide an output from the one or more output devices which moves with the user within the interaction region.
14. A user interface system according to any one of the preceding claims, wherein the interaction region is a drive-through facility, an escalator, an inclined walkway, a travelator or a queuing or waiting area.
15. A user interface method comprising: determining a position of a user within an interaction region using a sensor; identifying a target location within the interaction region based on the determined position of the user; and controlling one or more output devices extending or spaced along, within or upon the interaction region so as to provide an output to the user at the target location.
PCT/EP2022/063637 2021-05-21 2022-05-19 A user interface system and method WO2022243471A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB2107296.2A GB2607569A (en) 2021-05-21 2021-05-21 A user interface system and method
GB2107296.2 2021-05-21

Publications (1)

Publication Number Publication Date
WO2022243471A1 true WO2022243471A1 (en) 2022-11-24

Family

ID=76637792

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2022/063637 WO2022243471A1 (en) 2021-05-21 2022-05-19 A user interface system and method

Country Status (2)

Country Link
GB (1) GB2607569A (en)
WO (1) WO2022243471A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150302867A1 (en) * 2014-04-17 2015-10-22 Arthur Charles Tomlin Conversation detection
WO2020066649A1 (en) * 2018-09-26 2020-04-02 Sony Corporation Information processing device, information processing method, program, and information processing system
US20200293260A1 (en) * 2018-03-27 2020-09-17 Panoscape Holdings, LLC Multi-Panel, Multi-Communication Video Wall and System and Method for Seamlessly Isolating One or More Panels for Individual User Interaction

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009125481A1 (en) * 2008-04-10 2009-10-15 Pioneer Corporation Screen display system and screen display program
JP5310456B2 (en) * 2009-10-05 2013-10-09 Sony Corporation Information processing apparatus, information processing method, and information processing system
US10025321B2 (en) * 2009-12-21 2018-07-17 Ncr Corporation Self-service system with user interface positioning
KR20130076131A (en) * 2011-12-28 2013-07-08 Uniwebs Co., Ltd. Apparatus for displaying interactive content
JP2015090547A (en) * 2013-11-05 2015-05-11 Sony Corporation Information input device, information input method, and computer program
JP2015127897A (en) * 2013-12-27 2015-07-09 Sony Corporation Display control device, display control system, display control method, and program
KR101971521B1 (en) * 2018-08-31 2019-04-23 Samsung Electronics Co., Ltd. Transparent display apparatus and method thereof

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150302867A1 (en) * 2014-04-17 2015-10-22 Arthur Charles Tomlin Conversation detection
US20200293260A1 (en) * 2018-03-27 2020-09-17 Panoscape Holdings, LLC Multi-Panel, Multi-Communication Video Wall and System and Method for Seamlessly Isolating One or More Panels for Individual User Interaction
WO2020066649A1 (en) * 2018-09-26 2020-04-02 Sony Corporation Information processing device, information processing method, program, and information processing system
US20210352427A1 (en) * 2018-09-26 2021-11-11 Sony Corporation Information processing device, information processing method, program, and information processing system

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
A. Bochkovskiy, C.-Y. Wang, H.-Y. M. Liao, arXiv:2004.10934, 2020
G. Huang, Z. Liu, L. van der Maaten, K. Q. Weinberger: "Densely Connected Convolutional Networks", 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 2017, pages 2261-2269
H. Adel, M. Souad, A. Alaqeeli, A. Hamid, International Journal of Digital Content Technology and its Applications, vol. 6, no. 20, 2012, pages 659-667
He K., Zhang X., Ren S., Sun J.: "Deep Residual Learning for Image Recognition", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016, pages 770-778, XP055536240, DOI: 10.1109/CVPR.2016.90
J. Deng, J. Guo, E. Ververas, I. Kotsia, S. Zafeiriou: "RetinaFace: Single-stage Dense Face Localisation in the Wild", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pages 5203-5212
M. Tan, R. Pang, Q. V. Le: "EfficientDet: Scalable and Efficient Object Detection", 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pages 10778-10787
T.-Y. Lin, P. Goyal, R. Girshick, K. He, P. Dollár: "Focal Loss for Dense Object Detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, 2020, pages 318-327

Also Published As

Publication number Publication date
GB202107296D0 (en) 2021-07-07
GB2607569A (en) 2022-12-14

Similar Documents

Publication Publication Date Title
US20200402130A1 (en) Method of using, apparatus, product, and system for a no touch point-of-sale self-checkout
JP7114752B2 (en) Method and apparatus for sound source location detection
US9761139B2 (en) Location based parking management system
US10373226B1 (en) Interactive parking facilities
US20140172557A1 (en) Interactive display system
US20190005545A1 (en) Method of operating moving robot
CA3027191A1 (en) Virtual reality shopping systems and methods
CN107206601A (en) Customer service robot and related systems and methods
US9865056B2 (en) Video based method and system for automated side-by-side drive thru load balancing
US20170213277A1 (en) Goods purchase apparatus and goods purchase system having the same
GB2567732A (en) Systems and methods for point-of-sale detection with image sensors for identifying new radio frequency identification (RFID) tag events within a vicinity of a
JP2012515993A (en) Method and system for deploying a graphical user interface
JP2017033401A (en) Customer information collection device, customer information collection system and customer information collection method
TWI277914B (en) Floor display system with interactive features
CN111712870B (en) Information processing device, mobile device, method, and program
JP2018005691A (en) Information processing system, information processing device and information processing method
US11635932B2 (en) Location responsive individualized visual interface method and system
JP3218348U (en) AI automatic door bidirectional network system and AI automatic door
WO2022243471A1 (en) A user interface system and method
US20130166302A1 (en) Methods and Apparatus for Audio Input for Customization of Digital Displays
WO2019234997A1 (en) Food product management system
WO2023034622A1 (en) Facial recognition for age verification in shopping environments
WO2021186751A1 (en) Digital auto-filing security system, method, and program
US20210383414A1 (en) Customer engagement system and method
US20240070756A1 (en) Vehicular automated communication system for wirelessly communicating with external system

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22731998

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 22731998

Country of ref document: EP

Kind code of ref document: A1