EP4128031A1 - "Touch-free interaction with a self-service station in a transit environment" - Google Patents

"Touch-free interaction with a self-service station in a transit environment"

Info

Publication number
EP4128031A1
Authority
EP
European Patent Office
Prior art keywords
display screen
processor
station
face
predetermined
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP21776569.2A
Other languages
German (de)
English (en)
Inventor
Aaron Jason Hornlimann
Nicolas Peter Osborne
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Elenium Automation Pty Ltd
Original Assignee
Elenium Automation Pty Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2020900882A0
Application filed by Elenium Automation Pty Ltd filed Critical Elenium Automation Pty Ltd
Publication of EP4128031A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012Head tracking input arrangements
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B64AIRCRAFT; AVIATION; COSMONAUTICS
    • B64FGROUND OR AIRCRAFT-CARRIER-DECK INSTALLATIONS SPECIALLY ADAPTED FOR USE IN CONNECTION WITH AIRCRAFT; DESIGNING, MANUFACTURING, ASSEMBLING, CLEANING, MAINTAINING OR REPAIRING AIRCRAFT, NOT OTHERWISE PROVIDED FOR; HANDLING, TRANSPORTING, TESTING OR INSPECTING AIRCRAFT COMPONENTS, NOT OTHERWISE PROVIDED FOR
    • B64F1/00Ground or aircraft-carrier-deck installations
    • B64F1/36Other airport installations
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/0093Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00 with means for monitoring data relating to the user, e.g. head-tracking, eye-tracking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/02Reservations, e.g. for tickets, services or events
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0281Customer communication at a business location, e.g. providing product or service information, consulting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/167Detection; Localisation; Normalisation using comparisons between temporally consecutive images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40Business processes related to the transportation industry
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/214Specialised server platform, e.g. server located in an airplane, hotel, hospital
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed

Definitions

  • Embodiments relate generally to systems, methods, and processes that may use touch-free interactions at self-service interaction stations.
  • Some embodiments relate to the use of such stations in transit environments, such as airports or other transport hubs.
  • Some embodiments relate to a self-service station for touch-free interaction in a transit environment, the station including: a display screen having a display direction; a video image recording device with a field of view in the display direction; a processor to control the display of display images on the display screen and to process live video images recorded by the video image recording device; a memory accessible to the processor and storing executable program code that, when executed by the processor, causes the processor to: determine a human face in the video images, determine whether the face is proximate the display screen, identify a tracking feature of the face in the video images and track movement of the tracking feature in the video images, cause the display screen to initiate an interaction process in response to determining that the face is proximate the display screen, and cause the display screen to display a cursor over content displayed on the display screen during the interaction process and to move the cursor in response to movement of the tracking feature to interact with the content.
  • the executable program code when executed by the processor, may cause the processor to: in response to determining that the cursor is positioned over one of at least one predefined area of the content for at least a predetermined dwell time, record a user selection in relation to a screen object associated with the one predefined area.
  • the predetermined dwell time may be between about 2 seconds and about 5 seconds.
  • the predetermined dwell time may be between about 2 seconds and about 4 seconds, or between about 3 seconds and about 4 seconds.
  • the executable program code when executed by the processor, may cause the processor to: cause the display screen to visually emphasise the screen object when the cursor is positioned over the one predefined area.
  • the executable program code when executed by the processor, may cause the processor to: cause the display screen to visibly and/or audibly indicate the recording of the user selection.
  • the one predefined area may cover between about 5% and about 35% of a visible area of the display screen.
  • the executable program code when executed by the processor, may cause the processor to: cause the display screen to show a progress indicator that is timed to progressively show elapsing of the predetermined dwell time.
  • the executable program code when executed by the processor, may cause the processor to: before causing the display screen to display the cursor over the content, cause the display screen to display a training task to show that face movement correlates to cursor movement.
  • the executable program code when executed by the processor, may cause the processor to: determine that the face is proximate the display screen when a number of pixels in a video image frame of the face exceeds a predetermined pixel count threshold.
  • the executable program code when executed by the processor, may cause the processor to: apply a machine learning model to determine the face in the video images.
  • the machine learning model may be a deep neural network model based on a single shot multibox detector (SSD) framework, for example.
  • the tracking feature may be identified by the processor by analysing an image frame of the face to determine a target pixel area of a fixed contrast window size, wherein the target pixel area has a greatest range of colour contrast among pixel areas of the fixed contrast window size in the image frame, wherein the target pixel area is used as the tracking feature.
  • the fixed contrast window size may be selected to correspond to a face area of between about 1 cm² and about 10 cm².
  • the executable program code when executed by the processor, causes the processor to: apply a scaling factor to movement of the tracking feature in the video images in order to cause the display screen to proportionately move the cursor over the content.
  • the executable program code when executed by the processor, may cause the processor to: determine whether movement of the tracking feature in the video images is less than a predetermined minimum number of pixels over a predetermined time period or is greater than a predetermined maximum number of pixels over the predetermined time period; and increase the scaling factor if the movement is less than the predetermined minimum number of pixels or decrease the scaling factor if the movement is greater than the predetermined maximum number of pixels.
  • the executable program code when executed by the processor, may cause the processor to: cause the display screen to statically display the cursor over the content for a first predetermined wait time after the tracking feature can no longer be identified in the video images.
  • the executable program code when executed by the processor, may cause the processor to: store the tracking feature; and for a second predetermined wait time after the tracking feature can no longer be identified in the video images, attempt to determine the face and the tracking feature in the video images.
  • the first predetermined wait time may be between about 5 seconds and about 10 seconds; the second predetermined wait time may be between about 5 seconds and about 10 seconds.
  • the station may further comprise a housing that houses the display screen, the video image recording device, the processor and the memory, wherein the housing holds the display screen and the video image recording device at a height above floor level sufficient to allow a face of a person to be generally within the field of view when the person stands between about 1 meter and about 2.5 meters in front of the station.
  • the display screen may be non-responsive to touch.
  • Some embodiments relate to a system for touch-free interaction in a transit environment, the system including: multiple ones of the station positioned to allow human interaction at one or more transit facilities; and a server in communication with each of the multiple stations to monitor operation of each of the multiple stations.
  • Some embodiments relate to a method of facilitating touch-free interaction in a transit environment, including: determining a human face in video images captured by a video image recording device positioned at a station; determining whether the face is proximate a display screen at the station, the display screen facing a same direction as field of view of the video image recording device; identifying a tracking feature of the face in the video images and tracking movement of the tracking feature in the video images; causing the display screen to initiate an interaction process in response to determining that the face is proximate the display screen; and causing the display screen to display a cursor over content displayed on the display screen during the interaction process and to move the cursor in response to movement of the tracking feature to interact with the content.
  • the method may further include: in response to determining that the cursor is positioned over one of at least one predefined area of the content for at least a predetermined dwell time, recording a user selection in relation to a screen object associated with the one predefined area.
  • the predetermined dwell time may be between about 2 seconds and about 5 seconds.
  • the method may further include: causing the display screen to visually emphasise the screen object when the cursor is positioned over the one predefined area.
  • the method may further include: causing the display screen to visibly and/or audibly indicate the recording of the user selection.
  • the method may further include: causing the display screen to show a progress indicator that is timed to progressively show elapsing of the predetermined dwell time.
  • the method may further include: before causing the display screen to display the cursor over the content, causing the display screen to display a training task to show that face movement correlates to cursor movement.
  • the method may further include: determining that the face is proximate the display screen when a number of pixels in a video image frame of the face exceeds a predetermined pixel count threshold.
  • the method may further include: applying a machine learning model to determine the face in the video images, wherein the machine learning model is a deep neural network model based on a single shot multibox detector (SSD) framework.
  • Identifying the tracking feature may include analysing an image frame of the face to determine a target pixel area of a fixed contrast window size, wherein the target pixel area has a greatest range of colour contrast among pixel areas of the fixed contrast window size in the image frame, wherein the target pixel area is used as the tracking feature.
  • the fixed contrast window size may be selected to correspond to a face area of between about 1 cm² and about 10 cm².
  • the method may further include: applying a scaling factor to movement of the tracking feature in the video images in order to cause the display screen to proportionately move the cursor over the content.
  • the method may further include: determining whether movement of the tracking feature in the video images is less than a predetermined minimum number of pixels over a predetermined time period or is greater than a predetermined maximum number of pixels over the predetermined time period; and increasing the scaling factor if the movement is less than the predetermined minimum number of pixels or decreasing the scaling factor if the movement is greater than the predetermined maximum number of pixels.
  • the method may further include: causing the display screen to statically display the cursor over the content for a first predetermined wait time after the tracking feature can no longer be identified in the video images.
  • the method may further include: storing the tracking feature; and for a second predetermined wait time after the tracking feature can no longer be identified in the video images, attempting to determine the face and the tracking feature in the video images.
  • the first predetermined wait time is between about 5 seconds and about 10 seconds; or the second predetermined wait time is between about 5 seconds and about 10 seconds.
  • the transit environment may be an airport or other transit hub, for example.
  • Figure 1 is a block diagram view of an interaction station system according to some embodiments.
  • Figure 2 is a block diagram view of an interaction station network according to some embodiments.
  • Figure 3 is a schematic illustration of a user at an interaction station according to some embodiments.
  • Figure 4 is an example field of view of a video recording device according to some embodiments.
  • Figure 5 is an example user interface display at an interaction station according to some embodiments.
  • Figure 6 is a flow chart of the operation of the interaction station according to some embodiments.
  • Figure 7 is a flow chart of further aspects of the operation of the interaction station according to some embodiments.
  • Figure 8 is a schematic block diagram of a computer system architecture that can be employed according to some embodiments.
  • Embodiments relate generally to systems, methods, and processes that use touch-free interactions at self-service interaction stations.
  • Some embodiments relate to the use of such stations in transit environments, such as airports or other transport hubs.
  • a self-service station 101 is described, together with systems of which the self-service station may form a part.
  • a self-service interaction station 101 is provided to facilitate users conducting interaction processes. Such processes may include check-in processes for impending travel, incoming or outgoing immigration or customs processes, travel or event reservation processes, or information querying processes, for example.
  • Each station 101 may be connected to a client device 145 and database 155 over a network 140.
  • Each station 101 is configured to identify faces of users 1.1 interacting with the station, through a video image recording device 125.
  • the station 101 is further configured to track, through image processing module 114, the head movement of a user 1.1, in order to interact with the user interface 120 to conduct an interaction process.
  • FIG. 1 is a block diagram of a system 100 for managing self-service interaction stations, comprising a station 101, a server 150, a database 155 accessible to the server 150, and at least one client device 145.
  • Station 101 is in communication with server 150 and client device 145 over a network 140.
  • station 101 comprises a controller 102.
  • the controller 102 comprises a processor 105 in communication with a memory 110 and arranged to retrieve data from the memory 110 and execute program code stored within the memory 110.
  • the components of station 101 may be housed in a housing 108.
  • Station 101 may be connected to network 140, and in communication with client device 145, server 150, and database 155.
  • Processor 105 may include more than one electronic processing device and additional processing circuitry.
  • processor 105 may include multiple processing chips, a digital signal processor (DSP), analog-to-digital or digital-to-analog conversion circuitry, or other circuitry or processing chips that have processing capability to perform the functions described herein.
  • Processor 105 may execute all processing functions described herein locally on the station 101 or may execute some processing functions locally and outsource other processing functions to another processing system, such as server 150.
  • the network 140 may comprise at least a portion of one or more networks having one or more nodes that transmit, receive, forward, generate, buffer, store, route, switch, process, or a combination thereof, etc. one or more messages, packets, signals, some combination thereof, or so forth.
  • the network 140 may include, for example, one or more of: a wireless network, a wired network, an internet, an intranet, a public network, a packet-switched network, a circuit-switched network, an ad hoc network, an infrastructure network, a public-switched telephone network (PSTN), a cable network, a cellular network, a satellite network, a fiber optic network, some combination thereof, or so forth.
  • Server 150 may comprise one or more computing devices configured to share data or resources among multiple network devices.
  • Server 150 may comprise a physical server, virtual server, or one or more physical or virtual servers in combination.
  • Database 155 may comprise a data store configured to store data from network devices over network 140.
  • Database 155 may comprise a virtual data store in a memory of a computing device, connected to network 140 by server 150.
  • Station 101 may further comprise a wireless communication device 115, user interface 120, video image recording device 125, and document printer 130.
  • Wireless communication device 115 may comprise a wireless Ethernet interface, SIM card module, Bluetooth connection, or other appropriate wireless adapter allowing wireless communication over network 140.
  • Wireless communication device 115 may be configured to facilitate communication with external devices such as client device 145 and server 150. In some embodiments, a wired communication means is used.
  • User interface 120 may comprise a reader device 121, and is configured to allow a user to initiate and interact with an interaction process hosted by the user interface 120.
  • the interaction process comprises a series of steps allowing a user 1.1 to provide identification details to the station 101 to retrieve booking details and/or undertake a check-in process.
  • the interaction process may comprise a series of steps wherein the user 1.1 provides booking details to the station 101 to identify themselves.
  • the interaction process may take between 1 and 20 minutes, for example.
  • Reader device 121 may comprise a barcode scanner, QR code scanner, magnetic strip reader, or other appropriate device arranged to allow a user to scan a document (such as a passport, boarding pass, ticket, or other identification document) at the station 101.
  • the data read by the reader device 121 may be stored in the memory 110, or transmitted to database 155 through the server 150 over a network 140.
  • the data read by the reader device 121 may trigger the processor 105 to send a request for information associated with the data over network 140 to the server 150.
  • the server 150 may then retrieve additional data associated with the identification data from database 155 and transmit the additional data over network 140 to the processor 105, as illustrated in the sketch below.
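To make the request/response exchange above concrete, the following is a minimal sketch of how a station might ask the server for data associated with a scanned document. The endpoint path, payload field and JSON response shape are hypothetical illustrations; the patent only specifies that a request for associated information is sent over network 140 and that the server retrieves additional data from database 155.

```python
# Hedged sketch only: endpoint, payload field and response format are hypothetical.
import requests

def fetch_booking_details(server_url: str, scanned_data: str, timeout: float = 5.0) -> dict:
    """POST the data read by reader device 121 and return any associated record."""
    response = requests.post(
        f"{server_url}/bookings/lookup",          # hypothetical endpoint on server 150
        json={"document_data": scanned_data},     # hypothetical payload field
        timeout=timeout,
    )
    response.raise_for_status()
    return response.json()                        # e.g. booking details from database 155
```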
  • the user interface 120 may further comprise a display screen 122, configured to allow a user to be shown content during the interaction process.
  • Such content may include a series of actionable items, buttons, information related to a booking, or other appropriate information, in order to conduct the interaction process.
  • Display screen 122 may also depict the location of a moveable cursor on the screen to enable the user to interact with the content.
  • Video image recording device 125 may comprise a camera, arranged to capture images of an area from which the user interface 120 is accessible.
  • the video image recording device 125 comprises a digital camera device.
  • the video image recording device 125 may have an image resolution of about 1280x720 pixels (known as 720p) or greater, for example.
  • the display resolution of the display screen 122 may be less than the image resolution of the video image recording device 125 since display resolution is not of particular importance. However, various suitable levels of resolution can be used for display screen 122.
  • Document printer 130 may comprise a printer configured to allow for printing user documents as a result of the interaction process.
  • the document printer 130 prints boarding passes, receipts, or other documentation related to the user or the interaction process.
  • the memory 110 may further comprise executable program code that defines a communication module 111, user interface (UI) module 112, and image processing module 114.
  • the memory 110 is arranged to store program code relating to the communication of data from memory 110 over the network 140.
  • Communication module 111 may comprise program code, which when executed by the processor 105, implements instructions related to initiating and operating the wireless communication device 115. When initiated by the communication module 111, the wireless communication device 115 may send or receive data over network 140. Communication module 111 may be configured to package and transmit data generated by the UI module 112 and/or retrieved from the memory 110 over network 140 to a client device 145, and/or to server 150. In some embodiments, this transmitted data includes an alert relating to a person identified by the video image recording device 125. In some embodiments, the alert relates to a status of the interaction process. In some embodiments, the alert relates to data from the image processing module 114 indicating that a face has been lost or remains undetected in the video feed from video image recording device 125.
  • UI module 112 may comprise program code, which when executed by the processor 105, implements instructions relating to the operation of user interface 120.
  • UI module 112 may be configured to implement instructions related to the position of a user’s 1.1 head within a field of view of the video image recording device 125.
  • the UI module 112 may receive instructions from the user interface 120 or image processing module 114 about advancing, reverting, or otherwise interacting with stages of an interaction process.
  • Image processing module 114 may comprise program code, which when executed by the processor 105, implements instructions relating to the operation of the video image recording device 125.
  • the video image recording device 125 may activate and transmit a stream of captured video frames to the processor 105.
  • image processing module 114 comprises program code, which when executed by the processor 105, implements instructions configured to allow the module 114 to identify pixel regions in images that correspond to faces within the image frame.
  • Image processing module 114 may further comprise an artificial-intelligence (AI) model 116, trained on facial image frames.
  • AI model 116 may be trained using supervised machine learning in order to accurately provide instructions to image processing module 114 to identify faces in image frames.
  • face images captured by video image recording device 125 are stored in memory 110, or image processing module 114 for verification by a human operator. The human verification of the stored image frames as containing a face or not may be used to generate the AI model 116.
  • AI model 116 may utilise machine learning algorithms to increase the accuracy of face detection by the image processing module 114.
  • Image processing module 114 may further comprise feature tracking module 117, comprising program code, which when executed by the processor 105, implements instructions related to identifying a feature on the face of a user in an image frame, and to track the movement of the feature within a field of view across multiple image frames.
  • AI model 116 may include a machine learning-based model that includes a deep neural net (DNN)-based model to detect the face area of a person.
  • the DNN may be based on a SSD framework (Single Shot MultiBox Detector), for example.
  • the SSD framework may use a reduced ResNet-10 model ("res10_300x300_ssd_iter_140000"), for example.
  • any analogous model used for face detection may be used in addition to the aforementioned model.
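As one concrete, non-authoritative illustration of this detection step, the publicly available res10_300x300_ssd_iter_140000 Caffe model can be run through OpenCV's DNN module as sketched below. The file paths and confidence threshold are placeholders/assumptions; the patent does not mandate a specific library.

```python
# Hedged sketch: SSD face detection via OpenCV's DNN module with the public
# res10_300x300_ssd_iter_140000 Caffe model; file paths and the confidence
# threshold are placeholders/assumptions.
import cv2
import numpy as np

net = cv2.dnn.readNetFromCaffe("deploy.prototxt",
                               "res10_300x300_ssd_iter_140000.caffemodel")

def detect_faces(frame, conf_threshold: float = 0.5):
    """Return a list of (x1, y1, x2, y2, confidence) face boxes found in the frame."""
    h, w = frame.shape[:2]
    # The SSD model expects a 300x300 BGR blob with per-channel mean subtraction.
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0,
                                 (300, 300), (104.0, 177.0, 123.0))
    net.setInput(blob)
    detections = net.forward()                        # shape: (1, 1, N, 7)
    boxes = []
    for i in range(detections.shape[2]):
        confidence = float(detections[0, 0, i, 2])
        if confidence >= conf_threshold:
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            x1, y1, x2, y2 = box.astype(int)
            boxes.append((x1, y1, x2, y2, confidence))
    return boxes
```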
  • the station 101 may operate in an AI model 116 training mode in order to generate an accurate model for automatic face detection, developed specific to an individual station 101.
  • the individual generation of AI model 116 can accommodate the particular location, positioning, angle, and lighting of the field of view 1.5 of a particular station 101.
  • the AI model 116 may be trained on a data set of images captured by the video image recording device 125, a pre-existing data set of images from other sources, or some combination thereof.
  • the AI model 116 may be generated by supervised learning, for example, through manual review of control images wherein the AI model 116 identifies a face or does not identify a face.
  • the AI model 116 may use images captured during operation of the video image recording device 125 to continually develop a more accurate model throughout its normal operation.
  • AI model 116 may use a single-stage dense face localisation image processing technique, such as the RetinaFace (“RetinaFace: Single-stage Dense Face Localisation in the Wild”, Deng et al, 4 May 2019) facial image processing model, to identify facial features in captured images.
  • the AI model 116 is required to consistently locate faces in images across a wide variety of contexts.
  • the AI model 116 is preferably trained on a dataset that includes multiple faces in an image, occluded faces, blurry images, and images where the colour and rotation make face detection more difficult.
  • the relatively high difficulty of the training dataset means that the trained AI model 116 is robust enough to correctly classify partially occluded faces (such as when a person is wearing a face mask) which is a likely and expected occurrence during normal usage of station 101.
  • FIG. 2 depicts a block diagram of a self-service station network 200 according to some embodiments.
  • the network 200 comprises an individual self-service station bank or array 210, a separately located self-service station bank or array 215, server 150, database 155, and client device array 220.
  • the individual self-service station array 210 may comprise at least one self-service station 101 individually connected to network 140.
  • the stations 101 of array 210 are located together at a single installation site, such as an airport check-in, or an airport customs or immigration area.
  • the stations 101 of array 210 may be separately located throughout a number of individual sites throughout an airport, or may be located at multiple installation sites, such as a series of airports.
  • the locations of installation of array 210 comprise self-service facilities including, but not limited to, self-service check-in kiosks, self-service bag drop, automated departure gate boarding gates, automated immigration entry or exit gates, airline lounge gates, or other appropriate self-service areas, for example.
  • the client device array 220 may comprise at least one client device 145 connected individually to network 140.
  • the array 220 comprises any combination of smartphones, tablet computing devices, personal computers, or other devices capable of sending instructions over network 140 and executing instructions from memory 147.
  • FIG. 3 depicts a diagram of a user 1.1 interacting with a self-service interaction station 101.
  • the self-service station 101 comprises a housing 108 that in some embodiments includes a solid upstanding cabinet 305 defining internal space to house the components of station 101 described herein.
  • the housing 108 houses the display screen 122, the video image recording device 125, the processor 105 and the memory 110.
  • the housing 108 holds the display screen 122 and the video image recording device 125 at a height above floor level (i.e. a bottom extent of the housing 108) sufficient to allow a face of a person to be generally within the field of view when the person stands between about 1 meter and about 2.5 meters in front of the station.
  • the display screen 122 may be held by the housing 108 so that the bottom edge of the display screen 122 is at a height of between about 1.3 and about 1.6 metres above floor level, for example.
  • the display screen 122 may have a top edge about 0.2 to about 0.5 metres above the bottom edge, for example.
  • a light-receiving aperture of the video image recording device 125 may be positioned at or slightly above the top edge of the display screen, for example.
  • user 1.1 is at least partially within the field of view 1.5 of the video image recording device 125.
  • the video image recording device 125 may be positioned to ensure the field of view 1.5 defines an area substantially facing the direction from which the user interface 120 may be accessed from a user 1.1.
  • the user 1.1 may be an airline passenger, airline or airport staff, or other individual at an airport requiring self- service interaction or check-in processes.
  • the user 1.1 may be a train, ship or other transport passenger, staff, or other individual requiring self-service interaction or check-in processes for transport purposes.
  • the user 1.1 may be an event participant, attendee at a secure facility or other person requiring self-service check-in processes.
  • the field of view 1.5 defines a horizontal range of approximately 1 meter either side of the anticipated position of a user 1.1 (standing at between about 1 metre and about 2.5 metres from the display screen) using the user interface 120. In some embodiments, the field of view 1.5 defines a vertical range of about 0.5 meters above and below the anticipated position (standing at between about 1 metre and about 2.5 metres from the display screen) of a user 1.1 using the interface 120.
  • the field of view 1.5 is substantially centred at an anticipated average height of an adult person who would be accessing the user interface 120.
  • the field of view 1.5 may extend in a horizontal and vertical area to cover other people close to the user 1.1. In some embodiments, other appropriate ranges may be defined.
  • the field of view 1.5 may be arranged to be substantially centred at the anticipated area of the upper portions of a user 1.1.
  • the upper portions of a user 1.1 are intended to include at least the user’s chest, neck, face, and head.
  • the field of view 1.5 may comprise an area aligned with the facing direction of the display screen 122.
  • the field of view 1.5 may be dynamically altered by the video image recording device 125 to be extended, shrunk, or laterally or vertically shifted in accordance with user-specified requirements.
  • the user specified requirements may be configurable by an operator to allow individual stations 101 to have an optimised field of view 1.5 depending on their installation position, angle, and lighting.
  • Figure 4 depicts an example field of view 400 of the video image recording device 125, with a user 1.1 within the field of view 1.5 frame.
  • Figure 5 depicts an example user interface screen 500 with corresponding cursor 2.26 actions mapped to the movement of the head of user 1.1 in Figure 4.
  • a live continuous feed of video image frames may be sent to image processing module 114 by video image recording device 125.
  • the image processing module 114 may then identify a face within the field of view 1.5 using face-detection algorithms within feature tracking module 117, AI model 116, or other appropriate means.
  • the feature tracking module 117 may then identify an initially selected tracking feature (TF) position 2.9 on the user’s 1.1 face.
  • the TF position 2.9 is then mapped by feature tracking module 117 to an initial on-screen cursor position 2.25 on display screen 122.
  • the image processing module 114 may use AI model 116 in order to identify a region of the user's 1.1 face within the image frame to track throughout the interaction process.
  • This tracked feature may be a user’s 1.1 eye, nose, glasses, or other suitably high contrast facial region.
  • Video image frames captured by the video image recording device 125 may be analysed by the feature tracking module 117 by applying a moving (scanning) contrast window of a fixed contrast window size to the area in the captured image frames identified as the face to select a target pixel area for the tracking feature.
  • the moving contrast window may be moved pixel by pixel (or by blocks of pixels) across the face area in the image while calculating contrast values of the group of pixels currently in the contrast window.
  • the calculation of the contrast values comprises a binary value determination, associating each pixel with a zero or a one, corresponding to a black value or a light value.
  • a pixel colour determination is made, with specific RGB values from the contrast window mapped to positions within the contrast window.
  • other contrast calculation methods may be used.
  • the target pixel area is selected to have a greatest range of colour contrast among pixel areas of the fixed contrast window size in the image frame.
  • the target pixel area is used as the tracking feature.
  • the fixed contrast window size may be selected to correspond to a face area of between about 1 cm² and about 10 cm², for example around 2 cm by 2 cm. In other embodiments, other sizes can be used.
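A minimal sketch of the contrast-window scan described above is given below. The 40-pixel window and 4-pixel step are assumptions standing in for a patch of roughly 2 cm by 2 cm at a typical standing distance, the contrast measure is the grey-level range within the window, and the box format matches the detection sketch given earlier.

```python
# Hedged sketch: pick the fixed-size window with the greatest grey-level range
# inside the detected face box. Window/step sizes are assumptions approximating
# a ~2 cm x 2 cm face patch.
import cv2

def select_tracking_feature(frame, face_box, window: int = 40, step: int = 4):
    """Return (x, y, window) of the highest-contrast patch, in frame coordinates."""
    x1, y1, x2, y2, _confidence = face_box
    gray = cv2.cvtColor(frame[y1:y2, x1:x2], cv2.COLOR_BGR2GRAY)
    best, best_range = None, -1
    for yy in range(0, gray.shape[0] - window, step):
        for xx in range(0, gray.shape[1] - window, step):
            patch = gray[yy:yy + window, xx:xx + window]
            contrast_range = int(patch.max()) - int(patch.min())
            if contrast_range > best_range:
                best_range, best = contrast_range, (x1 + xx, y1 + yy, window)
    return best
```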
  • the user 1.1 may then move their head, which in turn moves the TF along the path of 2.10, resting momentarily on position 2.11.
  • the TF moves in a negative direction on the x-axis 2.2 and a positive direction on the y-axis 2.3 (as shown in Figure 4).
  • the processor 105 may, when executing the feature tracking module 117 of image processing module 114, determine this movement of the tracking feature across successive captured live image frames and control the display screen to move the on-screen cursor accordingly along the path of the TF by performing corresponding (scaled) x/y-axis movements of the on-screen cursor 2.26.
  • the movement of the on-screen cursor 2.26 on the display screen 122 provides feedback to the user 1.1, and allows them to observe the control they are performing, thereby improving usability.
  • the TF rests on position 2.11, and the processor 105 controls the display to show the cursor resting on position 2.27 in Figure 5.
  • the movement of the TF across live successive captured image frames may be calculated by the feature tracking module 117.
  • the feature tracking module 117 may identify the location of the TF in a first frame of a live video feed, and scan each successive frame within an anticipated region of the image frame for the TF at a subsequent location.
  • the feature tracking module 117 may be configured to perform this subsequent position determination with a TF confidence match of 90%, 95%, or other appropriate percentage in order to successfully track the TF through the frames of a live video feed, for example.
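The patent describes scanning an anticipated region of each successive frame for the TF with a roughly 90-95% confidence match, but does not prescribe an algorithm. One plausible realisation, sketched below under that assumption, is normalised template matching of the stored grayscale TF patch within a search window around its last position; the search radius is an assumption.

```python
# Hedged sketch: re-locate the TF by normalised template matching of the stored
# grayscale TF patch near its last position; 0.9 mirrors the confidence match
# mentioned in the text, the search radius is an assumption.
import cv2

def track_feature(frame, template, last_xy, search_radius: int = 60,
                  min_confidence: float = 0.9):
    """Return the new (x, y) of the TF patch, or None if the match confidence is too low."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    x, y = last_xy
    h, w = template.shape[:2]
    # Clip the anticipated search region to the frame boundaries.
    x0, y0 = max(0, x - search_radius), max(0, y - search_radius)
    x1 = min(gray.shape[1], x + w + search_radius)
    y1 = min(gray.shape[0], y + h + search_radius)
    region = gray[y0:y1, x0:x1]
    result = cv2.matchTemplate(region, template, cv2.TM_CCOEFF_NORMED)
    _min_val, max_val, _min_loc, max_loc = cv2.minMaxLoc(result)
    if max_val < min_confidence:
        return None                                   # TF not found in this frame
    return (x0 + max_loc[0], y0 + max_loc[1])
```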
  • the process of dwell and interaction event generation is illustrated in Figures 4 and 5.
  • the user 1.1 has moved the TF to position 2.11, thereby moving the on-screen cursor to position 2.27 in Figure 5, where it is located over an on-screen button (Button A, 2.30).
  • the image processing module 114 measures the time in which the cursor remains in the location, in order to determine an action event.
  • a predetermined margin of movement is allowed, as the user 1.1 will most likely not hold their head exactly still but will instead be moving slightly all the time.
  • the margin of movement may comprise a distance threshold. In such embodiments, the distance threshold may be between about 1 cm and 5 cm.
  • different threshold amounts may be provided in order to accurately capture user 1.1 intent when interacting with user interface 120.
  • distance thresholds may be measured in pixel distances between movement actions of a user 1.1 tracked feature between video image frames or may be measured as total pixel distance over a number of frames.
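The dwell logic described above (a selection registered only if the cursor stays within a small movement margin over a button for the full dwell time) could be implemented along the following lines. The 25-pixel margin and 3-second default are assumptions within the ranges given in the text.

```python
# Hedged sketch: dwell-based selection with a movement margin; the 25 px margin
# and 3 s default are assumptions within the ranges given in the text.
import math
import time

class DwellDetector:
    def __init__(self, dwell_time: float = 3.0, movement_margin_px: float = 25.0):
        self.dwell_time = dwell_time
        self.margin = movement_margin_px
        self.anchor = None                   # cursor position where the dwell started
        self.start = None

    def update(self, cursor_xy, over_button: bool) -> bool:
        """Return True exactly once when a dwell selection should be recorded."""
        if not over_button:
            self.anchor = self.start = None
            return False
        now = time.monotonic()
        if self.anchor is None or math.dist(cursor_xy, self.anchor) > self.margin:
            self.anchor, self.start = cursor_xy, now   # cursor moved too far; restart the dwell
            return False
        if now - self.start >= self.dwell_time:
            self.anchor = self.start = None            # reset so the selection fires only once
            return True
        return False

    @property
    def progress(self) -> float:
        """Fraction of the dwell time elapsed, for the on-screen progress indicator."""
        if self.start is None:
            return 0.0
        return min(1.0, (time.monotonic() - self.start) / self.dwell_time)
```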
  • the buttons may be configured to be relatively larger than normal.
  • one or more of the buttons may take up an area of between about 5% and about 35%, optionally about 10% to about 25%, of the visible area of display screen 122. There may also be at least 25% of the visible area of display screen 122 that is inactive (i.e. not associated with any selectable screen object).
  • the process and relationship of TF to on-screen cursor 2.26 is also described in the flow chart of Figure 6, at items 4.9, 4.10 and 4.11.
  • the process 600 may be event-driven, where the image processing module 114 continuously tracks the TF at step 4.8 and reacts to either the movement event of a user 1.1 at step 4.10 or the dwell event at step 4.13 as each occurs, before returning to the cycle of tracking in order to detect the next event.
  • Figure 5 depicts an example user interface screen 500 displayed on display screen 122.
  • the interface screen displays content that comprises button A 2.30 and button B 2.31 within an interaction area 2.20.
  • cursor 2.26 is provided to allow a user 1.1 to interact with buttons 2.30 and 2.31 through tracked head movements.
  • cursor 2.26 may begin at a substantially central position within interaction area 2.20.
  • a detected movement of the user’s 1.1 TF by image processing module 114 may then cause the processor 105 (executing UI module 112) to cause cursor 2.26 to move to a second position 2.27.
  • a subsequent movement of the user’s 1.1 TF along path 2.28 may similarly cause cursor 2.26 to move to a third position 2.29.
  • the module 114 may associate the (relative) lack of movement as a dwell action.
  • a dwell action at 2.27 or 2.29 causes the image processing module 114 to transmit an interaction command to UI module 112 corresponding to the action of the respective button 2.30, 2.31.
  • the cursor 2.26 or a button (or other selectable screen content) over which the cursor 2.26 rests may display a dynamic graphical indicator of the elapsing dwell time within display area 2.20 to communicate to a user 1.1 that a dwell action is occurring (i.e. a dwell time is elapsing).
  • this graphical indicator may comprise a countdown timer, animated progress indicator, or other dynamic graphical progress indicator.
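One possible rendering of such a dynamic indicator, assuming the DwellDetector.progress value from the sketch above, is a partial ring drawn around the cursor; the patent leaves the exact visual treatment open.

```python
# Hedged sketch: one possible dwell-progress rendering (a partial ring around
# the cursor); colours, radius and thickness are arbitrary choices.
import cv2

def draw_dwell_progress(canvas, cursor_xy, progress: float, radius: int = 30):
    """Draw a cursor dot and an arc whose sweep reflects 0.0-1.0 dwell progress."""
    x, y = int(cursor_xy[0]), int(cursor_xy[1])
    cv2.circle(canvas, (x, y), 6, (255, 255, 255), -1)          # cursor dot
    cv2.ellipse(canvas, (x, y), (radius, radius), -90,          # start at 12 o'clock
                0, int(360 * progress), (0, 200, 0), 4)         # green arc, 4 px thick
    return canvas
```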
  • the feature tracking module 117 tracks tracking features and the UI module 112 determines dwell time of the cursor. These two modules 117, 112 cooperate to determine the “attention” of the user in order to determine whether the user intends to perform a selection interaction. The attention of the user is determined to be positive when the tracking feature (TF) is currently visible and tracked by the feature tracking module 117. If the tracking feature has moved out of the DVC 125 field-of-view 1.5, then the feature tracking module 117 changes the status of attention to negative.
  • the tracking feature may be lost if the person has moved outside the DVC field-of-view for some reason, for example, if they have bent down to open their bag, or have turned around to speak to another person, to the point that the TF is not visible to the DVC 125.
  • Making a determination by feature tracking module 117 regarding a binary attention status can help to avoid accidental activation (selection) of a button or other content. For example, if the on-screen cursor 2.26 has moved to a position over a button, and then the TF cannot be found by feature tracking module 117, then selection of the button should not be performed by feature tracking module 117 even though the dwell time may have elapsed. An interaction will only be performed when the dwell time threshold has been reached and feature tracking module 117 has the TF currently acquired (i.e. status of attention is positive).
  • the feature tracking module 117 has the capability to re-acquire the tracking feature when it is temporarily lost without needing to again begin the process of classifying and isolating a face in the DVC 125 field-of-view 1.5. Temporary loss of the tracking feature may occur when the user with TF attention turns their head or briefly moves out of the DVC field- of-view, causing TF attention to change to negative. At this point, feature tracking module 117 will attempt to re-acquire the TF using the latest tracking feature stored in memory by searching in the DVC field-of-view for a short period of time. This short time period should be relatively short, for example in the order of 5-10 seconds, but is configurable to a different time period based on user experience.
  • the cursor 2.26 may be shown on display screen 122 but will not move since no TF is being tracked.
  • the feature tracking module 117 will not interrupt the passenger processing application (that is conducting the interaction process) during the time of attempting to reacquire the TF. If the TF is reacquired during this time, then the process of on screen cursor control using the TF movement of the user continues as normal. This allows on screen cursor control to be readily resumed after the user has momentarily moved in a way that the TF attention is lost, for example if the user were to look away for a moment, without any impact to the current interaction process.
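The attention status and re-acquisition behaviour could be tracked with a small helper like the one below. For simplicity this sketch collapses the first and second wait times into a single 8-second window (within the 5-10 second range mentioned); the class and state names are assumptions, not terms from the patent.

```python
# Hedged sketch: binary "attention" status with a single re-acquisition window
# (the text allows separate first and second wait times of ~5-10 s each).
import time

class AttentionTracker:
    def __init__(self, reacquire_window: float = 8.0):
        self.reacquire_window = reacquire_window
        self.last_template = None            # latest stored tracking-feature patch
        self.lost_since = None

    def update(self, tf_position, template=None) -> str:
        """Call once per frame with the tracked TF position, or None if the TF was lost."""
        if tf_position is not None:
            if template is not None:
                self.last_template = template
            self.lost_since = None
            return "positive"                # attention positive: selections may be recorded
        if self.lost_since is None:
            self.lost_since = time.monotonic()
        if time.monotonic() - self.lost_since <= self.reacquire_window:
            return "reacquiring"             # keep the cursor static, search for the stored TF
        return "negative"                    # give up; the face must be classified again
```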
  • the dimensions of a live video image frame received by the image processing module 114 are mapped to the interaction area 2.20.
  • a mapping ratio may be applied, such that a detected pixel movement of a TF in the frame may correspond to a scaled movement of cursor 2.26 in the interaction area 2.20 on display 122.
  • a mapping (scaling) ratio of around 1:1, 1:5, 1:8, 1:10, 1:15, 1:20, 1:30, 1:40, 1:50, 1:60, 1:70, 1:80, 1:90, 1:100 (or ratios in between such numbers) or other ratios may be applied between image frame and interaction area 2.20, depending on the resolution difference between the video image recording device and the display screen 122.
  • movement distance thresholds may be provided in order to prevent stray or accidental movements of the TF by a user as registering as cursor 2.26 movements on interaction area 2.20.
  • the mapping ratio may be adjusted by the feature tracking module 117 based on a scaling factor.
  • the scaling factor may be incrementally increased (up to an upper limit) if the movement of the TF is less than a predetermined minimum number of pixels over a period of time, or incrementally decreased (down to a lower limit) if the movement of the TF is greater than a predetermined maximum number of pixels over the period of time.
  • the capability of the feature tracking module 117 to dynamically adjust the sensitivity of the interpolation of physical movement to cursor movement may help to compensate for a person being too subtle or too exaggerated with their head movement.
  • the pixel resolution of DVC 125 will be greater than the display resolution of the display screen 122.
  • the default sensitivity ratio is configurable based on observation of user experience and may be set at one of the example mapping ratios listed above, for example.
  • the feature tracking module 117 may dynamically increase the sensitivity ratio from the default ratio to a higher level in order to add more weight to the smaller pixel movements in the DVC field-of-view.
  • the default sensitivity ratio may be 30:1 where 30 pixels in the DVC field-of- view is interpolated to 1 pixel of movement by the on-screen cursor.
  • feature tracking module 117 may increase the sensitivity ratio gradually until the on-screen cursor movement over time fits within an expected range.
  • the feature tracking module 117 may therefore increase the sensitivity ratio from 30:1, to 40:1, then 50:1 and so on (for example), to an upper limit configured in the feature tracking module 117. This adjusted sensitivity ratio will be persisted or further adjusted as necessary until the TF attention is completely lost or the passenger processing application completes the current interaction process.
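A sketch of the adaptive mapping is shown below, following the claim language (increase the scaling factor when TF movement over a time window is below a minimum pixel count, decrease it when above a maximum). The default scale of 1/30 mirrors the 30:1 ratio mentioned above; the adjustment multipliers, limits and per-window pixel thresholds are assumptions.

```python
# Hedged sketch: scaling-factor adaptation per the claim language; multipliers,
# limits and per-window pixel thresholds are assumptions.
class CursorMapper:
    def __init__(self, scale: float = 1.0 / 30.0,        # ~30:1 default ratio from the text
                 lower: float = 1.0 / 80.0, upper: float = 1.0 / 5.0,
                 min_px: int = 5, max_px: int = 200):
        self.scale, self.lower, self.upper = scale, lower, upper
        self.min_px, self.max_px = min_px, max_px         # expected TF motion per time window
        self._window_motion = 0.0

    def map_displacement(self, dx: float, dy: float) -> tuple:
        """Convert a TF displacement in camera pixels to cursor pixels."""
        self._window_motion += abs(dx) + abs(dy)
        return dx * self.scale, dy * self.scale

    def end_of_window(self) -> None:
        """Adapt the scaling factor once per predetermined time period."""
        if self._window_motion < self.min_px:
            self.scale = min(self.upper, self.scale * 1.25)   # amplify subtle head movement
        elif self._window_motion > self.max_px:
            self.scale = max(self.lower, self.scale * 0.8)    # damp exaggerated movement
        self._window_motion = 0.0
```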
  • FIG. 6 depicts a flow chart 600 of the operation of a user interaction process at station 101 according to some embodiments.
  • the UI module 112 is initialised by processor 105, and the station 101 remains in a standby operation until a user 1.1 interacts with the station 101.
  • the UI module 112 may implement an application at step 4.1, such as a passenger check-in application, immigration control application, vehicle boarding application, passport management application, or other application.
  • the image processing module 114 is initialised by processor 105 and receives a stream of video image frames from video image recording device 125.
  • the image processing module 114 then processes the image frames at step 4.3, using AI model 116 to identify the face of a user 1.1 within the field of view 1.5. If, at step 4.4, a user 1.1 moves within the field of view 1.5 of the video image recording device 125, and the user is determined to be close enough to the station 101, then the image processing module 114 may isolate the face of the user 1.1 at step 4.5 to subsequently determine a tracking feature (TF) of the face at step 4.6.
  • the processing that is applied by image processing module 114 seeks to determine both the presence of a face in the image frame, and the size of the detected face in the frame.
  • the size of the detected face may be analysed by the image processing module 114, which may compare the number of pixels of the identified face to a range of known values.
  • the known values correspond to facial size and proximity models within AI model 116.
  • the image processing module 114 may determine that the face belongs to a user 1.1 intending to initiate an interaction process.
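The proximity test at step 4.4 reduces to comparing the detected face's pixel area against a threshold, as in the sketch below; the 40,000-pixel figure is an assumption for a 720p camera, not a value from the patent, and the box format matches the detection sketch given earlier.

```python
# Hedged sketch: face proximity by bounding-box pixel count. The threshold is
# an assumption for a 1280x720 camera.
PROXIMITY_PIXEL_THRESHOLD = 40_000        # roughly a 200 x 200 px face at 720p

def is_face_proximate(face_box, threshold: int = PROXIMITY_PIXEL_THRESHOLD) -> bool:
    """Return True when the detected face occupies enough pixels to start an interaction."""
    x1, y1, x2, y2, _confidence = face_box
    face_pixels = max(0, x2 - x1) * max(0, y2 - y1)
    return face_pixels >= threshold
```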
  • the interaction process application implemented by UI module 112 may be activated to allow user 1.1 to conduct the interaction process.
  • the display screen 122 may display a cursor within the interaction area 2.20.
  • UI module 112 may cause display screen 122 to display a calibration prompt to the user 1.1. In such embodiments, the user 1.1 is informed that the interaction process is conducted through tracked head-movements.
  • the user 1.1 may be prompted to undertake a training task, which may include making a series of example head movements, or placing their head within a specific area of the field of view 1.5 corresponding to a bounding box on the display screen 122.
  • the specific movements or location of the user’s 1.1 head allow for feature tracking module 117 to identify and track a specific feature of the user’s 1.1 face and/or help the user to learn how to use head movement to move the cursor 2.26 on the display screen 122.
  • the TF may be selected based on a series of potential factors including, but not limited to, eye, mouth, and ear location, or presence of eyeglasses, or other appropriate features that are easily trackable.
  • the feature tracking module 117 identifies an area within the face of high contrast between facial elements. Some examples of this may include the eyes which have a high contrast between the “white” of the eye and the iris or pupil, presence of eyeglasses, and so on.
  • feature tracking module 117 may use edge detection algorithms, models from AI model 116, or other means of identifying appropriate facial elements to track.
  • TF movements are continually tracked at step 4.8 and 4.9 in order to allow the user 1.1 to move cursor 2.26 and engage with the interaction process via the display screen 122.
  • the TF is identified by the feature tracking module 117 as having moved location between a series of video image frames provided to the image processing module 114 by the video image recording device. These displacements of the TF are calculated by the feature tracking module 117 at step 4.10, and are then mapped by the feature tracking module 117 to cursor 2.26 position within interaction area 2.20. At step 4.11, the cursor 2.26 position is updated relative to the mapped TF displacement.
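  • The displacement-to-cursor mapping of steps 4.10 and 4.11 might, for example, be implemented along the lines of the following sketch; the gain values would come from a calibration such as the one sketched earlier, and all names are illustrative rather than part of the embodiments.

```python
# Hedged sketch of mapping TF displacement between two frames to a new
# cursor position within interaction area 2.20.
def update_cursor(prev_tf, curr_tf, cursor, gain, area_w, area_h):
    """Return the new (x, y) cursor position after one frame of movement."""
    dx = (curr_tf[0] - prev_tf[0]) * gain[0]
    dy = (curr_tf[1] - prev_tf[1]) * gain[1]
    # Clamp so the cursor always remains inside the interaction area.
    new_x = min(max(cursor[0] + dx, 0), area_w - 1)
    new_y = min(max(cursor[1] + dy, 0), area_h - 1)
    return (new_x, new_y)
```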
  • a dwell time threshold is exceeded at step 4.13.
  • the dwell time may be between about 2 seconds and about 5 seconds, or optionally between about 3 and about 4 seconds, for example.
  • the elapsing of the dwell time may be graphically indicated on display screen 122 through an animated countdown, timer, progress bar, or other dynamic graphical indicator or element, for example.
  • a simulated ‘touch’ action occurs on the interaction area 2.20 under the current location of cursor 2.26.
  • the simulated touch action is processed by UI module 112, and an interaction process action occurs relative to the simulated touch action. In some embodiments, this comprises a button press action, UI field selection, interaction process progression or a request to revert to a prior stage of the interaction process. In other embodiments, the simulated touch action may interact with other UI elements, or perform different interaction process actions.
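  • The dwell-time activation and simulated touch described above might be realised as in the following sketch, which dispatches a simulated touch once the cursor has rested on the same UI element for the threshold period; the 3-second threshold is one value within the 2 to 5 second range mentioned above, and the callback name is an assumption.

```python
# Hedged sketch of dwell-time activation: when the cursor dwells on one UI
# element for longer than DWELL_SECONDS, a simulated touch is dispatched.
import time

DWELL_SECONDS = 3.0  # one value within the stated 2-5 second range

class DwellActivator:
    def __init__(self, on_touch):
        self.on_touch = on_touch        # e.g. the UI module's touch handler
        self.current_element = None
        self.entered_at = None

    def update(self, element_under_cursor, cursor_pos):
        now = time.monotonic()
        if element_under_cursor != self.current_element:
            # Cursor moved to a different element: restart the countdown.
            self.current_element = element_under_cursor
            self.entered_at = now
        elif (element_under_cursor is not None
              and now - self.entered_at >= DWELL_SECONDS):
            self.on_touch(cursor_pos)   # simulated 'touch' under cursor 2.26
            self.entered_at = now       # avoid repeated triggers
```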
  • the action loop of the process 600 is concluded. This may comprise a loop for a single stage of an interaction process at station 101, or may comprise the entire interaction process. After the interaction process at station 101 is concluded, the process 600 reverts to step 4.1 and awaits a face entering the field of view 1.5.
  • the completion of the process 600 at step 4.16 may be due to the operation of the interaction process, indicating that a user has completed the process and the machine is ready to accept another user 1.1.
  • the interaction process is suspended or terminated due to requirements of the interaction application in the UI module 112.
  • the process may conclude due to a lack of an identified face of a user 1.1 within the field of view 1.5.
  • a user 1.1 may have left the field of view 1.5 before the interaction process has concluded, or the feature tracking module 117 may have lost the position of the TF for a configurable amount of time, indicating that the user 1.1 has been lost by the module 117.
  • the station 101 may display a warning prompt on display screen 122 notifying the user that their face is no longer detected, and/or that the interaction process may be lost unless the TF or their face is detected again.
  • the feature tracking module 117 may allow for a configurable detection-loss threshold to prevent unintended movements of the user 1.1 from prematurely ending the interaction process. This detection-loss threshold may be about 5 seconds or 10 seconds. In other embodiments, the threshold may be other periods of time.
  • the feature tracking module 117 may cause the display screen 122 to statically display the cursor 2.26 over the interaction area 2.20 for a first predetermined wait time after the tracking feature can no longer be identified in the video images.
  • the first predetermined wait time may be between about 5 seconds and about 10 seconds, for example.
  • the feature tracking module 117 may wait for a second predetermined wait time after the tracking feature can no longer be identified in the video images while attempting to again determine the face and locate the previous (stored) tracking feature or a new tracking feature in the video images.
  • the second predetermined wait time may be between about 5 seconds and about 10 seconds.
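  • One possible structure for this detection-loss handling is sketched below: the cursor is held static for a first wait time, re-detection is attempted during a second wait time, and the interaction is abandoned only after both have elapsed. The timings sit within the ranges stated above; the class and state names are assumptions.

```python
# Hedged sketch of tracking-loss handling around the first and second
# predetermined wait times described above.
import time

FREEZE_SECONDS = 5.0      # first predetermined wait time (cursor held static)
REACQUIRE_SECONDS = 10.0  # second predetermined wait time (re-detection)

class TrackingLossHandler:
    def __init__(self):
        self.lost_since = None

    def update(self, tf_visible):
        """Return 'tracking', 'frozen', 'reacquiring' or 'abandon'."""
        now = time.monotonic()
        if tf_visible:
            self.lost_since = None
            return "tracking"
        if self.lost_since is None:
            self.lost_since = now
        elapsed = now - self.lost_since
        if elapsed < FREEZE_SECONDS:
            return "frozen"        # keep cursor 2.26 static on screen
        if elapsed < FREEZE_SECONDS + REACQUIRE_SECONDS:
            return "reacquiring"   # show warning prompt, look for the TF again
        return "abandon"           # conclude or reset the interaction process
```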
  • Figure 8 illustrates an example computer system 800 according to some embodiments.
  • one or more computer systems 800 perform one or more steps of one or more methods described or illustrated herein.
  • one or more computer systems 800 provide functionality described or illustrated herein.
  • software running on one or more computer systems 800 performs one or more steps of one or more methods described or illustrated herein or provides functionality described or illustrated herein.
  • Particular embodiments include one or more portions of one or more computer systems 800.
  • reference to a computer system may encompass a computing device, and vice versa, where appropriate.
  • reference to a computer system may encompass one or more computer systems, where appropriate.
  • Controller 102 is an example of computer system 800.
  • This disclosure contemplates any suitable number of computer systems 800.
  • This disclosure contemplates computer system 800 taking any suitable physical form.
  • computer system 800 may be an embedded computer system, a system-on-chip (SOC), a single-board computer system (SBC) (such as, for example, a computer-on-module (COM) or system-on-module (SOM)), a special-purpose computing device, a desktop computer system, a laptop or notebook computer system, a mobile telephone, a server, a tablet computer system, or a combination of two or more of these.
  • computer system 800 may: include one or more computer systems 800; be unitary or distributed; span multiple locations; span multiple machines; span multiple data centers; or reside partly or wholly in a computing cloud, which may include one or more cloud computing components in one or more networks.
  • one or more computer systems 800 may perform without substantial spatial or temporal limitation one or more steps of one or more methods described or illustrated herein.
  • one or more computer systems 800 may perform in real time or in batch mode one or more steps of one or more methods described or illustrated herein.
  • One or more computer systems 800 may perform at different times or at different locations one or more steps of one or more methods described or illustrated herein, where appropriate.
  • computer system 800 includes at least one processor 810, memory 815, storage 820, an input/output (I/O) interface 825, a communication interface 830, and a bus 835.
  • Processor 105 is an example of processor 810.
  • Memory 110 is an example of memory 815.
  • Memory 110 may also be an example of storage 820.
  • processor 810 includes hardware for executing instructions, such as those making up a computer program.
  • processor 810 may retrieve (or fetch) the instructions from an internal register, an internal cache, memory 815, or storage 820; decode and execute them; and then write one or more results to an internal register, an internal cache, memory 815, or storage 820.
  • processor 810 may include one or more internal caches for data, instructions, or addresses. This disclosure contemplates processor 810 including any suitable number of any suitable internal caches, where appropriate.
  • processor 810 may include one or more instruction caches, one or more data caches, and one or more translation lookaside buffers (TLBs). Instructions in the instruction caches may be copies of instructions in memory 815 or storage 820, and the instruction caches may speed up retrieval of those instructions by processor 810. Data in the data caches may be copies of data in memory 815 or storage 820 for instructions executing at processor 810 to operate on; the results of previous instructions executed at processor 810 for access by subsequent instructions executing at processor 810 or for writing to memory 815 or storage 820; or other suitable data. The data caches may speed up read or write operations by processor 810. The TLBs may speed up virtual-address translation for processor 810.
  • processor 810 may include one or more internal registers for data, instructions, or addresses. This disclosure contemplates processor 810 including any suitable number of any suitable internal registers, where appropriate. Where appropriate, processor 810 may include one or more arithmetic logic units (ALUs); be a multi-core processor; or include one or more processors 810. Although this disclosure describes and illustrates a particular processor, this disclosure contemplates any suitable processor.
  • memory 815 includes main memory for storing instructions for processor 810 to execute or data for processor 810 to operate on.
  • computer system 800 may load instructions from storage 820 or another source (such as, for example, another computer system 800) to memory 815.
  • Processor 810 may then load the instructions from memory 815 to an internal register or internal cache.
  • processor 810 may retrieve the instructions from the internal register or internal cache and decode them.
  • processor 810 may write one or more results (which may be intermediate or final results) to the internal register or internal cache.
  • Processor 810 may then write one or more of those results to memory 815.
  • processor 810 executes only instructions in one or more internal registers or internal caches or in memory 815 (as opposed to storage 820 or elsewhere) and operates only on data in one or more internal registers or internal caches or in memory 815 (as opposed to storage 820 or elsewhere).
  • One or more memory buses (which may each include an address bus and a data bus) may couple processor 810 to memory 815.
  • Bus 835 may include one or more memory buses, as described below.
  • one or more memory management units reside between processor 810 and memory 815 and facilitate accesses to memory 815 requested by processor 810.
  • memory 815 includes random access memory (RAM). This RAM may be volatile memory, where appropriate.
  • this RAM may be dynamic RAM (DRAM) or static RAM (SRAM). Moreover, where appropriate, this RAM may be single-ported or multi-ported RAM.
  • Memory 815 may include one or more memories 815, where appropriate. Although this disclosure describes and illustrates particular memory, this disclosure contemplates any suitable memory.
  • storage 820 includes mass storage for data or instructions. As an example and not by way of limitation, storage 820 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, or a Universal Serial Bus (USB) drive or a combination of two or more of these.
  • Storage 820 may include removable or non-removable (or fixed) media, where appropriate. Storage 820 may be internal or external to computer system 800, where appropriate. In particular embodiments, storage 820 is non-volatile, solid-state memory. In particular embodiments, storage 820 includes read-only memory (ROM). Where appropriate, this ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory or a combination of two or more of these. This disclosure contemplates mass storage 820 taking any suitable physical form. Storage 820 may include one or more storage control units facilitating communication between processor 810 and storage 820, where appropriate. Where appropriate, storage 820 may include one or more storages 820.
  • I/O interface 825 includes hardware, software, or both, providing one or more interfaces for communication between computer system 800 and one or more I/O devices.
  • Computer system 800 may include one or more of these I/O devices, where appropriate.
  • One or more of these I/O devices may enable communication between a person and computer system 800.
  • an I/O device may include a keyboard, keypad, microphone, monitor, mouse, printer, scanner, speaker, still camera, stylus, tablet, touch screen, trackball, video camera, another suitable I/O device or a combination of two or more of these.
  • An I/O device may include one or more sensors.
  • I/O interface 825 may include one or more device or software drivers enabling processor 810 to drive one or more of these I/O devices.
  • I/O interface 825 may include one or more I/O interfaces 825, where appropriate.
  • communication interface 830 includes hardware, software, or both providing one or more interfaces for communication (such as, for example, packet-based communication) between computer system 800 and one or more other computer systems 800 or one or more networks.
  • communication interface 830 may include a network interface controller (NIC), network adapter or wireless adapter for communicating with a wireless network, such as a WI-FI or a cellular network.
  • This disclosure contemplates any suitable network and any suitable communication interface 830 for it.
  • computer system 800 may communicate with an ad hoc network, a personal area network (PAN), a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), or one or more portions of the Internet or a combination of two or more of these.
  • computer system 800 may communicate with a wireless cellular telephone network (such as, for example, a Global System for Mobile Communications (GSM) network, or a 3G, 4G or 5G cellular network), or other suitable wireless network or a combination of two or more of these.
  • Computer system 800 may include any suitable communication interface 830 for any of these networks, where appropriate.
  • Communication interface 830 may include one or more communication interfaces 830, where appropriate.
  • bus 835 includes hardware, software, or both coupling components of computer system 800 to each other.
  • bus 835 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a frontside bus (FSB), a HYPERTRANSPORT (HT) interconnect, an Industry Standard Architecture (ISA) bus, an INFINIBAND interconnect, a low-pin-count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a serial advanced technology attachment (SATA) bus, a Video Electronics Standards Association local (VLB) bus, or another suitable bus or a combination of two or more of these.
  • Bus 835 may include one or more buses 835, where appropriate.
  • a computer-readable non-transitory storage medium or media may include one or more semiconductor-based or other integrated circuits (ICs) (such as, for example, field-programmable gate arrays (FPGAs) or application-specific ICs (ASICs)), hard disk drives (HDDs), hybrid hard drives (HHDs), optical discs, optical disc drives (ODDs), magneto-optical discs, magneto-optical drives, floppy disk drives (FDDs), solid-state drives (SSDs), RAM-drives, or any other suitable computer-readable non-transitory storage media, or any suitable combination of two or more of these, where appropriate.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Human Computer Interaction (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Marketing (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Development Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Data Mining & Analysis (AREA)
  • Primary Health Care (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Accounting & Taxation (AREA)
  • Databases & Information Systems (AREA)
  • Finance (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Optics & Photonics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Educational Administration (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Mathematical Physics (AREA)

Abstract

Embodiments relate generally to systems, methods and processes that use touch-free interactions at self-service interaction stations. In particular, embodiments relate to the use of such stations in transit environments, such as airports or other transport hubs. Some embodiments relate to a self-service interaction station having a video image recording device with a field of view in a display direction of the display screen, and a processor configured to determine a human face within captured images, to identify and track the movement of a tracking feature of an identified face in the captured images, and to cause the display screen to display a cursor over content displayed on the display screen during an interaction process and to move the cursor in response to movement of the tracking feature in order to interact with the content.
EP21776569.2A 2020-03-23 2021-03-17 Touch-free interaction with a self-service station in a transit environment Withdrawn EP4128031A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2020900882A AU2020900882A0 (en) 2020-03-23 Touch-free interaction with a self-service station in a transit environment
PCT/AU2021/050241 WO2021189099A1 (fr) 2020-03-23 2021-03-17 Touch-free interaction with a self-service station in a transit environment

Publications (1)

Publication Number Publication Date
EP4128031A1 true EP4128031A1 (fr) 2023-02-08

Family

ID=77889870

Family Applications (1)

Application Number Title Priority Date Filing Date
EP21776569.2A Withdrawn EP4128031A1 (fr) 2020-03-23 2021-03-17 « interaction sans contact avec une station libre-service dans un environnement de transit »

Country Status (3)

Country Link
EP (1) EP4128031A1 (fr)
AU (1) AU2021242795A1 (fr)
WO (1) WO2021189099A1 (fr)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020126090A1 (en) * 2001-01-18 2002-09-12 International Business Machines Corporation Navigating and selecting a portion of a screen by utilizing a state of an object as viewed by a camera
US8159458B2 (en) * 2007-12-13 2012-04-17 Apple Inc. Motion tracking user interface
US9507417B2 (en) * 2014-01-07 2016-11-29 Aquifi, Inc. Systems and methods for implementing head tracking based graphical user interfaces (GUI) that incorporate gesture reactive interface objects
US10102358B2 (en) * 2015-12-29 2018-10-16 Sensory, Incorporated Face-controlled liveness verification

Also Published As

Publication number Publication date
WO2021189099A1 (fr) 2021-09-30
AU2021242795A1 (en) 2022-11-24

Similar Documents

Publication Publication Date Title
US20240289930A1 (en) Deep learning-based real-time detection and correction of compromised sensors in autonomous machines
CN109145680B (zh) Method, apparatus, device and computer storage medium for acquiring obstacle information
KR101411058B1 (ko) Video analysis test system
US8970696B2 (en) Hand and indicating-point positioning method and hand gesture determining method used in human-computer interaction system
US9646200B2 (en) Fast pose detector
JP2018113038A (ja) Inspection apparatus and method for detecting firearms in luggage
US20210049366A1 (en) Detecting Fake Videos
WO2023005091A1 (fr) Systèmes et procédés de détection d'objet
US20220277162A1 (en) Optical person recognition techniques for social distancing
WO2021189101A1 (fr) Lecture de document sans contact au niveau d'une station libre-service dans un environnement de transit
US11410443B2 (en) Labelling training method and system for implementing the same
AU2021242795A1 (en) Touch-free interaction with a self-service station in a transit environment
US20230135198A1 (en) Self-service station having thermal imaging camera
US11594335B2 (en) Augmented reality virus transmission risk detector
El Mrabet et al. Unlocking doors: A tinyml-based approach for real-time face mask detection in door lock systems
US11222468B1 (en) Object tracking using sparse sensor captures
US12125282B2 (en) Hazard notifications for a user
US11321838B2 (en) Distributed sensor module for eye-tracking
KR102720888B1 (ko) Deep learning-based real-time detection and correction of malfunctioning sensors in autonomous machines
US20240094825A1 (en) Gesture recognition with hand-object interaction
US20240265735A1 (en) Guiding method and apparatus for palm verification, terminal, storage medium, and program product
CN113715019B (zh) Robot control method and apparatus, robot, and storage medium
US11663790B2 (en) Dynamic triggering of augmented reality assistance mode functionalities
KR20240155165A (ko) Deep learning-based real-time detection and correction of malfunctioning sensors in autonomous machines
WO2024103041A1 (fr) Système de détection d'activité et procédés associés

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20221021

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20231003