CN110313152A - User registration for an intelligent assistant computer - Google Patents
User registration for an intelligent assistant computer
- Publication number
- CN110313152A CN110313152A CN201880011946.4A CN201880011946A CN110313152A CN 110313152 A CN110313152 A CN 110313152A CN 201880011946 A CN201880011946 A CN 201880011946A CN 110313152 A CN110313152 A CN 110313152A
- Authority
- CN
- China
- Prior art keywords
- people
- data
- registration
- user
- unregistered
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 claims description 80
- 239000012634 fragment Substances 0.000 claims description 39
- 230000004044 response Effects 0.000 claims description 30
- 238000013500 data storage Methods 0.000 claims description 26
- 230000014509 gene expression Effects 0.000 claims description 15
- 230000000694 effects Effects 0.000 claims description 12
- 230000000007 visual effect Effects 0.000 claims description 12
- 230000006870 function Effects 0.000 description 24
- 230000033001 locomotion Effects 0.000 description 24
- 238000005259 measurement Methods 0.000 description 19
- 238000012545 processing Methods 0.000 description 19
- 238000004891 communication Methods 0.000 description 17
- 238000001514 detection method Methods 0.000 description 15
- 230000015654 memory Effects 0.000 description 14
- 238000005516 engineering process Methods 0.000 description 13
- 239000010813 municipal solid waste Substances 0.000 description 13
- 230000008569 process Effects 0.000 description 13
- 238000003860 storage Methods 0.000 description 12
- 239000013598 vector Substances 0.000 description 12
- 241000282414 Homo sapiens Species 0.000 description 8
- 238000004458 analytical method Methods 0.000 description 8
- 239000000284 extract Substances 0.000 description 6
- 238000003384 imaging method Methods 0.000 description 6
- 230000010354 integration Effects 0.000 description 6
- 238000013528 artificial neural network Methods 0.000 description 5
- 238000009826 distribution Methods 0.000 description 5
- 230000000977 initiatory effect Effects 0.000 description 5
- 230000000153 supplemental effect Effects 0.000 description 5
- 230000000712 assembly Effects 0.000 description 4
- 238000000429 assembly Methods 0.000 description 4
- 230000006399 behavior Effects 0.000 description 4
- 238000004364 calculation method Methods 0.000 description 4
- 238000010801 machine learning Methods 0.000 description 4
- 230000005236 sound signal Effects 0.000 description 4
- 230000009471 action Effects 0.000 description 3
- 230000004913 activation Effects 0.000 description 3
- 238000004422 calculation algorithm Methods 0.000 description 3
- 238000000605 extraction Methods 0.000 description 3
- 230000036541 health Effects 0.000 description 3
- 241001465754 Metazoa Species 0.000 description 2
- 230000003321 amplification Effects 0.000 description 2
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 2
- 230000015572 biosynthetic process Effects 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 238000004590 computer program Methods 0.000 description 2
- 238000005520 cutting process Methods 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 230000005684 electric field Effects 0.000 description 2
- 230000013016 learning Effects 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 238000012544 monitoring process Methods 0.000 description 2
- 238000003199 nucleic acid amplification method Methods 0.000 description 2
- 229910052760 oxygen Inorganic materials 0.000 description 2
- 239000001301 oxygen Substances 0.000 description 2
- 230000002093 peripheral effect Effects 0.000 description 2
- 239000000047 product Substances 0.000 description 2
- 230000036387 respiratory rate Effects 0.000 description 2
- 230000008093 supporting effect Effects 0.000 description 2
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 2
- UGFAIRIUMAVXCW-UHFFFAOYSA-N Carbon monoxide Chemical compound [O+]#[C-] UGFAIRIUMAVXCW-UHFFFAOYSA-N 0.000 description 1
- 230000003213 activating effect Effects 0.000 description 1
- 239000012190 activator Substances 0.000 description 1
- 238000013473 artificial intelligence Methods 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 230000017531 blood circulation Effects 0.000 description 1
- 230000036760 body temperature Effects 0.000 description 1
- 230000007177 brain activity Effects 0.000 description 1
- 229910002091 carbon monoxide Inorganic materials 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 238000005352 clarification Methods 0.000 description 1
- 230000008878 coupling Effects 0.000 description 1
- 238000010168 coupling process Methods 0.000 description 1
- 238000005859 coupling reaction Methods 0.000 description 1
- 238000013480 data collection Methods 0.000 description 1
- 238000003066 decision tree Methods 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000000151 deposition Methods 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000001815 facial effect Effects 0.000 description 1
- 230000008921 facial expression Effects 0.000 description 1
- 239000012467 final product Substances 0.000 description 1
- 238000010304 firing Methods 0.000 description 1
- 239000000446 fuel Substances 0.000 description 1
- 239000007789 gas Substances 0.000 description 1
- 210000003128 head Anatomy 0.000 description 1
- 230000001976 improved effect Effects 0.000 description 1
- 239000004615 ingredient Substances 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 210000003127 knee Anatomy 0.000 description 1
- 230000007787 long-term memory Effects 0.000 description 1
- 230000007774 longterm Effects 0.000 description 1
- 238000012423 maintenance Methods 0.000 description 1
- 230000014759 maintenance of location Effects 0.000 description 1
- 238000007726 management method Methods 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 238000010295 mobile communication Methods 0.000 description 1
- 239000003607 modifier Substances 0.000 description 1
- 238000012806 monitoring device Methods 0.000 description 1
- 230000036651 mood Effects 0.000 description 1
- 238000003058 natural language processing Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000000644 propagated effect Effects 0.000 description 1
- 238000002106 pulse oximetry Methods 0.000 description 1
- 230000000306 recurrent effect Effects 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 238000005096 rolling process Methods 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 230000011664 signaling Effects 0.000 description 1
- 230000007958 sleep Effects 0.000 description 1
- 239000000779 smoke Substances 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 230000009182 swimming Effects 0.000 description 1
- 230000002463 transducing effect Effects 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 230000001960 triggered effect Effects 0.000 description 1
- 238000012800 visualization Methods 0.000 description 1
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/02—Detecting, measuring or recording pulse, heart rate, blood pressure or blood flow; Combined pulse/heart-rate/blood pressure determination; Evaluating a cardiovascular condition not otherwise provided for, e.g. using combinations of techniques provided for in this group with electrocardiography or electroauscultation; Heart catheters for measuring blood pressure
- A61B5/0205—Simultaneously evaluating both cardiovascular conditions and different types of body conditions, e.g. heart and respiratory condition
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/05—Detecting, measuring or recording for diagnosis by means of electric currents or magnetic fields; Measuring using microwaves or radio waves
- A61B5/0507—Detecting, measuring or recording for diagnosis by means of electric currents or magnetic fields; Measuring using microwaves or radio waves using microwaves or terahertz waves
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/11—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
- A61B5/1113—Local tracking of patients, e.g. in a hospital or private home
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/117—Identification of persons
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/74—Details of notification to user or communication with user or patient ; user input means
- A61B5/7475—User input or interface means, e.g. keyboard, pointing device, joystick
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S13/00—Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
- G01S13/66—Radar-tracking systems; Analogous systems
- G01S13/72—Radar-tracking systems; Analogous systems for two-dimensional tracking, e.g. combination of angle and range tracking, track-while-scan radar
- G01S13/723—Radar-tracking systems; Analogous systems for two-dimensional tracking, e.g. combination of angle and range tracking, track-while-scan radar by using numerical data
- G01S13/726—Multiple target tracking
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S5/00—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
- G01S5/18—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S5/00—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
- G01S5/18—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
- G01S5/28—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves by co-ordinating position lines of different shape, e.g. hyperbolic, circular, elliptical or radial
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3206—Monitoring of events, devices or parameters that trigger a change in power modality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3206—Monitoring of events, devices or parameters that trigger a change in power modality
- G06F1/3231—Monitoring the presence, absence or movement of users
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/324—Power saving characterised by the action undertaken by lowering clock frequency
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/329—Power saving characterised by the action undertaken by task scheduling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/251—Fusion techniques of input or preprocessed data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/28—Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/40—Software arrangements specially adapted for pattern recognition, e.g. user interfaces or toolboxes therefor
- G06F18/41—Interactive pattern learning with a human teacher
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/32—User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/34—User authentication involving the use of external additional devices, e.g. dongles or smart cards
- G06F21/35—User authentication involving the use of external additional devices, e.g. dongles or smart cards communicating wirelessly
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/03—Arrangements for converting the position or the displacement of a member into a coded form
- G06F3/0304—Detection arrangements using opto-electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0481—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
- G06F3/0482—Interaction with lists of selectable items, e.g. menus
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/04842—Selection of displayed objects or displayed text elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G06F40/35—Discourse or dialogue representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
- G06N5/025—Extracting rules from data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
- G06N5/046—Forward inferencing; Production systems
- G06N5/047—Pattern matching networks; Rete networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/248—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/292—Multi-camera tracking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T7/74—Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/94—Hardware or software architectures specially adapted for image or video understanding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
- G06V20/653—Three-dimensional objects by matching three-dimensional models, e.g. conformal mapping of Riemann surfaces
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/26—Techniques for post-processing, e.g. correcting the recognition result
- G06V30/262—Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
- G06V30/274—Syntactic or semantic context, e.g. balancing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/166—Detection; Localisation; Normalisation using acquisition arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/167—Detection; Localisation; Normalisation using comparisons between temporally consecutive images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
- G06V40/173—Classification, e.g. identification face re-identification, e.g. recognising unknown faces across different face tracks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
- G06V40/25—Recognition of walking or running movements, e.g. gait recognition
-
- G—PHYSICS
- G07—CHECKING-DEVICES
- G07C—TIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
- G07C9/00—Individual registration on entry or exit
- G07C9/20—Individual registration on entry or exit involving the use of a pass
- G07C9/28—Individual registration on entry or exit involving the use of a pass the pass enabling tracking or indicating presence
-
- G—PHYSICS
- G08—SIGNALLING
- G08B—SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
- G08B13/00—Burglar, theft or intruder alarms
- G08B13/02—Mechanical actuation
- G08B13/14—Mechanical actuation by lifting or attempted removal of hand-portable articles
- G08B13/1427—Mechanical actuation by lifting or attempted removal of hand-portable articles with transmitter-receiver for distance detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1815—Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/24—Speech recognition using non-acoustical features
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/32—Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/08—Use of distortion metrics or a particular distance between probe pattern and reference templates
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L51/00—User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
- H04L51/02—User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail using automatic reactions or user delegation, e.g. automatic replies or chatbot-generated messages
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/10—Network architectures or network communication protocols for network security for controlling access to devices or network resources
- H04L63/102—Entity profiles
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/12—Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/535—Tracking the activity of the user
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/231—Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/422—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
- H04N21/42203—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/442—Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
- H04N21/44213—Monitoring of end-user related data
- H04N21/44218—Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/442—Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
- H04N21/44213—Monitoring of end-user related data
- H04N21/44222—Analytics of user selections, e.g. selection of programs or purchase activity
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/442—Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
- H04N21/44213—Monitoring of end-user related data
- H04N21/44222—Analytics of user selections, e.g. selection of programs or purchase activity
- H04N21/44224—Monitoring of user activity on external systems, e.g. Internet browsing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/10—Cameras or camera modules comprising electronic image sensors; Control thereof for generating image signals from different wavelengths
- H04N23/11—Cameras or camera modules comprising electronic image sensors; Control thereof for generating image signals from different wavelengths for generating image signals from visible and infrared light wavelengths
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/61—Control of cameras or camera modules based on recognised objects
- H04N23/611—Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
- H04N7/181—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
- H04N7/188—Capturing isolated or intermittent images triggered by the occurrence of a predetermined event, e.g. an object reaching a predetermined position
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
- H04R1/40—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
- H04R1/406—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/02—Services making use of location information
- H04W4/029—Location-based management or tracking services
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/30—Services specially adapted for particular environments, situations or purposes
- H04W4/33—Services specially adapted for particular environments, situations or purposes for indoor environments, e.g. buildings
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/05—Detecting, measuring or recording for diagnosis by means of electric currents or magnetic fields; Measuring using microwaves or radio waves
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/11—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
- A61B5/1118—Determining activity level
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S11/00—Systems for determining distance or velocity not using reflection or reradiation
- G01S11/14—Systems for determining distance or velocity not using reflection or reradiation using ultrasonic, sonic, or infrasonic waves
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S13/00—Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
- G01S13/02—Systems using reflection of radio waves, e.g. primary radar systems; Analogous systems
- G01S13/06—Systems determining position data of a target
- G01S13/08—Systems for measuring distance only
- G01S13/32—Systems for measuring distance only using transmission of continuous waves, whether amplitude-, frequency-, or phase-modulated, or unmodulated
- G01S13/36—Systems for measuring distance only using transmission of continuous waves, whether amplitude-, frequency-, or phase-modulated, or unmodulated with phase comparison between the received signal and the contemporaneously transmitted signal
- G01S13/38—Systems for measuring distance only using transmission of continuous waves, whether amplitude-, frequency-, or phase-modulated, or unmodulated with phase comparison between the received signal and the contemporaneously transmitted signal wherein more than one modulation frequency is used
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S13/00—Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
- G01S13/86—Combinations of radar systems with non-radar systems, e.g. sonar, direction finder
- G01S13/867—Combination of radar systems with cameras
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S13/00—Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
- G01S13/88—Radar or analogous systems specially adapted for specific applications
- G01S13/887—Radar or analogous systems specially adapted for specific applications for detection of concealed objects, e.g. contraband or weapons
- G01S13/888—Radar or analogous systems specially adapted for specific applications for detection of concealed objects, e.g. contraband or weapons through wall detection
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S5/00—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
- G01S5/16—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using electromagnetic waves other than radio waves
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2203/00—Indexing scheme relating to G06F3/00 - G06F3/048
- G06F2203/038—Indexing scheme relating to G06F3/038
- G06F2203/0381—Multimodal input, i.e. interface arrangements enabling the user to issue commands by simultaneous use of input devices of different nature, e.g. voice plus gesture on digitizer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2111—Location-sensitive, e.g. geographical location, GPS
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2117—User registration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0487—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
- G06F3/0488—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20092—Interactive image processing based on input by user
- G06T2207/20101—Interactive definition of point of interest, landmark or seed
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30204—Marker
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30232—Surveillance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/05—Recognition of patterns representing particular kinds of hidden objects, e.g. weapons, explosives, drugs
-
- G—PHYSICS
- G07—CHECKING-DEVICES
- G07C—TIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
- G07C9/00—Individual registration on entry or exit
- G07C9/30—Individual registration on entry or exit not involving the use of a pass
- G07C9/32—Individual registration on entry or exit not involving the use of a pass in combination with an identity check
-
- G—PHYSICS
- G08—SIGNALLING
- G08B—SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
- G08B29/00—Checking or monitoring of signalling or alarm systems; Prevention or correction of operating errors, e.g. preventing unauthorised operation
- G08B29/18—Prevention or correction of operating errors
- G08B29/185—Signal analysis techniques for reducing or preventing false alarms or for enhancing the reliability of the system
- G08B29/186—Fuzzy logic; neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G10L2015/0635—Training updating or merging of old and new templates; Mean values; Weighting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/225—Feedback of the input speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/226—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
- G10L2015/228—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/90—Arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Signal Processing (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Computer Networks & Wireless Communication (AREA)
- Medical Informatics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Computer Security & Cryptography (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Animal Behavior & Ethology (AREA)
- Surgery (AREA)
- Heart & Thoracic Surgery (AREA)
- Pathology (AREA)
- Public Health (AREA)
- Veterinary Medicine (AREA)
- Biophysics (AREA)
- Social Psychology (AREA)
- Evolutionary Computation (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Computing Systems (AREA)
- Computer Hardware Design (AREA)
- Physiology (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
Abstract
Registering a person with an intelligent assistant computer includes obtaining one or more image frames, captured via one or more cameras, depicting an initially unregistered person. Facial recognition data of the initially unregistered person is extracted from the one or more image frames. A spoken command to register the initially unregistered person is received via one or more microphones. Upon determining that the spoken command originated from a registered person having a pre-established registration privilege, the initially unregistered person is registered as a newly registered person by associating one or more additional privileges with the facial recognition data in a person profile of the newly registered person.
Description
Background

Intelligent assistant computers can provide users with voice interaction, music playback, weather or news information, and a search interface, to name a few examples. An intelligent assistant computer may provide multiple people in a home or workplace with access to some information. However, other information provided by the intelligent assistant computer may be private to a particular individual, such as, for example, inbound communications.
Summary

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Furthermore, the claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure.

Registering a person with an intelligent assistant computer includes obtaining one or more image frames, captured via one or more cameras, depicting an initially unregistered person. Facial recognition data of the initially unregistered person is extracted from the one or more image frames. A spoken command to register the initially unregistered person is received via one or more microphones. Upon determining that the spoken command originated from a registered person having a pre-established registration privilege, the initially unregistered person is registered as a newly registered person by associating one or more additional privileges with the facial recognition data in a person profile of the newly registered person. The additional privileges may permit the newly registered person to initiate one or more operations performed by the intelligent assistant computer that were not previously permitted prior to registration.
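The registration flow described above can be sketched in a few lines of code. This is an illustrative sketch only; the class, attribute, and privilege names (`PersonProfile`, `AssistantRegistry`, `"register_others"`) are assumptions for the example, not identifiers from the patent:

```python
# Hypothetical sketch of the privilege-gated registration flow.
from dataclasses import dataclass, field


@dataclass
class PersonProfile:
    """Holds identification data and privileges for one person."""
    name: str
    face_data: tuple  # facial recognition data, e.g. an embedding vector
    privileges: set = field(default_factory=set)


class AssistantRegistry:
    """Toy data store of person profiles maintained by the assistant."""

    REGISTRATION_PRIVILEGE = "register_others"

    def __init__(self):
        self.profiles = []

    def register(self, speaker: PersonProfile, new_name: str,
                 face_data: tuple, granted: set) -> PersonProfile:
        # Honor the spoken command only if it originated from a registered
        # person holding the pre-established registration privilege.
        if self.REGISTRATION_PRIVILEGE not in speaker.privileges:
            raise PermissionError("speaker lacks registration privilege")
        # Associate the additional privileges with the extracted facial
        # recognition data in the newly registered person's profile.
        profile = PersonProfile(new_name, face_data, set(granted))
        self.profiles.append(profile)
        return profile
```

A registered owner could then enroll a guest with, e.g., `registry.register(owner, "Tom", tom_face_data, {"music_playback"})`, whereas the same call made on behalf of an unprivileged speaker raises `PermissionError`.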
Brief Description of the Drawings
FIG. 1 depicts an example use environment of an intelligent assistant computing system.
FIG. 2 is a schematic diagram depicting an example intelligent assistant computing system.
FIG. 3 is a flow diagram depicting an example method for registering a person with an intelligent assistant computer.
FIG. 4 depicts a timeline of an example implementation in which registration of an initially unregistered person is initiated and performed after image and/or audio data of that person has been captured by the intelligent assistant computing system.
FIG. 5 depicts a timeline of another example implementation in which registration of an initially unregistered person is initiated, and image and/or audio data of that person is captured by the intelligent assistant computing system as part of the registration operation.
FIG. 6 schematically shows a speech recognition program that may be utilized by a voice listener, according to an example of the present disclosure.
FIG. 7 shows an intent template according to an example of the present disclosure.
FIG. 8 schematically shows a parser and an intent handler processing a portion of a conversation, according to an example of the present disclosure.
FIG. 9 schematically shows an entity tracker that may determine an identity, position, and/or current status of one or more entities, according to examples of the present disclosure.
FIG. 10 schematically shows an all-in-one computing device that implements an intelligent assistant computing system, according to examples of the present disclosure.
FIG. 11 schematically shows an example implementation in which one or more remote services perform functionality of the intelligent assistant computing system with an individual on-premises computing device, according to examples of the present disclosure.
FIG. 12 schematically shows another example implementation in which one or more remote services perform functionality of the intelligent assistant computing system in combination with multiple independent on-premises sensors and/or devices, according to examples of the present disclosure.
FIG. 13 schematically shows another example implementation in which one or more remote services utilize a device selector, according to examples of the present disclosure.
FIG. 14 schematically shows an example implementation in which one or more functions of the intelligent assistant computing system are activated upon detection of one or more spoken keywords.
FIG. 15 schematically shows an example implementation of a multi-device environment in which sensor(s) and output device(s) are selected in response to voice activation, according to examples of the present disclosure.
FIG. 16 schematically shows a computing system according to examples of the present disclosure.
Detailed Description
FIG. 1 depicts an example use environment of an intelligent assistant computing system 100. In this example, a first person 120 introduces a second person 122 to the computing system by speaking aloud a phrase that is captured by computing system 100 via a microphone. For example, FIG. 1 depicts first person 120 speaking the phrase "Hey Computer, this is my friend Tom", referring to second person 122. This introduction by first person 120 may be used to initiate registration of second person 122 with computing system 100.

Registering users with traditional computing systems can be cumbersome and frustrating for some users. Often, the task of administering users involves navigating non-intuitive menus and settings of a computer program within a graphical user interface. The natural language interface supported by the intelligent assistant computing systems disclosed herein enables users to register new users by introducing those new users to the computing system through an intuitive person-to-person style introduction. For example, a new person may be registered by announcing the new person's name and/or relationship status with a registered person to the computing system via a spoken phrase. In this manner, users may converse with the computing system in an intuitive way that more closely resembles human-based interaction.

In this example, first person 120 is registered with computing system 100, and may be referred to as a registered person or user with respect to that computing system. For example, first person 120 may be an owner, primary user, or administrative user of computing system 100 who has previously engaged in a registration operation with respect to the computing system. Persons registered with computing system 100 may obtain additional privileges with respect to the computing system, as will be described in further detail herein. Conversely, second person 122 is initially unregistered with computing system 100, and may be referred to as an initially unregistered person or user with respect to that computing system. For example, second person 122 may be a guest visiting first person 120 at a location 130 monitored by computing system 100. In this example, location 130 is a living room within the residence of first person 120.
Intelligent assistant computing system 100 includes one or more computing devices that provide an intelligent assistant service. Accordingly, computing system 100 includes at least an intelligent assistant computing device 110, i.e., an intelligent assistant computer, that provides the intelligent assistant service. In at least some implementations, computing device 110 may take the form of an on-premises, all-in-one intelligent assistant computing device. Computing device 110 may include one or more graphical display devices, one or more audio speakers, one or more microphones, one or more cameras, etc., that are integrated with and/or located on-board the computing device or its enclosure.

However, in at least some implementations, computing device 110 may be one of multiple components of intelligent assistant computing system 100. For example, in addition to computing device 110, computing system 100 may include one or more other computing devices, graphical display devices, audio speakers, microphones, cameras, etc. FIG. 1 depicts examples of a graphical display device 112, audio speakers 114 and 116, and a camera 118 of computing system 100, which are located on-premises with respect to location 130 but are physically separate from computing device 110. Computing system 100 may include one or more computing devices located at different positions on the same premises and/or located remotely from the premises (e.g., cloud-based servers). Computing device 110 may be operatively connected with one or more other devices using wired and/or wireless connections. For example, computing device 110 may be communicatively coupled to one or more other computing devices, sensor devices, or controlled devices via a communications network using any suitable set of wired and/or wireless communications protocols.

As described in further detail herein, computing system 100 may be configured to detect the presence of persons within a monitored region, individually track the spatial positions of those persons, communicate with those persons, and individually identify those persons using image data captured via one or more cameras and/or audio data captured via one or more microphones, among other sensor inputs. Computing system 100 may be configured to receive and process natural language inputs, such as, for example, spoken phrases.

A person taking on the role of a user may utilize the intelligent assistant features supported by computing system 100 to achieve a wide range of functions. For example, the user may provide natural language input (e.g., a spoken command) to instruct computing system 100 to perform a variety of operations, such as providing an informational answer to a query, sending or presenting a communication message, presenting audio/video content, capturing and storing image or audio content, transferring an instance of a user session from one device to another device, or controlling other devices, to name a few. Some or all of these various operations may be associated with privileges that are not available to all users, such as, for example, unregistered persons. For example, the user may ask computing system 100 for information about a wide range of topics, such as the weather, personal calendar events, movie showtimes, etc. As another example, the user may control other devices via computing system 100, such as graphical display device 112, audio speakers 114 and 116, a gas fireplace 140, or motorized curtains 142. As yet another example, computing system 100 may be used to receive and store messages and/or reminders to be delivered at an appropriate future time.
FIG. 2 is a schematic diagram depicting an example intelligent assistant computing system 200 that provides an intelligent assistant service. Computing system 200 is a non-limiting example of computing system 100 of FIG. 1. Computing system 200 is capable of recognizing and responding to natural language inputs. As similarly described with reference to computing system 100 of FIG. 1, computing system 200 may be implemented as a single computing device or as two or more devices. Two or more devices of computing system 200 may be distributed at different locations on the premises to be served by the intelligent assistant service, and/or two or more devices of the computing system may be geographically distributed (e.g., in a cloud-supported network configuration).

Computing system 200 includes at least one sensor 220, an entity tracker 210, a voice listener 230, a parser 240, an intent handler 250, a commitment engine 260, and at least one output device 270. In some examples, sensors 220 may include one or more microphones 222, visible light cameras 224, infrared cameras 226, and connectivity devices 228 such as Wi-Fi or Bluetooth modules. In some examples, sensor(s) 220 may include stereoscopic and/or depth cameras, head trackers, eye trackers, accelerometers, gyroscopes, gaze detection devices, electric-field sensing components, GPS or other location tracking devices, temperature sensors, device state sensors, and/or any other suitable sensor.

Entity tracker 210 is configured to detect entities (including people, animals, or other living things, as well as non-living objects) and their activities. Entity tracker 210 includes an entity identifier 212 that is configured to recognize individual users and/or non-living objects. Voice listener 230 receives audio data and utilizes speech recognition functionality to translate spoken utterances into text. Voice listener 230 also may assign confidence value(s) to the translated text, and may perform speaker recognition to determine an identity of the person who is speaking, as well as assign probabilities to the accuracy of such identifications. Parser 240 analyzes text and confidence values received from voice listener 230 to derive user intentions and generate corresponding machine-executable language.

Intent handler 250 receives machine-executable language representing user intentions from parser 240, and resolves missing and ambiguous information to generate commitments. Commitment engine 260 stores commitments from intent handler 250. At a contextually appropriate time, the commitment engine may deliver one or more messages and/or execute one or more actions that are associated with one or more commitments. Commitment engine 260 may store messages in a message queue 262 or cause one or more output devices 270 to generate output. Output devices 270 may include one or more of speaker(s) 272, video display(s) 274, indicator light(s) 276, haptic device(s) 278, and/or other suitable output devices. In other examples, output devices 270 may include one or more other devices or systems, such as home lighting, thermostats, media programs, door locks, etc., that may be controlled via actions executed by commitment engine 260.

In different examples, voice listener 230, parser 240, intent handler 250, commitment engine 260, and/or entity tracker 210 may be embodied in software that is stored in memory and executed by one or more processors of a computing device. In some implementations, specially programmed logic processors may be utilized to increase the computational efficiency and/or effectiveness of the intelligent assistant computer.

In some examples, voice listener 230 and/or commitment engine 260 may receive context information, including associated confidence values, from entity tracker 210. As described in more detail below, entity tracker 210 may determine an identity, position, and/or current status of one or more entities within range of one or more sensors, and may output such information to one or more other modules, such as voice listener 230, commitment engine 260, etc. In some examples, entity tracker 210 may interpret and evaluate sensor data received from one or more sensors, and may output context information based on the sensor data. Context information may include the entity tracker's guesses/predictions as to the identity, position, and/or status of one or more detected entities based on received sensor data.
FIG. 3 is a flow diagram depicting an example method 300 for registering a person with an intelligent assistant computer. Method 300 may be performed by a computing system that includes an intelligent assistant computer, such as, for example, the previously described computing systems of FIGS. 1 and 2.

At 310, visual capture of an initially unregistered person may be performed. As described in further detail below, the computing system may compare facial recognition data extracted from image frames of persons observed by the computing system via one or more cameras to a database of previously observed persons to determine whether a person is registered or unregistered. If a person cannot be matched to a previously observed person, the computing system may establish a new person profile for the unrecognized person and identify that person as initially unregistered.

At 312, the method includes obtaining one or more image frames, captured via one or more cameras, that depict the initially unregistered person. The one or more image frames may form part of one or more video segments captured via the one or more cameras. The image frames, or video segments thereof, may capture the initially unregistered person present within a monitored region from multiple camera angles and/or over multiple visits by the initially unregistered person to that region. Accordingly, the one or more image frames may depict the initially unregistered person at different points in time spanning moments, minutes, hours, days, or other suitable time periods. The one or more image frames, or video segments thereof, captured at 312 may include visible light, infrared, and/or depth representations of the monitored region. The one or more image frames or video segments captured at 312 may be stored in a data storage system in raw and/or processed form, and may be subsequently retrieved from the data storage system for further processing by the computing system, or for later presentation and review by a user.

At 314, the method includes extracting facial recognition data of the initially unregistered person from the one or more image frames. The facial recognition data extracted at 314 may be stored in the data storage system. As an example, at 342, the facial recognition data extracted at 314 may be associated with the person profile established for the initially unregistered person. A person may be visually identified by the computing system by comparing facial recognition data extracted from image frames to a database of previously obtained facial recognition data stored in the data storage system, such as, for example, facial recognition data associated with person profiles. The database may be organized by registered and unregistered persons to enable the computing system to distinguish registered persons from unregistered persons. In at least some implementations, upon identifying or otherwise detecting the presence of an unregistered person through visual detection, the computing system may, at 316, output a notification to a registered person associated with the computing system. The notification may enable the registered person to review the image frames depicting the unregistered person and, if desired, initiate a registration operation for that unregistered person.
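The profile lookup described above can be sketched as follows. This is a minimal illustration only: it assumes face feature vectors have already been extracted (the disclosure does not specify a particular face-matching algorithm), and the cosine-similarity threshold and profile-id scheme are invented for the sketch.

```python
import math

MATCH_THRESHOLD = 0.8  # assumed similarity cutoff; implementation-specific


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def identify_or_create_profile(face_vector, profiles):
    """Compare a face feature vector against stored person profiles.

    profiles: dict mapping profile_id -> {"face": vector, "registered": bool}
    Returns (profile_id, is_new_profile).
    """
    best_id, best_score = None, -1.0
    for pid, profile in profiles.items():
        score = cosine_similarity(face_vector, profile["face"])
        if score > best_score:
            best_id, best_score = pid, score
    if best_id is not None and best_score >= MATCH_THRESHOLD:
        return best_id, False  # matched a previously observed person
    # Unrecognized: establish a new profile, marked initially unregistered.
    new_id = "person-%d" % (len(profiles) + 1)
    profiles[new_id] = {"face": face_vector, "registered": False}
    return new_id, True
```

A person who matches an existing profile keeps that profile; anyone else gets a fresh profile flagged as initially unregistered, which is the state the notification at 316 reports on.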
At 320, auditory capture of the initially unregistered person may be performed. As described in further detail below, the computing system may compare speaker recognition data, extracted from audio segments of the speech of persons observed by the computing system via one or more microphones, to a database of previously observed speech to determine whether a person is registered or unregistered. If a person cannot be matched to a previously observed person, the computing system may establish a new person profile for the unrecognized person and identify that person as initially unregistered.

At 322, the method includes obtaining one or more audio segments, captured via one or more microphones, that include one or more words or phrases spoken by the initially unregistered person. The audio segments may capture the initially unregistered person speaking within a monitored region from multiple microphone positions and/or over multiple visits by the initially unregistered person to that region. Accordingly, the one or more audio segments may capture speech originating from the initially unregistered person at different points in time spanning moments, minutes, hours, days, or other suitable time periods. The one or more audio segments captured at 322 may be stored in a data storage system in raw and/or processed form, and may be subsequently retrieved from the data storage system for further processing by the computing system, or for later presentation and review by a user.

At 324, the method includes extracting speaker recognition data of the initially unregistered person from the one or more audio segments. The speaker recognition data extracted at 324 may be stored in the data storage system. As an example, at 344, the speaker recognition data extracted at 324 may be associated with the person profile established for the initially unregistered person. A person may be audibly identified by the computing system by comparing speaker recognition data extracted from audio segments to a database of previously obtained speaker recognition data stored in the data storage system, such as, for example, speaker recognition data associated with person profiles. As previously described with reference to visual detection of persons, the database may be organized by registered and unregistered persons to enable the computing system to distinguish registered persons from unregistered persons.

Upon identifying or otherwise detecting the presence of an unregistered person through auditory and/or visual detection, the computing system may, at 316, output a notification to a registered person associated with the computing system, as previously described with reference to visual detection of persons. The notification may enable the registered person to review audio segments of the unregistered person's speech and/or video segments or image frames depicting the unregistered person and, if desired, initiate a registration operation for that unregistered person.

In at least some implementations, speaker recognition data for a person may be obtained by observing speech activity performed by that person within one or more video segments captured via the one or more cameras. As an example, speech activity of the initially unregistered person may be identified within one or more video segments. One or more audio segments, captured via the one or more microphones, that temporally match the one or more video segments may be obtained by the computing system. Speaker recognition data for the initially unregistered person may be extracted from the one or more audio segments based on one or more spoken words or phrases corresponding to the speech activity of the initially unregistered person depicted within the one or more video segments.
At 330, registration of the initially unregistered person is initiated by a registered person. At 332, the method includes receiving, via the one or more microphones, a spoken command to register the initially unregistered person. A spoken command to register a person may take a variety of forms, may vary across implementations, and may be user-defined through user settings. In the previous example of FIG. 1, registered person 120 speaking aloud the phrase "Hey Computer, this is my friend Tom" is a non-limiting example of a spoken command that may be used to register an initially unregistered person. One or more operations of the computing system may be initiated or otherwise activated upon detection of one or more keywords or key phrases spoken by a user. For example, the phrase "Hey Computer" may be used as a keyword phrase to initiate or activate one or more operations of the computing system, such as listening for one or more additional keywords or key phrases that further indicate one or more additional operations to be performed. In at least some examples, the computing system may require or otherwise rely on the term "register", or another suitable keyword or key phrase, to initiate a registration operation. In yet another example, the computing system may output an inquiry (e.g., visual and/or audible) to the registered person as to whether the unregistered person is to be registered. In response to such an inquiry, the spoken command may take an affirmative form, such as "yes" or "ok". Accordingly, in at least some implementations, the spoken command may take the form of a single word, a single phoneme, or a set of phonemes.

While a spoken command is described in this example, it will be understood that other suitable commands (i.e., non-spoken and/or non-audible commands) may be received using other forms of user input to initiate registration of a person. For example, user input indicating a command may be received via a hardware-based user interface, such as a touch screen, a keyboard or keypad, a computer mouse or other pointing device, or another suitable user input device.

At 334, the method includes determining that the spoken command, or other suitable command, originated from a registered person having a pre-established registration privilege. The pre-established registration privilege may be retrieved from, or otherwise referenced in, the registered person's person profile stored in the data storage system. The registered person may be identified through one or more of the visual, auditory, or other suitable techniques described in further detail herein. Additionally or alternatively, the registered person may be identified based on context, without relying on auditory or visual detection of that person. For example, a registered user may provide a non-spoken command via a hardware-based user interface through which the registered person has previously logged in to the computing system using user credentials (e.g., a username and/or password).

At 336, the method includes performing a registration operation to register the initially unregistered person as a newly registered person. The registration may be performed at 336 upon determining that the spoken command, or other suitable command, originated from a registered person having the pre-established registration privilege. A registration privilege associated with a particular person denotes that the person is permitted to register other persons with the computing system. In at least some implementations, a person profile may be established for the initially unregistered person at 340 in response to initiation of the registration operation. However, as previously described with respect to visual and auditory capture of the initially unregistered person at 310 and 320, image data and/or audio data may be associated with a person profile established for that person before the registration operation is initiated. In such implementations, a person profile may be established upon the computing system identifying a new person through visual or auditory identification and comparison to previously observed persons. A person profile may be established by assigning or associating a profile identifier 341 (e.g., an identifier that is unique within a particular domain of identifiers) with the person profile, enabling the person profile to be distinguished from other person profiles.

Registration may be performed by associating, at 346, one or more additional privileges with respect to the computing system with the person profile of the newly registered person. Depending on the implementation, image data including the facial recognition data and/or audio data including the speaker recognition data may be associated with the additional privileges in the person profile at the time of registration, before registration is initiated (e.g., if previously acquired for the then-initially-unregistered person), or after registration (e.g., upon acquisition), as will be described in further detail below.
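Steps 334, 336, and 346 above can be sketched as a small privilege check plus profile update. The privilege names (`register_others`, `ask_weather`, etc.) and the profile structure are invented for illustration; the disclosure leaves the concrete privilege sets implementation-defined.

```python
from dataclasses import dataclass, field


@dataclass
class PersonProfile:
    profile_id: str                      # unique within this system's domain of identifiers
    privileges: set = field(default_factory=set)
    face_data: object = None             # facial recognition data, form unspecified
    speaker_data: object = None          # speaker recognition data, form unspecified


BASIC_PRIVILEGES = {"ask_weather"}       # assumed base set for unidentified persons


def register_person(command_speaker, target, additional_privileges):
    """Perform the registration operation of step 336.

    Succeeds only if the command originated from a person holding the
    pre-established registration privilege (step 334); on success, the
    additional privileges are associated with the target's profile (step 346).
    Returns True if registration was performed.
    """
    if "register_others" not in command_speaker.privileges:
        return False
    target.privileges |= BASIC_PRIVILEGES | additional_privileges
    return True
```

A command from a person without the registration privilege leaves the target profile unchanged, which mirrors the method's gating of step 336 on the determination at 334.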
In at least some implementations, some or all of the image frames obtained at 312 may be captured after the spoken command to register the initially unregistered person is received. As an example, the initially unregistered person may be guided to position his or her face within the field of view of the one or more cameras to capture one or more image frames for facial recognition and extraction of facial recognition data. Additionally or alternatively, some or all of the one or more audio segments obtained at 322 may be captured after the spoken command to register the initially unregistered person is received. As an example, the initially unregistered person may be guided to speak one or more words or phrases within range of the one or more microphones of the computing system to capture one or more audio segments for speaker recognition and extraction of speaker recognition data.

Guiding the initially unregistered person for image or audio capture may include one or more of outputting audible guidance via an audio speaker and/or outputting visual guidance via a graphical display device. The computing system may guide the initially unregistered person in response to, or otherwise following, receipt of the spoken command from the registered person to register the initially unregistered person. For example, referring again to FIG. 1, one or more image frames of person 122 (i.e., "Tom"), captured by a camera (e.g., a camera of computing device 110 or camera 118), may be presented on graphical display device 112 to enable person 122 to position his face within the field of view of that camera. The computing system may provide feedback to the person by outputting visual and/or audible prompts for the person to move closer/farther, up/down, right/left, speak louder, repeat a word or phrase, etc.

In at least some implementations, the spoken command to register the initially unregistered person may be received after some or all of the one or more image frames depicting that person have been captured via the one or more cameras. In such implementations, the one or more image frames may be stored in a data storage device before the spoken command to register the initially unregistered person is received. The one or more image frames may be retrieved from the data storage system and presented via a graphical display device for review by the registered person. For example, referring again to FIG. 1, one or more image frames of person 122 (i.e., "Tom") may be captured by a camera (e.g., a camera of computing device 110 or camera 118) before person 120 provides the spoken command to register person 122.

In at least some implementations, the spoken command to register the initially unregistered person may be received after some or all of the one or more audio segments containing words or phrases spoken by that person have been captured via the one or more microphones. In such implementations, the one or more audio segments may be stored in a data storage device before the spoken command to register the initially unregistered person is received. The one or more audio segments may be retrieved from the data storage system and presented via an audio speaker for review by the registered person. For example, referring again to FIG. 1, audio segments containing words or phrases spoken by second person 122 may be output via audio speakers 114 and 116 for review by person 120 before person 120 provides the spoken command to register person 122.

The one or more image frames and/or audio segments of the initially unregistered person may be presented after the initially unregistered person has left the field of view of the one or more cameras, or after the unregistered person has left the region monitored by the computing system. For example, the registration process may be initiated by the registered person, following presentation of the image frames and/or audio segments of the initially unregistered person, moments, minutes, hours, days, or other suitable time periods after the initially unregistered person has left the monitored region.

The spoken command may be received during or after presentation of the one or more image frames and/or audio segments of the initially unregistered person. For example, referring again to FIG. 1, person 120 may provide the spoken command while reviewing image frames of second person 122 on graphical display device 112, regardless of whether second person 122 is still present within location 130. In at least some implementations, the one or more image frames may be presented in response to another command (such as a spoken or other command) initiated by the registered person that instructs the computing system to present images or audio of the unregistered person captured by the computing system.
At 350, the newly registered person initiates an operation. In this example, the operation is initiated by a spoken command. For example, at 352, the method includes receiving, via the one or more microphones, a subsequent spoken command to perform one or more operations. At 354, the method includes determining, based on the speaker recognition data, that the subsequent spoken command originated from the newly registered person having the one or more additional privileges. As an example, the computing system may retrieve or otherwise reference speaker recognition data associated with some or all person profiles to identify the particular person from whom the spoken command originated.

At 356, the method includes performing an operation of the one or more operations permitted by the one or more additional privileges in response to the spoken command. Each privilege may permit one or more associated operations to be performed by the intelligent assistant service in response to a command originating from the person associated with that privilege. As an example, the one or more additional privileges associated with the newly registered person may permit one or more operations, not previously permitted prior to registration, to be performed by the intelligent assistant service in response to a spoken command originating from the newly registered person. In at least some implementations, unregistered users or unidentified persons may have an initial set of privileges (i.e., a base privilege set). Following registration, the newly registered user may provide a spoken command to, for example, turn on the lights in a room served by the computing system. In response to identifying the spoken command as originating from the newly registered person associated with a privilege permitting the lights to be turned on, the computing system may output a control signal to turn on the lights in the room. In this example, prior to registration (as an initially unregistered person), the newly registered person may not have had, within the base privilege set initially assigned to all unregistered users or unidentified persons, a privilege permitting the lights to be turned on by the computing system. As another example, a privilege identifier may indicate whether the newly registered person is permitted to register other initially unregistered persons (i.e., a registration privilege). In this example, the newly registered person may not have had a registration privilege within the base privilege set prior to registration.

The spoken command received at 332 may include, or form part of, a spoken phrase originating from the registered person that further includes a person identifier and/or a privilege identifier for the newly registered person. The person identifier may take the form of a name or nickname to be assigned to the person being registered. At 348, the method may further include associating the person identifier with the person profile of the newly registered person. At 346, the method may further include associating the privilege identifier with the person profile of the newly registered person.

The privilege identifier may be used to identify the one or more additional privileges to be associated with the person profile of the person being registered. The computing system may support one, two, or more privilege identifiers, each having its own respective set of privileges. Privilege identifiers may take a variety of forms depending on the implementation, and may be user-defined within user settings. As an example, privilege identifiers may take the form of relationship-based keywords or key phrases, such as "friend", "family", "guest", "colleague", "wife", "husband", "child", etc., where each keyword or key phrase refers to its own respective privilege set. As another example, privilege identifiers may take the form of a hierarchical set of keyword or key phrase values, such as "level 1", "level 2", "level 3", etc., where each keyword or key phrase again refers to its own respective privilege set.

After initially registering with the computing system, a person may remain registered for a predefined duration, after which the registration may optionally be terminated by the computing system. As an example, the predefined duration may be minutes, hours, days, years, or indefinite. Additionally or alternatively, the predefined duration may be based on the continued presence of the registered person within a region monitored by the computing system. For example, a person may remain registered until the person leaves the premises, or until the person leaves the premises and does not return to the monitored region within a threshold time period of leaving. The predefined duration and/or threshold time period may be set or otherwise defined through user settings maintained by the computing system. The user settings of a newly registered person may be available to another registered person associated with the registration privilege or other types of privileges. User settings may be available to any registered person with respect to the predefined duration associated with his or her own registration, enabling that person to terminate registration with the computing system. Upon termination of a person's registration, that person's profile and associated data may be deleted, made inaccessible, overwritten, or made available to be overwritten by new data. In this manner, the lifespan of information associated with a person's registration may be controlled, and its further dissemination or use may be limited.
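A relationship-based privilege identifier of the kind described above reduces to a lookup from keyword to privilege set. The privilege names and set contents below are invented for the sketch, since the disclosure leaves the concrete privilege sets implementation-defined; only the keyword-to-set mapping idea comes from the text.

```python
# Hypothetical privilege sets keyed by the relationship keyword spoken in the
# registration phrase (e.g., "this is my friend Tom" -> "friend").
PRIVILEGE_SETS = {
    "guest":  {"ask_weather"},
    "friend": {"ask_weather", "play_media", "control_lights"},
    "family": {"ask_weather", "play_media", "control_lights",
               "read_messages", "register_others"},
}


def privileges_for(privilege_identifier):
    """Resolve a privilege identifier to its privilege set.

    Unknown identifiers fall back to the most restrictive set, on the
    assumption that privileges should fail closed.
    """
    return set(PRIVILEGE_SETS.get(privilege_identifier, PRIVILEGE_SETS["guest"]))
```

A hierarchical scheme ("level 1", "level 2", ...) would use the same structure with different keys.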
FIG. 4 depicts a timeline of an example implementation in which registration of an initially unregistered person is initiated and performed after image and/or audio data of that person has been captured by the intelligent assistant computing system. At 410, image data and/or audio data of the unregistered person is captured by the computing system over one or more visits by the unregistered person to a monitored region. For example, the computing system may perform the previously described operations of FIG. 3 associated with visual capture of the initially unregistered person at 310 to obtain image data, and/or auditory capture of the initially unregistered person at 320 to obtain audio data. The image data may include one or more images, one or more video segments, and/or facial recognition data derived therefrom. The audio data may include one or more audio segments and/or speaker recognition data derived therefrom.

At 412, a registered person (e.g., having the registration privilege) initiates registration of the initially unregistered person, and registration is performed by the computing system based on the previously captured image data and/or audio data. For example, the computing system may perform the previously described operations of FIG. 3 associated with the registration operation at 330 and the user profile at 340. Following registration, the newly registered person is granted one or more additional privileges, as indicated at 414. Furthermore, following registration, the computing system may optionally obtain supplemental image data and/or audio data of the newly registered person, as indicated at 416. At 418, the newly registered person initiates an operation permitted by the privileges granted to that person.

Supplemental image data obtained after registration may include one or more image frames of the newly registered person in addition to image frames captured before registration, or in cases where one or more audio segments were captured before registration without any previously captured image frames of the person being registered. For example, auditory detection of the newly registered person's speech may be used, after registration, to visually identify the speaker and to capture image frames of that person, and facial recognition data derived therefrom. The supplemental image data may be used to extract facial recognition data and/or to further refine or update the facial recognition data for that person. In at least some implementations, the supplemental image data may include images or video segments obtained from third-party sources accessible to the computing system over a communications network, such as social media services, photo or video libraries, etc.

Supplemental audio data obtained after registration may include one or more audio segments capturing the newly registered person's speech in addition to audio segments captured before registration, or in cases where image frames were captured before registration without any previously captured audio segments of the person being registered. For example, visual detection of the newly registered person's speech activity may be used, after registration, to identify the speaker and to capture audio segments of that person, and speaker recognition data derived therefrom. The supplemental audio data may be used to extract speaker recognition data and/or to further refine or update the speaker recognition data for that person. In at least some implementations, the supplemental audio data may include audio segments (individually, as well as those associated with video segments) obtained from third-party sources accessible to the computing system over a communications network, such as social media services, photo or video libraries, etc.

FIG. 5 depicts a timeline of another example implementation in which registration of an initially unregistered person is initiated, and image and/or audio data of that person is captured by the intelligent assistant computing system as part of the registration operation. At 510, a registered person (e.g., having the registration privilege) initiates registration of the initially unregistered person. At 512, the computing system optionally guides the initially unregistered person with respect to capturing one or more image frames of the person's face for facial recognition, and/or one or more audio segments of the person's voice or speech for speaker recognition. At 514, image frames and/or audio segments are captured, and facial recognition data and/or speaker recognition data are obtained therefrom. At 516, the computing system performs the registration operation based on the captured image frames and/or audio segments of the person being registered. Following registration, the newly registered person is granted one or more privileges, as indicated at 518. At 520, supplemental image data and/or supplemental audio data may optionally be obtained to extract, refine, or update the facial recognition data and/or speaker recognition data. At 522, the newly registered person initiates an operation permitted by the privileges granted to that person.

While the timelines of FIGS. 4 and 5 depict different implementations, it will be understood that these implementations may be combined into a workflow that includes capturing image data and/or audio data of a person before registration, capturing image data and/or audio data of the person as part of the registration operation, and optionally capturing supplemental image data and/or audio data of the person after registration. With each new image frame or audio segment that is captured, the facial recognition data and/or speaker recognition data may be refined and improved to provide more accurate detection and identification of the person.
Referring again to FIG. 2, additional description of the components of intelligent assistant computing system 200 will now be provided. In some examples, voice listener 230 may receive audio data from the surrounding environment. In some examples, such as in computing device 110 of FIG. 1, voice listener 230 may comprise a software module that is embodied in a standalone device that includes one or more microphones. In other examples, the voice listener 230 software module may be stored in memory of a computing device that is located remotely from the user's environment, such as in a cloud-based service. In some examples, voice listener 230 may receive and utilize additional data from one or more other sensors in performing its functions, which are described in more detail below. Voice listener 230 may include speech recognition functionality that translates audio data of spoken utterances into text. As described in more detail below, voice listener 230 also may assign confidence value(s) to one or more portions of translated text, such as individual speech components, words, phrases, etc.

Referring now to FIG. 6, in some examples the voice listener 630 may comprise a speech recognition program 620 stored in non-volatile storage 622 of a computing device 624. Speech recognition program 620 may be loaded into memory 626 and executed by a processor 628 of computing device 624 to perform one or more of the methods and processes for speech recognition described in more detail below.

Audio input 630 in the form of natural language speech may be captured by microphone 625 and processed by an audio processor 634 to create audio data. The audio data from audio processor 634 may be transformed by a feature extractor 636 into data for processing by a speech recognition engine 640 of speech recognition program 620. In some examples, feature extractor 636 may identify portions of the audio data over a time interval that contain speech for processing. Feature extractor 636 may extract feature vectors 642 from such portions of the data, with a feature vector representing the qualities of a spoken utterance within the time interval of a given portion. A matrix of multiple feature vectors 642 may be provided to speech recognition engine 640 for further processing.

Feature extractor 636 may utilize any suitable dimensionality-reduction technique to process the audio data and generate feature vectors 642. Example techniques include using mel-frequency cepstral coefficients (MFCCs), linear discriminant analysis, deep neural network techniques, etc.

Speech recognition engine 640 may compare the feature vectors 642 generated by feature extractor 636 with acoustic models for speech sounds (e.g., speech components). Examples of speech components may include phonemes, phones, diphones, triphones, etc. In some examples, speech recognition engine 640 may comprise an acoustic representation generator 644 (e.g., an acoustic modeler) that evaluates the similarity of a spoken utterance represented by one or more feature vectors 642 to acoustic models of language sounds. The acoustic models may comprise data that matches pronunciations of speech components, such as phonemes, to particular words and/or phrases.

Speech recognition engine 640 also may compare the feature vectors and other audio data with sequences of sounds to identify words and/or phrases that match the spoken sounds of the audio data. Speech recognition program 620 may comprise a language representation generator 646 (e.g., a language modeler) that may utilize language models to evaluate the likelihood that a particular word would be included at a particular location in a phrase (which, in some cases, may comprise a sentence). For purposes of this disclosure, a phrase may include two or more words that may or may not be considered a complete sentence.

In some examples, speech recognition engine 640 may utilize Hidden Markov Models (HMMs) to match feature vectors 642 with phonemes and/or other speech components. An HMM outputs sequences of n-dimensional vectors, where n is an integer such as 10. Sequences may be generated at a given frequency, such as one sequence every 10 milliseconds.

Each state of an HMM may comprise a statistical distribution that is a mixture of diagonal covariance Gaussians, which may indicate a likelihood for each observed vector. Each phoneme or word may have a different output distribution. Individual HMMs for separate phonemes and words may be combined to create an HMM for sequences of phonemes or words.

Context dependency for phonemes may be provided by different states of an HMM. Such context-dependent HMM states may be associated with a model, such as a Gaussian Mixture Model (GMM). In some examples, transitions between states may be assigned probabilities that correspond to a likelihood that a current state may be reached from a previous state. Different paths between states of the HMM may represent the inputted sounds, with the different paths representing multiple possible text matches for the same sound.

Using the feature extractor 636 and speech recognition engine 640, speech recognition program 620 may process feature vectors 642 and other speech recognition data 648 to generate recognized text 666. In other examples, any suitable techniques for matching feature vectors 642 to phonemes and/or other speech components may be utilized.

In some examples, speech recognition program 620 may determine estimated confidence values 652 for one or more portions of speech recognition data 648, such as individual speech components, words, and phrases. An estimated confidence value 652 may define a statistical likelihood that the corresponding recognized text is accurate. As described in more detail below, parser 240 of intelligent assistant computing system 200 may utilize such confidence values 652 in processing recognized text and determining a user's intent.

In different examples, confidence values 652 may be determined by utilizing one or more statistical analysis methods, machine learning techniques, empirically derived data, and combinations of the foregoing. In some examples, speech recognition program 620 may utilize one or more probabilistic models to analyze portions of speech recognition data 648, one or more results extracted from the speech recognition analysis pipeline, and/or estimated confidence values 652 associated with such portions. For example, GMMs may be utilized to analyze portions of speech recognition data 648 and corresponding results. It will be appreciated that any other suitable machine learning techniques, such as various supervised learning and unsupervised learning approaches, may be utilized to analyze speech recognition data 648.

It will be appreciated that the foregoing descriptions of speech recognition techniques are merely examples, and that any suitable speech recognition technologies and processes may therefore be utilized and are contemplated within the scope of this disclosure.
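The HMM decoding idea above, in which different state paths represent competing text matches for the same sound and transitions carry probabilities, reduces to finding the most probable state path for an observation sequence. The following is a minimal Viterbi decoder over a toy two-state HMM with invented probabilities; it illustrates the path-search principle only, not the disclosure's actual recognizer, which operates on feature vectors and Gaussian mixtures rather than discrete symbols.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for an observation sequence."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               predecessor state at time t-1)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        row = {}
        for s in states:
            prob, prev = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
            row[s] = (prob, prev)
        best.append(row)
    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    return list(reversed(path))


# Toy two-state HMM with invented probabilities (for illustration only).
states = ("A", "B")
start = {"A": 0.6, "B": 0.4}
trans = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit = {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}}
```

In a recognizer, the states would be context-dependent phoneme states and the emission term would come from the GMM likelihood of each feature vector.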
Referring again to FIG. 2, in some examples voice listener 230 may receive context information, including associated confidence values, from entity tracker 210. As described in more detail below, entity tracker 210 may determine an identity, position, and/or current status of one or more entities within range of one or more sensors, and may output such information to one or more other modules, such as voice listener 230, commitment engine 260, etc. In some examples, entity tracker 210 may interpret and evaluate sensor data received from one or more sensors, and may output context information based on the sensor data. Context information may include the entity tracker's guesses/predictions as to the identity, position, and/or status of one or more detected entities based on received sensor data. In some examples, the guesses/predictions may additionally include confidence values defining the statistical likelihood that the information is accurate.

Continuing with FIG. 2, voice listener 230 may send recognized text and corresponding confidence values to parser 240. As described in more detail below, parser 240 analyzes the text and confidence values to determine the intent of the user in speaking the received utterance. Parser 240 may translate the natural language text received from voice listener 230 into a machine-executable language that represents the user's intention underlying the natural language.

In some examples, the user's intention may correspond to a command to be executed immediately, such as the utterance "Play song A by artist B" (a "play music" intent). In some examples, an intent may be characterized as a commitment to execute an action upon the occurrence of a trigger, hereinafter referred to as an "add commitment" intent. For example, the utterance "When Bob gets home remind him to take out the trash" is an add commitment intent. In this example, the trigger is Bob arriving home, and the action is to remind him to take out the trash. Another example of an add commitment intent may be the utterance "When Keith is near the oven, alert me". In this example, the commitment of this add commitment intent comprises a trigger (Keith being near the oven) and an action (alert me) to be executed when the trigger is detected. Additional descriptions and examples of commitments are provided below.

In some examples, parser 240 may utilize a plurality of intent templates that each contain a plurality of slots that may be filled with words or terms received from voice listener 230, or with words or terms that are based on other words received from the voice listener. In some examples where one or more slots are not filled, parser 240 may fill these slots by examining the semantic meaning of one or more other words. For example, intelligent assistant computing system 200 may tell a user, "You have 15 emails." The user may respond with the utterance, "Okay, I'll go through them once I'm in the car." In response to the user's utterance, parser 240 may fill a "commitment type" slot with the type "reminder", even though the word "reminder" itself was not in the user's utterance.

Taken together, the plurality of slots of an intent template define, or otherwise characterize, the intent of the user in speaking an utterance. In various different examples, the slots may comprise an action slot, a trigger slot, a commitment slot, a subject slot, a content slot, an identity slot, and various other types of slots. In some examples, each slot may embody one of three states: (1) missing information, (2) information present with unresolved ambiguity, and (3) information present with any ambiguity resolved.
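The three slot states above map naturally onto a small data structure. The type names and the readiness rule below are invented for illustration; the disclosure does not prescribe a representation.

```python
from dataclasses import dataclass, field
from enum import Enum


class SlotState(Enum):
    MISSING = 1      # no information
    AMBIGUOUS = 2    # information present, ambiguity unresolved
    RESOLVED = 3     # information present, ambiguity resolved


@dataclass
class Slot:
    name: str
    value: object = None
    state: SlotState = SlotState.MISSING


@dataclass
class IntentTemplate:
    intent_type: str
    slots: dict = field(default_factory=dict)

    def ready(self):
        """A template can yield a commitment once every slot is resolved."""
        return all(s.state is SlotState.RESOLVED for s in self.slots.values())


# "When Keith is near the oven alert me" -> add-commitment template
template = IntentTemplate("add_commitment", {
    "trigger": Slot("trigger", "Keith near oven", SlotState.AMBIGUOUS),  # "near" is relative
    "action":  Slot("action", "alert", SlotState.RESOLVED),
    "subject": Slot("subject", "me", SlotState.AMBIGUOUS),               # who is "me"?
})
```

The resolvers described later in the text are the components that move slots from states (1) and (2) into state (3).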
In some examples, one or more slots may be optional slots that need not be filled. For example, in one scenario two slots may represent optional information, while in another scenario the same two slots may represent required information. For example, the utterance "Play music" may be understood as a command that music should be played out of the device being used for this conversation. In this manner, the system infers information regarding the user's intention (to play music via the device being used for the conversation) without requiring the user to explicitly state this information. In a different example, the utterance "Whenever it's Eve's birthday, play Happy Birthday" would require the user to specify the device to be used, since the play music action is scheduled to be performed at some point in the future whenever the specified condition is met.

One example of an intent template is a commitment intent template that corresponds to an add commitment intent. Referring now to FIG. 7, one example of a commitment intent template 700 is illustrated. In this example, the parser may receive, from voice listener 230, a text phrase 710 that reads "When Keith is near the oven alert me". The phrase "When Keith is near the oven" may be identified as a trigger 714. The phrase "alert me" may be identified as an action 718 that will be carried out when the trigger is detected. As described in more detail below, in some examples parser 240 may translate this text phrase 710 into machine-executable language that is passed to intent handler 250 for further processing.

As noted above, parser 240 may receive accuracy confidence values from voice listener 230 that denote a likelihood that the corresponding text is accurate. In some examples, and as described in more detail below, intent handler 250 also may receive entity confidence values that are associated with entity information. In some examples, such entity confidence values and other context information may be received via entity tracker 210.

In the present example, the word "me" in phrase 710 fills a subject slot 722. In this example, subject slot 722 corresponds to the person or other entity to be alerted when the trigger is detected. The word "me" may be received by the parser along with context information that associates this word with a particular person named Joe, and with an entity confidence value, such as 90%, that denotes a level of certainty that "me" is the person "Joe".

In some examples, the intended meaning of one or more words in an intent template may not be readily apparent. For example, in phrase 710 the meaning of the word "near" may be ambiguous, as "near" is a relative term. A variety of contextual factors may influence the intended meaning of "near" and the corresponding distance contemplated in this phrase. For example, where "Keith" is an infant, the intended meaning of "near" may be based on important safety concerns of the user speaking the phrase. Where "Keith" is the user's husband, the intended meaning of "near" may be influenced less by safety concerns and more by convenience factors, which may lead to an associated distance that is different from the case where "Keith" is an infant. In another example, the distance intended to be conveyed in the phrase "near the oven" is likely different from the distance intended to be conveyed in the phrase "near the Statue of Liberty".

Accordingly, one or more words in an intent template may be ambiguous when passed to intent handler 250. As described in more detail below, intent handler 250 may utilize a plurality of techniques to resolve ambiguities and to fill in slots of the intent template with missing information.

In another example, parser 240 may receive the text phrase "Play music with Fred" from voice listener 230. In some examples, the phrase "Play music" is often interpreted to mean that a user wants to play digital music files via a media player. However, the use of the phrase "with Fred" after "Play music" is unusual, as people typically would not use this phrasing when their intent is to play music via a media player. Parser 240 may recognize this ambiguity and may generate a list of N-best intent templates that it determines to be the statistically most probable intent templates corresponding to the user's actual intent. In some examples, intent handler 250 may use additional context information to select an intent template from the list of N-best intent templates.

In another example, the text phrase received from voice listener 230 may be the single word "Play". For example, the one or more words spoken by the user after "Play" may have been unintelligible to the voice listener for one or more reasons, such as loud noises in the background. In this example, parser 240 may predict that the user's intent is to play digital music, but in the corresponding intent template the content slot that represents what music is to be played is empty. In this example, parser 240 may send a "play music" intent template to intent handler 250 for further processing and resolution of this ambiguity, as described in more detail below.

In some examples, parser 240 may analyze received text to form a decision tree of the user's intent. In some examples, parser 240 may generate If-Then statements (or rules) from the received text. Each If-Then statement may comprise a corresponding trigger and an action. Whenever the conditions of the trigger are satisfied, the action is performed. The resulting If-Then statements can perform a wide variety of tasks, such as home security ("text me if the motion detector in the backyard is activated"), home automation ("turn on the fireplace when I arrive home"), personal organization ("collect my email receipts for charitable donations into a spreadsheet"), health-related tasks ("remind me to eat protein if I run more than 7 miles"), and many others.

In some examples, triggers and actions may be drawn from a range of channels that may be activated by a user. These channels may represent different entities and services, including devices (such as smart phone operating systems and connected home components such as smart light switches), knowledge sources (such as entertainment websites, email providers, etc.), and the like. Each channel may expose a set of functions for both triggers and actions.

For example, If-Then statements may take the form of "IF [Input(s)] are recognized, THEN perform [Action(s)]". For example, the received phrase "When Oz is in the kitchen, tell him to take out the garbage" may be translated into the following If-Then statement: "IF the person Oz is determined to be in the kitchen, THEN broadcast a message to the person Oz to take out the garbage." In some examples, parser 240 may determine, based on parsing a received utterance, that the user intends to establish a recurring message or action. For example, in the phrase "When Oz is in the kitchen, tell him to take out the garbage", the word "when" may be interpreted by parser 240 to designate that the corresponding action should be performed each time the condition is met (i.e., each time Oz is in the kitchen, tell him to take out the garbage). In another example, in the phrase "If Oz is in the kitchen, tell him to take out the garbage", the word "if" may be interpreted to designate that the corresponding action should be performed one time only (i.e., the next time Oz is in the kitchen, tell him to take out the garbage).
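The "when" versus "if" distinction just described can be captured in a tiny rule object. This is a sketch with an invented representation and a deliberately crude recurrence test on the leading word; the disclosure's actual machine-executable language is not specified.

```python
from dataclasses import dataclass


@dataclass
class IfThenRule:
    trigger: str        # condition to watch for
    action: str         # what to do when the trigger fires
    recurring: bool     # "when ..." -> every time; "if ..." -> one time only
    spent: bool = False

    def fire(self, condition_met):
        """Return the action if the rule fires, else None."""
        if not condition_met or self.spent:
            return None
        if not self.recurring:
            self.spent = True  # one-shot rules are consumed on first firing
        return self.action


def parse_rule(utterance, trigger, action):
    """Crude recurrence detection from the leading word of the utterance."""
    recurring = utterance.lower().startswith(("when", "whenever"))
    return IfThenRule(trigger, action, recurring)
```

A "when" rule keeps firing each time its condition is met; an "if" rule fires once and is then consumed, matching the one-time interpretation described above.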
In some examples, and as noted above, these If-Then statements may be generated probabilistically. In this manner, and for a given text string, parser 240 may generate a plurality of N-best candidate If-Then statements that may correspond to the user's utterance.

In some examples of parsing If-Then rules, parser 240 may utilize a meaning representation that comprises an abstract syntax tree (AST) over a very simple language. For example, each root node may expand into a "trigger" and "action" pair. These nodes in turn expand into sets of supported triggers and actions. These trees may be modeled as a nearly context-free grammar that generates If-Then tasks.

In some examples, parser 240 may use a combination of two techniques to generate If-Then statements and/or derive an intent from the text received from voice listener 230: (1) a recurrent neural network (RNN) architecture in the form of a long short-term memory (LSTM) network, and (2) a logistic regression model. In some examples, a graph long short-term memory (graph LSTM) neural network may be utilized to extract, from received text, semantic meanings and relationships between words that are inherent to natural language. For example, text may be parsed using a graph LSTM neural network to extract cross-sentence n-ary relationships using several graph LSTM units arranged according to the syntactic relations of terms in a segment of text. These syntactic relationships between words may be tracked in the graph LSTM neural network to allow artificial intelligence and machine learning techniques to identify entities and their context within the text, and the grammatical structure in which they exist. For example, context that identifies the nouns to which pronouns refer, the adverbs that modify given verbs, the prepositional phrases that affect given words, etc., may be incorporated into the various words to enable more accurate searches of the contents of natural language documents.

In some examples, parser 240 may receive and process text to graph nodes (e.g., words, phrases, characters, etc.) and edges (e.g., dependency links between nodes) within individual phrases and across the boundaries of phrases. In various examples, graphing may include identifying one or more links (e.g., syntactic, semantic, co-reference, discourse, etc.) between nodes in the text. The links can include intra-phrase and inter-phrase links between nodes. For example, a link can represent a relationship between the root of one phrase and the root of an adjacent phrase. For another example, a link can represent a relationship between two words within a phrase, such as the modifier "Annie's" to the word "lunch".
As described above, in some examples parser 240 passes an intent template to intent handler 250 for further processing. Intent handler 250 comprises a multi-step pipeline that may resolve ambiguous information and/or information that is missing from an intent template. As described in more detail below, intent handler 250 may utilize a plurality of techniques to resolve ambiguities and fill in slots of missing information with respect to an intent template. In some examples, intent handler 250 may utilize domain-specific information and domain-specific reasoning to resolve ambiguities, complete missing information, and otherwise clarify an intent template to more closely correspond to the actual intent of the user.

In some examples, intent handler 250 may glean knowledge regarding the user's intent by analyzing prior utterances of the user in a conversation history, and may utilize such insights to resolve ambiguities and add missing information to an intent template. Once intent handler 250 has sufficiently clarified ambiguities and completed missing information, a corresponding commitment may be generated and passed to commitment engine 260 for execution.

Intent handler 250 may be configured to process multiple intent templates that may comprise a conversation. For purposes of the present disclosure, and as described in more detail below, a conversation may comprise a plurality of information and other data related to one or more exchanges between the user and intelligent assistant computing system 200. In different examples, such information and data may comprise words and/or phrases spoken by a user, queries presented to the user by intelligent assistant computing system 200, sensor data received from one or more sensors, context information such as person and/or identity information, etc.

As described in the use case examples provided below, intent handler 250 may comprise a plurality of resolvers that translate intent templates, and their associated data received from parser 240, into internal data references. To address slots that comprise missing and/or unresolved information in an intent template, intent handler 250 may utilize the plurality of resolvers in a multi-stage process. In some examples, each of the resolvers may be specifically programmed to handle issues associated with a particular intent template that may be received from parser 240.

Examples of resolvers may include lookup resolvers that translate proper names, aliases, and other identifiers into internal representation data (e.g., "Bob" is translated into an internal representation of the person "Bob", such as Bob's contact information). Examples of resolvers also may include anaphoric resolvers, which address expressions whose interpretation depends upon an antecedent or postcedent expression in context (e.g., "she" is translated into a slot representing "a personal identity of the pronoun 'she'"), and deixis resolvers, which address words and phrases that cannot be fully understood without additional contextual information, such as "here" or "there" (e.g., "there" may be translated into a slot representing "where is there?"). In other examples, many other forms and types of resolvers may be utilized.

Referring now to FIG. 8, one example of parser 240 and intent handler 250 processing a portion of a conversation is schematically illustrated. In this example, parser 240 parses a first phrase 1 into an intent template 1. Parser 240 provides intent template 1 to intent handler 250, which utilizes a first resolver 1 to resolve ambiguities and/or missing information in this intent template. A second intent template 2, corresponding to a second phrase 2, is received from parser 240. As described in more detail below, intent handler 250 may analyze intent template 2, along with context information 810, to determine whether to utilize first resolver 1 or second resolver 2 to resolve intent template 2. A third intent template 3, based on a third parsed phrase 3, may then be received by intent handler 250. Intent handler 250 may utilize a third resolver 3 to resolve intent template 3. Additional details and use case examples of analyzing intent templates with resolvers are provided below.

In some examples, intent handler 250 may determine whether two or more intent templates should be fused or merged together to continue with an existing conversation path. If intent handler 250 determines that the two or more intent templates should be fused together, then the intent handler may fuse the data associated with the two or more intent templates and continue following the existing conversation path using the fused data. If intent handler 250 determines that the two or more intent templates should not be fused together, then a new topic may be started using the most recently received intent template.

As described in more detail below, where a slot of an intent template has missing information, intent handler 250 may perform data gathering operations (such as asking the user to clarify or provide the information, or attempting to gather the information in another manner) in order to populate the slot with information. Once each slot contains information, intent handler 250 may determine whether the information in each slot is unambiguous. For information identified as ambiguous, intent handler 250 may apply one or more of a variety of techniques to resolve the ambiguity.

Referring again to FIG. 2, in some examples intent handler 250 may comprise a mapper 252 that maps one or more system goals to corresponding user intent(s). Examples of system goals may include clarifying ambiguities, acquiring additional information from a user, etc. In some examples, mapper 252 may internally rephrase system goals as user intents or goals. For example, mapper 252 may map information the system needs, such as information to resolve an ambiguous intent, to the user intent that the user would have triggered in providing that information. In other words, mapper 252 may map the information to the intent that would have been resolved from an utterance that a user would have spoken in order to generate that intent. In some examples, mapper 252 may map a system goal to a word or phrase that the user would have said to generate the same outcome.

In some examples, where the system needs information from a user to resolve a user intent, the system may internally cue a state that is equivalent to the state the system would have been in if the user had provided input (such as an utterance) containing all the components of the intent except for the needed information. In other words, and in some examples, the system may assume that the user has provided more input, with that input missing only one or more specific slots corresponding to the needed information. In this manner, intent handler 250 may continually utilize whatever user input is provided. In some examples, this allows the system to reuse components such as intent templates. Accordingly, and in these examples, by causing intent handler 250 to assume that user intents (as opposed to system goals) are driving its operation, the system may internally reuse corresponding logic and may understand such user intents with greater depth and richness.

In some examples, the system may have a goal of acquiring information from a user in order to proceed with deriving a user intent. In a first example, a user may speak two utterances: "Book me a flight to California tomorrow; the flight needs to be to San Francisco." In the first utterance, the user indicates an intent to book a flight, and in the second utterance the user narrows the intent to a flight to San Francisco. A user intent is specified in both utterances.

In another example, the user speaks a first utterance, "Book me a flight tomorrow." The system may respond with the query, "Where do you want to fly to?" The user may then respond, "To San Francisco." Upon generation of the system query, mapper 252 may map the intent handler's goal (acquiring information regarding the user's destination) to a user intent. For example, mapper 252 may assume that the user is about to provide this information as if it were the user's intent.

In some examples, by configuring mapper 252 to assume that user intents are driving its operation, the system may minimize the code needed to perform these operations and reuse corresponding logic. In this manner, the system may understand such user intents with greater depth and richness. Accordingly, in these examples, the system may utilize code for intent handler 250 and mapper 252 that comprises a user-intent-only system, as opposed to utilizing multiple specialized pieces of code to manage all ambiguities and otherwise handle multiple corresponding tasks and discrete situations.
FIG. 9 schematically illustrates an example entity tracker 210 that may comprise a component of intelligent assistant computing system 200. Entity tracker 210 may be used to determine an identity, position, and/or current status of one or more entities within range of one or more sensors. Entity tracker 210 may output such information to one or more other modules of intelligent assistant computing system 200, such as commitment engine 260, voice listener 230, etc.

The word "entity", as used in the context of entity tracker 210, may refer to people, animals, or other living things, as well as non-living objects. For example, an entity tracker may be configured to identify furniture, appliances, structures, landscape features, vehicles, and/or any other physical object, and to determine the position/location and current status of such physical objects. In some cases, entity tracker 210 may be configured to identify people only, and not other living or non-living things. In such cases, the word "entity" may be synonymous with the word "person".

Entity tracker 210 receives sensor data from one or more sensors 222, such as sensor A 902A, sensor B 902B, and sensor C 902C, though it will be understood that an entity tracker may be used with any number and variety of suitable sensors. As examples, sensors usable with an entity tracker may include cameras (e.g., visible light cameras, UV cameras, IR cameras, depth cameras, thermal cameras), microphones, pressure sensors, thermometers, motion detectors, proximity sensors, accelerometers, global positioning satellite (GPS) receivers, magnetometers, radar systems, lidar systems, environmental monitoring devices (e.g., smoke detectors, carbon monoxide detectors), barometers, health monitoring devices (e.g., electrocardiographs, sphygmomanometers, electroencephalographs), automotive sensors (e.g., speedometers, odometers, tachometers, fuel sensors), and/or any other sensors or devices that collect and/or store information pertaining to the identity, position, and/or current status of one or more people or other entities. In some examples, entity tracker 210 may occupy a common device enclosure with one or more of the plurality of sensors 220, and/or the entity tracker and its associated sensors may be distributed across multiple devices configured to communicate via one or more network communication interfaces (e.g., Wi-Fi adapters, Bluetooth interfaces).

As shown in the example of FIG. 9, entity tracker 210 may include an entity identifier 212, a person identifier 905, a position (location) identifier 906, and a status identifier 908. In some examples, person identifier 905 may be a specialized component of entity identifier 212 that is particularly optimized for recognizing persons, as opposed to other living and non-living things. In other cases, person identifier 905 may operate separately from entity identifier 212, or entity tracker 210 may not include a dedicated person identifier.

Depending on the specific implementation, any or all of the functions associated with the entity identifier, person identifier, position identifier, and status identifier may be performed by the individual sensors 902A-902C. Though the present description generally describes entity tracker 210 as receiving data from sensors, this does not require that entity identifier 212, as well as the other modules of the entity tracker, must be implemented on a single computing device that is separate and distinct from the plurality of sensors associated with the entity tracker. Rather, functions of entity tracker 210 may be distributed amongst the plurality of sensors. For example, rather than sending raw sensor data to the entity tracker, individual sensors may be configured to attempt to identify the entities that they detect, and report this identification to entity tracker 210 and/or other modules of intelligent assistant computing system 200. In some cases, this identification may include a confidence value.

Each of entity identifier 212, person identifier 905, position identifier 906, and status identifier 908 is configured to interpret and evaluate sensor data received from the plurality of sensors 220, and to output context information 910 based on the sensor data. Context information 910 may include the entity tracker's guesses/predictions as to the identity, position, and/or status of one or more detected entities based on received sensor data. As will be described in more detail below, each of entity identifier 212, person identifier 905, position identifier 906, and status identifier 908 may output their predictions/identifications along with confidence values.

Entity identifier 212 may output an entity identity 912 of a detected entity, and such an entity identity may have any suitable degree of specificity. In other words, based on received sensor data, entity tracker 210 may predict the identity of a given entity, and output such information as entity identity 912. For example, entity identifier 212 may report that a particular entity is a piece of furniture, a dog, a man, etc. Additionally or alternatively, entity identifier 212 may report that a particular entity is an oven with a particular model number; a pet dog with a particular name and breed; an owner or user of intelligent digital assistant computing system 200, the owner/user having a particular name and profile; etc. In some examples, the degree of specificity with which entity identifier 212 identifies/classifies detected entities may depend on one or more of user preferences and sensor limitations.

When applied to people, entity tracker 210 may in some cases collect information about individuals whom it cannot identify by name. For example, entity identifier 212 may record images of a person's face, and associate these images with recorded audio of the person's voice. Should the person subsequently speak to, or otherwise address, intelligent assistant computing system 200, entity tracker 210 will then have at least some information regarding with whom the intelligent assistant computing system is interacting. In some examples, intelligent assistant computing system 200 also may prompt people to state their names, so as to more easily identify those people in the future.

In some examples, intelligent assistant computing system 200 may utilize a person's identity to customize a user interface for that person. In one example, a user with limited visual capabilities may be identified. In this example, and based on this identification, a display of intelligent assistant computing system 200 (or of another device with which the user is interacting) may be modified to display larger text, or to provide a voice-only interface.

Position identifier 906 may be configured to output an entity position (i.e., location) 914 of a detected entity. In other words, position identifier 906 may predict the current position of a given entity based on collected sensor data, and output such information as entity position 914. As with entity identity 912, entity position 914 may have any suitable level of detail, and this level of detail may vary with user preferences and/or sensor limitations. For example, position identifier 906 may report that a detected entity has a two-dimensional position defined on a plane, such as a floor or a wall. Additionally or alternatively, the reported entity position 914 may comprise a three-dimensional position of the detected entity within a real-world, three-dimensional environment. In some examples, an entity position 914 may comprise a GPS position, a location within a mapping system, etc.

The reported entity position 914 for a detected entity may correspond to the entity's geometric center, a particular part of the entity classified as being important (e.g., the head of a person), a series of boundaries defining the borders of the entity in three-dimensional space, etc. Position identifier 906 may further calculate one or more additional parameters describing the position and/or orientation of a detected entity, such as pitch, roll, and/or yaw parameters. In other words, the reported position of a detected entity may have any number of degrees of freedom, and may include any number of coordinates defining the position of the entity in an environment. In some examples, an entity position 914 of a detected entity may be reported even if entity tracker 210 is unable to identify the entity and/or determine the current status of the entity.

Status identifier 908 may be configured to output an entity status 916 of a detected entity. In other words, entity tracker 210 may be configured to predict the current status of a given entity based on received sensor data, and output such information as entity status 916. "Entity status" can refer to virtually any measurable or classifiable property, activity, or behavior of a given entity. For example, when applied to a person, the entity status of the person can indicate a posture of the person (e.g., standing, sitting, lying down), a speed at which the person is walking/running, a current activity of the person (e.g., sleeping, watching TV, working, playing a game, swimming, talking on the phone), a current mood of the person (e.g., as evaluated from the person's facial expression or tone of voice), biological/physiological parameters of the person (e.g., the person's heart rate, respiration rate, oxygen saturation, body temperature, neurological activity), whether the person has any current or upcoming calendar events/appointments, etc. "Entity status" can refer to additional/alternative properties or behaviors when applied to other living or non-living objects, such as a current temperature of an oven or a kitchen sink, whether a device (e.g., television, lamp, microwave) is powered on, whether a door is open, etc.

In some examples, status identifier 908 may use sensor data to calculate a variety of different biological/physiological parameters of a person. This may be done in a variety of suitable ways. For example, entity tracker 210 may be configured to interface with an optical heart rate sensor, a pulse oximeter, a sphygmomanometer, an electrocardiograph, etc. Additionally or alternatively, status identifier 908 may be configured to interpret data from one or more cameras and/or other sensors in an environment, and process the data in order to calculate a person's heart rate, respiration rate, oxygen saturation, etc. For example, status identifier 908 may be configured to utilize Eulerian magnification and/or similar techniques to amplify miniscule movements or changes captured by the cameras, thereby allowing the status identifier to visualize the flow of blood through a person's circulatory system and calculate associated physiological parameters. Such information may be used, for example, to determine when the person is asleep, working, in distress, experiencing health problems, etc.

Upon determining one or more of entity identity 912, entity position 914, and entity status 916, such information may be sent as context information 910 to any of a variety of external modules or devices, where it may be used in a variety of ways. For example, commitment engine 260 may use context information 910 to manage commitments and associated messages and notifications. In some examples, and as described in more detail below, commitment engine 260 may use context information 910 to determine whether a particular message, notification, or commitment should be executed and/or presented to a user. Similarly, voice listener 230 may utilize context information 910 when interpreting human speech or activating functions in response to a keyword trigger.

As noted above, in some examples entity tracker 210 may be implemented in a single computing device. In other examples, one or more functions of entity tracker 210 may be distributed across multiple computing devices working cooperatively. For example, one or more of entity identifier 212, person identifier 905, position identifier 906, and status identifier 908 may be implemented on different computing devices, while still collectively comprising an entity tracker configured to perform the functions described herein. As indicated above, any or all of the functions of the entity tracker may be performed by individual sensors 220. Further, in some examples entity tracker 210 may omit one or more of entity identifier 212, person identifier 905, position identifier 906, and status identifier 908, and/or include one or more additional components not described herein, while still providing context information 910.

Each of entity identity 912, entity position 914, and entity status 916 may take any suitable form. For example, each of entity identity 912, position 914, and status 916 may take the form of a discrete data packet that includes a series of values and/or labels describing the information gathered by the entity tracker. Each of entity identity 912, position 914, and status 916 may additionally include a confidence value defining a statistical likelihood that the information is accurate. For example, if entity identifier 212 receives sensor data that strongly indicates that a particular entity is a man named "John Smith", then entity identity 912 may include this information along with a corresponding, relatively high confidence value, such as 90% confidence. If the sensor data is more ambiguous, then the confidence value included in entity identity 912 may correspondingly be relatively lower, such as 62%. In some examples, separate predictions may be assigned separate confidence values. For example, entity identity 912 may indicate with 95% confidence that a particular entity is a man, and indicate with 70% confidence that the entity is John Smith. As described in more detail below, cost functions may utilize such confidence values (or probabilities) to generate cost calculations for providing messages or other notifications to a user and/or for performing action(s).

In some implementations, entity tracker 210 may be configured to combine or fuse data from multiple sensors in order to output more accurate predictions. As an example, a camera may locate a person in a particular room. Based on the camera data, entity tracker 210 may identify the person with a confidence value of 70%. However, entity tracker 210 may additionally receive recorded speech from a microphone. Based on the recorded speech alone, entity tracker 210 may identify the person with a 60% confidence value. By combining the data from the camera with the data from the microphone, entity tracker 210 may identify the person with a confidence value that is potentially higher than would be possible using the data from either sensor alone. For example, the entity tracker may determine that recorded speech received from the microphone corresponds to lip movements of the person visible to the camera when the speech was received, and thereby conclude with relatively high confidence, such as 92%, that the person visible to the camera is the person who is speaking. In this manner, entity tracker 210 may combine the confidence values of two or more predictions to identify a person with a combined, higher confidence value.
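One common way to realize fusion of this kind is a noisy-OR combination of independent per-sensor confidences. This is a sketch of that idea, not the disclosure's stated formula (none is given); the optional per-sensor reliability weight discounts less trustworthy sensor types before combining, foreshadowing the reliability weighting discussed next.

```python
def fuse_confidences(confidences, reliabilities=None):
    """Combine independent per-sensor identification confidences (each 0..1).

    Each confidence is first discounted by its sensor's reliability weight,
    then combined with a noisy-OR: the fused confidence is the probability
    that at least one (discounted) detection is correct.
    """
    if reliabilities is None:
        reliabilities = [1.0] * len(confidences)
    p_all_wrong = 1.0
    for conf, rel in zip(confidences, reliabilities):
        p_all_wrong *= 1.0 - conf * rel
    return 1.0 - p_all_wrong
```

With the camera at 70% and the microphone at 60%, the fused value is 1 - 0.3 x 0.4 = 0.88, higher than either sensor alone, which matches the qualitative behavior described in the text.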
In some examples, data received from various sensors may be weighted differently depending upon the reliability of the sensor data. This can be especially relevant when multiple sensors are outputting data that appears to be inconsistent. In some examples, the reliability of sensor data may be based, at least in part, on the type of data generated by the sensor. For example, in some implementations the reliability of video data may be weighted higher than the reliability of audio data, as the presence of an entity on camera may be a better indicator of its identity, position, and/or status than recorded sounds presumed to originate from that entity. It will be appreciated that the reliability of sensor data is a different factor than the confidence value associated with the predicted accuracy of an instance of data. For example, several instances of video data may have different confidence values based on the differing contextual factors present at each instance. Each of these instances of video data, however, may generally be associated with a single reliability value for video data.

In one example, data from a camera may suggest, with a 70% confidence value, that a particular person is in the kitchen, such as via facial recognition analysis. Data from a microphone may suggest, with a 75% confidence value, that the same person is in a nearby hallway, such as via voice recognition analysis. Even though the instance of microphone data carries a higher confidence value, entity tracker 210 may output a prediction that the person is in the kitchen, based on the camera data's higher reliability as compared to the reliability of the microphone data. In this manner, and in some examples, different reliability values for different sensor data may be used along with confidence values to reconcile conflicting sensor data and determine the identity, position, and/or status of an entity.

Additionally or alternatively, more weight may be given to sensors that have higher precision, more processing power, or otherwise greater capabilities. For example, a professional-grade video camera may have a significantly improved lens, image sensor, and digital image processing capabilities as compared to a basic webcam found in a laptop. Accordingly, because the video data received from the professional-grade camera is likely more accurate, such data may be given a higher weight/reliability value than webcam data.
Referring now to FIGS. 10-16, additional example implementations of an intelligent assistant computing system in a single computing device and across multiple computing devices are illustrated.

FIG. 10 shows an example of an all-in-one computing device 1000 in which the components implementing intelligent assistant computing system 200 are arranged together in a standalone device. In some examples, all-in-one computing device 1000 may be communicatively coupled to one or more other computing devices 1062 via a network 1066. In some examples, all-in-one computing device 1000 may be communicatively coupled to a data store 1064 that may store a variety of data, such as user profile data. All-in-one computing device 1000 includes at least one sensor 220, a voice listener 230, a parser 240, an intent handler 250, a commitment engine 260, an entity tracker 210, and at least one output device 270. Sensor(s) 220 include at least one microphone to receive natural language inputs from a user. In some examples, one or more other types of sensor(s) 220 also may be included.

As described above, voice listener 230, parser 240, and intent handler 250 work in concert to convert natural language inputs into commitments that are executable by the all-in-one device 1000. Commitment engine 260 stores such commitments in a commitment store. Entity tracker 210 may provide context information to commitment engine 260 and/or other modules. At a contextually appropriate time, commitment engine 260 may execute a commitment and provide output, such as audio signals, to output device(s) 270.

FIG. 11 shows an example implementation in which one or more remote services 1110 perform the natural language processing functionality of intelligent assistant computing system 200. In this example, voice listener 230, parser 240, intent handler 250, entity tracker 210, and commitment engine 260 reside on one or more computing devices, such as one or more servers, that are located remotely from a cloud-supported user device A. Sensor data from one or more sensors 220 of user device A is provided to the remote service(s) 1110 via a network. For example, audio data of a user speaking may be captured by a microphone of user device A and provided to voice listener 230.

As described above, voice listener 230, parser 240, and intent handler 250 cooperate to convert the audio data into commitments that are stored in commitment engine 260. At a contextually appropriate time, commitment engine 260 may execute a commitment and provide output, such as audio signals, to one or more output device(s) 270 of user device A.

FIG. 12 shows another example implementation in which one or more remote services 1110 perform the natural language processing functionality of intelligent assistant computing system 200. In this example, the one or more remote services are communicatively coupled with a plurality of different sensors and output devices. In this example, the sensors include individual standalone sensors A and C, such as microphones, cameras, etc. The output devices include individual standalone output devices B and D, such as loudspeakers.

The one or more remote services 1110 are also communicatively coupled to a device E that includes one or more sensors F and an output device G. Device E may take the form of a simple standalone device comprising a microphone, a loudspeaker, and network connectivity components. In other examples, device E may be a mobile phone, a tablet computer, a wall-mounted display, or another suitable computing device. In some examples, device E, sensors A and C, and output devices B and D may be part of the same cloud-supported client. In other examples, any number of individual sensors and devices may be utilized with the one or more remote services 1110.

As described above, the one or more remote services 1110 perform the natural language processing functionality of intelligent assistant computing system 200. In some examples, one or more of the remote services 1110 may include all of the natural language processing modules of intelligent assistant computing system 200. In other examples, one or more remote services 1110 may include fewer than all of the natural language processing modules, and may be communicatively coupled to the other modules located at one or more other services. In the present example, and as described in more detail below, one or more of the remote services 1110 also may comprise a device selector 1112 that may utilize sensor inputs to select output device B, D, and/or G to receive output from commitment engine 260.

Referring now to FIG. 13, in some examples the intelligent assistant computing system 200 of the present disclosure may utilize device selector 1112 to enable a user to communicate with another person whose location may be unknown to the user. In some examples, the system may use sensor data and/or corresponding context data to detect the presence, and determine the location, of the other person. Upon receiving a request from the user to speak to or locate the other person, device selector 1112 may select an appropriate output device to establish communication between the user and the other person.

In the example use case of FIG. 13, one or more remote services 1110 implementing intelligent assistant computing system 200 are communicatively coupled with a smart phone 1390 and a laptop computer 1392. In one example, smart phone 1390 comprises a plurality of sensors A, including a microphone, and an output device A in the form of a loudspeaker. Smart phone 1390 may be located with a user in the user's basement media room of her home. Laptop computer 1392 comprises a plurality of sensors B, including a microphone and a webcam, and an output device B in the form of a loudspeaker. Laptop 1392 may be located in an upstairs bedroom of the home.

The user of smart phone 1390 may desire to communicate with her daughter, but may not know her daughter's current location within the home. The daughter may be in the upstairs bedroom with two other friends. The user may speak a natural language input to indicate that she would like to communicate with her daughter. For example, the user may speak "Connect me to Sarah". The microphone in the user's smart phone 1390 may receive the natural language input and send it to remote service(s) 1110 for processing by the above-described voice listener 230 and other components of intelligent assistant computing system 200.

Upon determining the intent of the user, commitment engine 260 may request context information from entity tracker 210 that includes the location of the user's daughter Sarah. In response, entity tracker 210 may utilize video data from the webcam of laptop 1392 to identify Sarah in the field of view of the webcam. Entity tracker 210 may use other context information to determine that laptop 1392, and thus daughter Sarah, is located in the upstairs bedroom.

Using this information, device selector 1112 may communicatively couple the microphone and loudspeaker of the user's smart phone 1390 with the microphone and loudspeaker of laptop 1392, thereby allowing the user to talk with her daughter.

In other examples, and as discussed above, one or more other types of sensors and corresponding data may be used to locate a person or other entity. Examples include audio-only data, combinations of video and audio data, device log-in data, and other combinations of the foregoing and other sensor data.

Referring now to FIG. 14, in one example one or more sensors 220 in the form of microphones may receive audio data of a user speaking, "Hey computer, what time is the school board meeting tonight?" As described above, voice listener 230 may process the audio data into text and confidence value(s), and pass that information to parser 240. An attention activator 1432 in parser 240 may identify the keyword phrase "Hey computer" in the text. In response, parser 240 may activate or modify other components and functionality of intelligent assistant computing system 200. For example, parser 240 may increase a sampling rate of a speech recognition module to increase recognition accuracy of the user's speech that is likely to follow.

As noted above, upon processing audio data of a user's natural language input, the commitment engine may provide output to one or more output devices, such as a loudspeaker and/or a video display. In some examples, a single device may include a microphone that captures a user's input, with such input being provided to intelligent assistant computing system 200, and a loudspeaker that receives and broadcasts a message generated by the system in response to the input.

In some examples, a user may be in an environment with two or more microphones that may capture the user's speech and/or two or more loudspeakers that may broadcast a message generated by the system in response to the speech. For example, a user may be in his media room with his mobile phone, laptop computer, tablet computer, and smart/connected television. Each of these devices may contain, or be communicatively coupled with, intelligent assistant computing system 200.

A user may speak a keyword phrase that is captured by the microphones of each of the multiple devices. Accordingly, the corresponding message generated by intelligent assistant computing system 200 may be broadcast by the loudspeakers in all of the devices, which may be annoying to the user. As described in more detail below, in some examples involving multiple sensors, output devices, and/or other devices, intelligent assistant computing system 200 may be configured to determine which of the multiple microphones to use for receiving user speech and/or which of the multiple loudspeakers to use for broadcasting a corresponding message. In some examples, and as described below, an aggregator may evaluate and weigh a plurality of metrics to determine which microphones and loudspeakers to utilize.
现在参考图15,提供了响应于多设备环境中的话音激活的传感器和输出设备选择的示例实现。在该示例中,实现智能助理计算系统200的一个或多个远程服务1110可从三个不同设备(诸如移动电话1576、平板计算机1578和一体化智能助理设备1580)的三个不同话筒A、B和C接收音频数据。
三个设备附近的用户可以说出关键词短语,诸如“嘿计算机(Hey Computer)”。话筒A、B和C中的每一者可捕捉说出该短语的用户的音频数据,并可将该音频数据发送给话音监听器230。如上所述,话音监听器230可利用语音识别技术将口述话语翻译成文本。话音监听器230还可将(诸)置信值指派给经翻译的文本。在一些示例中,话音监听器230可包括关键词检测算法,该关键词检测算法被配置成标识经翻译的文本中的关键词或关键词短语。话音监听器230可向文本指派置信值,该置信值指示该文本是关键词或关键词短语的似然性。
在一些示例中,聚集器1582可评估与从不同的各个体话筒和/或从不同的话筒阵列接收到的与多个用户音频数据流相关的多个度量。如下文更详细地描述的,聚集器1582可利用这些度量来选择音频数据流中的一者及其对应的(诸)话筒以用于与用户交互。在一些示例中,可选择被确定为最接近用户的(诸)话筒。在一些示例中,可选择被确定为提供最高质量音频数据的(诸)话筒。在一些示例中,提供最高质量音频数据的(诸)话筒可被确定为最接近用户的(诸)话筒,并因此可被选择。
当话筒已被选出时,设备选择器1112可选择与该话筒相关联的扬声器以向用户输出响应。例如,在话筒是包括扬声器的设备的组件的情况下,可选择该扬声器。在话筒是独立话筒的情况下,聚集器1582可选择用户附近的另一扬声器以输出响应。在图15的示例中,聚集器1582位于实现智能助理计算系统200的至少一部分的远程服务中的一者上。在其他示例中,聚集器1582可位于另一计算设备上,诸如在另一基于云的服务中。
在一个使用情形示例中,聚集器1582可利用四个度量来评估接收到的用户音频数据流:(1)接收到的音频信号的幅度(音量);(2)音频信号的信噪比(S/N);(3)指示数据流包含关键词或关键词短语的似然性的关键词置信值;以及(4)指示发言者是特定人的似然性的用户标识置信值。
在一些示例中,可利用音频数据流接收幅度和/或S/N值。在其他示例中,幅度和/或S/N值可由话音监听器230或智能助理计算系统200的其他组件确定。如上所述,关键词置信值可由话音监听器230确定。同样如上所述,用户标识置信值可由实体跟踪器210确定。在一些示例中,说出输入的用户可被话音识别标识为已知的发言者或未知的发言者,并被指派相应的置信水平。
可通过将用户话音的信号电平与背景噪声的电平进行比较来计算接收到的音频输入的S/N比。在一些示例中,输入的幅度可被用于确定用户与对应话筒的邻近度。将理解,本实现中讨论的度量是作为示例提供的,并不意味着是限制性的。
每个接收到的音频数据流还可包括标识提供该数据流的特定设备或独立传感器的设备ID。在一些示例中,在从第一设备或传感器接收到第一组度量之后,聚集器1582可停顿预定的时间段以确定是否一个或多个其他设备/传感器也从与第一组度量中标识的用户相同的人处接收到关键词或关键词短语。例如,聚集器1582可停顿0.5秒、1.0秒或不会对用户造成负面用户体验的任何其他时间段。
在本示例中并且如图15所示,聚集器1582评估从移动电话1576、平板计算机1578和一体化智能助理设备1580接收到的音频数据流的度量。对于每个设备,聚集器1582可将四个度量组合成单个可选择性分数,诸如通过对这四个度量取平均。在一些示例中并且在组合之前,可通过依经验确定的权重对每个度量进行加权,权重反映了度量在预测将提供最佳用户体验的设备/话筒和对应的音频数据流方面的准确度。通过比较每个设备/话筒及其数据流的可选性分数,聚集器1582可标识并选择期望的设备/数据流。
在一个示例中,对于四个度量中的每一者,聚集器1582可比较每个设备/话筒的分数并相应地根据每度量对设备/话筒进行排名。例如,聚集器1582可确定从移动电话1576的话筒A接收到的音频数据流的以下分数:1)90%(幅度);2)90%(S/N);3)30%(关键词置信度);4)90%(发言者ID)。从平板计算机1578的话筒B接收到的音频数据流的分数可以是:1)80%(幅度);2)80%(S/N);3)80%(关键词置信度);4)80%(发言者ID)。从智能助理设备1580的话筒C接收到的音频数据流的分数可以是:1)92%(幅度);2)88%(S/N);3)90%(关键词置信度);4)92%(发言者ID)。
在该示例中,针对四个度量中每一者的三个设备的排名如下:
幅度-1.智能助理设备;2.移动电话;3.平板计算机。
S/N比-1.移动电话;2.智能助理设备;3.平板计算机。
关键词置信度-1.智能助理设备;2.平板计算机;3.移动电话。
发言者ID-1.智能助理设备;2.移动电话;3.平板计算机。
每个设备可基于其在每个度量类别中的排名而被奖励积点。例如,排名第一名=1积点、第二名=2积点、而第三名=3积点。对于每个设备,将四个度量的积点求和并取平均值。聚集器1582选择具有最低平均积点总数的设备(和相应的数据流)。在本示例中,最终平均积点和排名是:1.智能助理设备=>1.25;2.移动电话=>2.0;3.平板计算机=>2.75。因此,聚集器1582选择来自智能助理设备1580的数据流以供智能助理计算系统200继续分析。附加地,并且基于上述排名,设备选择器1112可选择智能助理设备1580以接收由承诺引擎260生成的(诸)消息作为分析结果。
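上述按度量排名、奖励积点并取平均的选择过程可用如下Python草图复现(并非专利原文;设备名为假设,分数取自上文示例):

```python
# 假设性草图:按每个度量对设备排名并奖励积点(第1名=1积点,第2名=2积点……),
# 再对每个设备的积点取平均;平均积点最低的设备(及其数据流)被选中。
def rank_points(streams):
    devices = list(streams)
    n_metrics = len(next(iter(streams.values())))
    points = {d: 0 for d in devices}
    for i in range(n_metrics):
        # 按该度量的分数从高到低排名,名次即所得积点
        ranked = sorted(devices, key=lambda d: streams[d][i], reverse=True)
        for place, d in enumerate(ranked, start=1):
            points[d] += place
    return {d: points[d] / n_metrics for d in devices}

scores = {
    "mobile":    (0.90, 0.90, 0.30, 0.90),  # 移动电话1576(话筒A)
    "tablet":    (0.80, 0.80, 0.80, 0.80),  # 平板计算机1578(话筒B)
    "assistant": (0.92, 0.88, 0.90, 0.92),  # 智能助理设备1580(话筒C)
}
averages = rank_points(scores)              # assistant=1.25, mobile=2.0, tablet=2.75
selected = min(averages, key=averages.get)  # 平均积点最低者被选中:"assistant"
```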
在一些示例中,在聚集器1582如上所述选择了智能助理设备1580之际,聚集器还可使其他两个设备禁止发送与经分析的数据流所关联的同一发言者ID(即,同一人)相关联的音频数据流。以此方式,当同一用户在初始输入后提供更多自然语言输入时,仅所选智能助理设备1580将向(诸)远程服务1110提供相应的音频数据。在一些示例中,当同一人说出关键词或关键词短语时,其他两个设备可恢复发送音频数据流。在这些情形中,可再次执行上述选择过程以确定所选设备。
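上述"禁止/恢复发送"行为可粗略示意如下(假设性草图,并非专利原文;类名与方法均为虚构):

```python
# 假设性草图:选定设备后,其他设备对同一发言者ID的音频流被禁止转发;
# 当该发言者再次说出关键词时恢复转发,以便重新执行选择过程。
class Aggregator:
    def __init__(self):
        self.selected = {}  # 发言者ID -> 所选设备ID

    def select(self, speaker_id, device_id):
        self.selected[speaker_id] = device_id

    def should_forward(self, speaker_id, device_id, is_keyword=False):
        if is_keyword:
            # 关键词触发:清除先前选择,所有设备恢复发送
            self.selected.pop(speaker_id, None)
            return True
        chosen = self.selected.get(speaker_id)
        return chosen is None or chosen == device_id
```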
在一些示例中并且如以上所提及的,在对奖励积点取平均之前,每个积点奖励可被乘以依经验确定的加权值,该加权值反映了度量在预测将提供最佳用户体验的设备和相应音频数据流方面的准确度。在一些示例中,一个或多个机器学习技术可被用于构建用于计算不同度量的模型。
在一些示例实现中,信号幅度可能与用户和接收该用户语音的话筒之间的距离高度相关。S/N比也可提供针对用户与话筒的距离的良好指示符,因为较低的噪声值可能与用户离话筒较近有关。在信号幅度和信号的S/N比两者都相对较高的情况下,发言者ID精度可相应地受益于强信号。
将理解,上述方法和使用情形仅仅是示例,并且许多变型是可能的。例如,上述4个度量的子集可被用于评估用户音频数据流。在其他示例中,还可利用一个或多个附加度量。
在一些示例中,先前已经由多个设备中的所选设备与智能助理计算系统200建立了对话的用户,在开始与同一设备的下一次对话之前可能有短暂的停顿。系统可将停顿的历时与预定时间段进行比较,并且可以在为下一次对话选择设备时考虑该比较。例如,在停顿的历时小于预定时间段(诸如5秒)的情况下,系统可将先前对话中最新近建立的发言者ID和设备确定纳入分析,作为为下一次对话选择同一设备的倾向。
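上述停顿比较逻辑的一个最小草图如下(并非专利原文;5秒阈值与倾向性加分均为示例值):

```python
# 假设性草图:若停顿历时小于预定时间段,则在设备选择时
# 向上次对话所用设备倾斜(加上倾向性加分)。
def choose_device(candidates, last_device, pause_seconds,
                  threshold=5.0, bias=0.1):
    """candidates: {设备ID: 可选择性分数};返回调整后分数最高的设备ID。"""
    adjusted = dict(candidates)
    if last_device in adjusted and pause_seconds < threshold:
        adjusted[last_device] += bias  # 选择同一设备的倾向
    return max(adjusted, key=adjusted.get)
```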
上述示例包括识别可听关键词以激活智能助理计算系统的一个或多个功能。在一些示例中,可通过识别一个或多个其他信号来激活系统的功能。此类信号可包括例如由相机捕捉的用户姿势、用户眼睛注视、和用户的面部方向。
在一些示例中,用于设备选择的上述技术中的一者或多者可被用于基于一个或多个因素自动地更新所选设备。例如,在用户经由第一设备与智能助理计算系统200通信的情况下,当用户改变她的定位并且远离第一设备移动时,该系统可相应地将所选设备改变为更接近用户的新位置的第二设备。
在一些实现中,除了音频数据之外,可利用来自一个或多个图像传感器的成像数据来选择设备。例如,从实体跟踪器210接收到的上下文数据810可包括可被用于选择设备的成像数据。成像数据的各示例可包括来自RGB相机的视频、来自IR相机的红外图像、来自深度相机的深度图像、来自热相机的热图像等。例如,RGB相机可跟踪用户在房间内的位置。来自相机的图像可被用于选择(诸)适当的设备/话筒以接收用户的自然语言输入、和/或选择(诸)适当的扬声器以向用户广播消息。在一些示例中并且参考上述设备选择技术,可包括成像数据和相关参数作为由聚集器1582分析以确定设备选择的度量。
在一些示例中,捕捉到的用户图像可被用于标识用户说话时正面向哪个设备。在一些示例中,诸如面部检测之类的指示符可被用于标识用户。在一些示例中,经捕捉的视频可指示用户的嘴唇移动,其可被用于将口述关键词与该用户相关联。在具有多个用户的环境中,此类指示符还可标识正在对设备讲话的特定用户。如此,话音识别和物理识别两者都可被用作参数以将一个用户与多个其他用户区分开来。
可用于选择设备/话筒和/或扬声器的输入的其他示例包括雷达信号和激光雷达信号。在一些示例中,来自经连接的设备的信号可指示用户正在与该设备进行交互。在一个示例中,用户可经由指纹识别来激活移动电话。此类交互可以是用户出现在电话位置处的强烈指示符。
在一些实施例中,本文中所描述的方法和过程可以与一个或多个计算设备的计算系统绑定。具体而言,此类方法和过程可被实现为计算机应用程序或服务、应用编程接口(API)、库、和/或其他计算机程序产品。
图16示意性地示出了可执行上述方法和过程中的一者或多者的计算系统1650的非限制性实施例。以简化形式示出了计算系统1650。计算系统1650可采用一个或多个下列各项的形式:个人计算机、服务器计算机、平板计算机、家庭娱乐计算机、网络计算设备、游戏设备、移动计算设备、移动通信设备(例如,智能电话)、和/或其他计算设备。
计算系统1650包括逻辑处理器1654、易失性存储器1658以及非易失性存储设备1662。计算系统1650可以可任选地包括显示子系统1666、输入子系统1670、通信子系统1674和/或在图16中未示出的其他组件。
逻辑处理器1654包括被配置成执行指令的一个或多个物理设备。例如,逻辑处理器可以被配置成执行指令,该指令是一个或多个应用、程序、例程、库、对象、组件、数据结构或其他逻辑构造的一部分。此类指令可被实现以执行任务、实现数据类型、变换一个或多个组件的状态、实现技术效果、或以其他方式得到期望的结果。
逻辑处理器1654可包括被配置成执行软件指令的一个或多个物理处理器(硬件)。附加地或替换地,逻辑处理器可包括被配置成执行硬件实现的逻辑或固件指令的一个或多个硬件逻辑电路或固件设备。逻辑处理器1654的各处理器可以是单核的或多核的,并且其上所执行的指令可被配置成用于串行、并行和/或分布式处理。逻辑处理器的各个体组件可以可任选地分布在两个或更多个分开的设备之间,这些设备可以位于远程和/或被配置成用于协同处理。逻辑处理器1654的各方面可以由以云计算配置进行配置的可远程访问的联网计算设备来虚拟化和执行。在此种情形中,这些虚拟化方面可以在各种不同机器的不同物理逻辑处理器上运行。
易失性存储器1658可包括包含随机存取存储器的物理设备。易失性存储器1658通常被逻辑处理器1654用来在软件指令的处理期间临时地存储信息。将理解,当切断给易失性存储器1658的功率时,该易失性存储器通常不继续存储指令。
非易失性存储设备1662包括被配置成保持可由逻辑处理器执行的指令以实现本文中所描述的方法和过程的一个或多个物理设备。当实现此类方法和过程时,非易失性存储设备1662的状态可以被变换——例如以保持不同的数据。非易失性存储设备1662还可以保存数据,包括本文描述的各种数据项。此类数据可被组织在共同形成数据库系统的一个或多个数据库中。非易失性存储设备1662的一个或多个数据保持设备可被统称为数据存储系统。虽然各种数据项被称为存储在数据存储系统或数据存储设备中,但是将理解,此类数据项可以跨两个或更多个数据存储设备分布。因此,例如,被称为与人员简档相关联的数据项可被存储在不同的数据存储设备中和/或使用共同形成数据库系统的两个或更多个数据库来存储。
非易失性存储设备1662可包括可移动和/或内置的物理设备。非易失性存储设备1662可包括光学存储器(例如,CD、DVD、HD-DVD、蓝光碟等)、半导体存储器(例如,ROM、EPROM、EEPROM、闪存存储器等)、和/或磁存储器(例如,硬盘驱动器、软盘驱动器、磁带驱动器、MRAM等)或者其他大容量存储设备技术。非易失性存储设备1662可包括非易失性、动态、静态、读/写、只读、顺序存取、位置可寻址、文件可寻址、和/或内容可寻址设备。将理解,非易失性存储设备1662被配置成即使当切断给该非易失性存储设备的功率时也保持指令。
逻辑处理器1654、易失性存储器1658和非易失性存储设备1662的各方面可以被一起集成到一个或多个硬件逻辑组件中。此类硬件逻辑组件可包括例如现场可编程门阵列(FPGA)、程序和应用专用集成电路(PASIC/ASIC)、程序和应用专用标准产品(PSSP/ASSP)、片上系统(SOC),以及复杂可编程逻辑器件(CPLD)。
术语“模块”、“程序”和“引擎”可被用来描述计算系统1650的被实现为执行特定功能的方面。在一些情形中,模块、程序或引擎可经由逻辑处理器1654执行由非易失性存储设备1662所保持的指令、使用易失性存储器1658的各部分来实例化。将理解,不同的模块、程序或引擎可以从相同的应用、服务、代码块、对象、库、例程、API、函数等实例化。类似地,相同的模块、程序和/或引擎可由不同的应用、服务、代码块、对象、例程、API、函数等来实例化。术语模块、程序和引擎涵盖单个或成群的可执行文件、数据文件、库、驱动程序、脚本、数据库记录等。
将理解,如本文中所使用的“服务”是可跨多个用户会话执行的应用程序。服务可供一个或多个系统组件、程序和/或其他服务使用。在一些实现中,服务可以在一个或多个服务器计算设备上运行。
在包括显示子系统1666时,显示子系统1666可被用来呈现由非易失性存储设备1662保持的数据的视觉表示。由于本文中所描述的方法和过程改变了由非易失性存储设备保持的数据,并因而变换了非易失性存储设备的状态,因此同样可以变换显示子系统1666的状态以视觉地表示底层数据中的改变。显示子系统1666可包括利用实质上任何类型的技术的一个或多个显示设备。可将此类显示设备与逻辑处理器1654、易失性存储器1658和/或非易失性存储设备1662组合在共享外壳中,或此类显示设备可以是外围显示设备。
在包括输入子系统1670时,输入子系统1670可包括或对接于一个或多个用户输入设备。在一些实施例中,输入子系统可包括或对接于所选自然用户输入(NUI)部件。此类部件可以是集成的或外围的,并且输入动作的换能和/或处理可以在板上或板外被处置。示例NUI部件可包括用于语音和/或话音识别的话筒;用于机器视觉和/或姿势识别的红外、彩色、立体、和/或深度相机;用于运动检测、注视检测、和/或意图识别的头部跟踪器、眼睛跟踪器、加速度计、和/或陀螺仪;用于评估脑部活动的电场感测部件;关于以上讨论的示例使用情形和环境描述的任何传感器;和/或任何其他合适的传感器。
当包括通信子系统1674时,通信子系统1674可被配置成将计算系统1650与一个或多个其他计算设备通信地耦合。通信子系统1674可包括与一个或多个不同通信协议兼容的有线和/或无线通信设备。作为非限制性示例,通信子系统可被配置成用于经由无线电话网络、或者有线或无线局域网或广域网进行通信。在一些实施例中,通信子系统可允许计算系统1650经由诸如因特网之类的网络将数据发送至其他设备以及从其他设备接收数据。
根据本公开的示例实现,由计算系统执行的用于向智能助理计算机注册人的方法包括:获得经由一个或多个相机捕捉的描绘最初未注册人的一个或多个图像帧;从该一个或多个图像帧中提取该最初未注册人的面部识别数据;经由一个或多个话筒接收注册该最初未注册人的口述命令;确定该口述命令源自具有预建立的注册特权的注册人;在确定该口述命令源自具有预建立的注册特权的注册人之际,通过在新注册人的人员简档中将一个或多个附加特权与该面部识别数据相关联来将该最初未注册人注册为新注册人。在本文公开的该实现或任何其他实现中,在接收到注册该最初未注册人的口述命令之后捕捉一个或多个图像帧。在本文公开的该实现或任何其他实现中,该方法进一步包括:响应于接收到注册该最初未注册人的口述命令而引导该最初未注册人将其面部定位在一个或多个相机的视野内以捕捉用于面部识别的一个或多个图像帧。在本文公开的该实现或任何其他实现中,引导该最初未注册人包括经由音频扬声器输出听觉引导和/或经由图形显示设备输出视觉引导中的一者或多者。在本文公开的该实现或任何其他实现中,该方法进一步包括:响应于接收到注册该最初未注册人的口述命令而引导该最初未注册人说出一个或多个单词或短语;获得经由一个或多个话筒捕捉的包括由该最初未注册人说出的一个或多个单词或短语的一个或多个音频片段;从该一个或多个音频片段中提取该最初未注册人的发言者识别数据;以及将该发言者识别数据与该新注册人的人员简档相关联。在本文公开的该实现或任何其他实现中,在经由一个或多个相机捕捉到一个或多个图像帧之后接收注册该最初未注册人的口述命令。在本文公开的该实现或任何其他实现中,该方法进一步包括:在接收注册该最初未注册人的口述命令之前在数据存储系统中存储该一个或多个图像帧;从该数据存储系统中检索该一个或多个图像帧;经由图形显示设备呈现该一个或多个图像帧以供注册人审查;以及其中该口述命令在呈现该一个或多个图像帧期间或之后被接收。在本文公开的该实现或任何其他实现中,在最初未注册人离开该一个或多个相机的视野之后呈现该一个或多个图像帧。在本文公开的该实现或任何其他实现中,响应于由注册人发起的另一命令而呈现该一个或多个图像帧。在本文公开的该实现或任何其他实现中,该一个或多个图像帧形成经由该一个或多个相机捕捉的一个或多个视频片段的一部分;以及该方法进一步包括:标识该最初未注册人在该一个或多个视频片段内的发言活动;获得经由一个或多个话筒捕捉的与该一个或多个视频片段在时间上匹配的一个或多个音频片段;基于与该最初未注册人的发言活动相对应的一个或多个口述单词或短语来从该一个或多个音频片段中提取该最初未注册人的发言者识别数据;以及将该发言者识别数据与该人员简档相关联。在本文公开的该实现或任何其他实现中,该方法进一步包括:经由一个或多个话筒接收执行一个或多个操作的后续口述命令;基于该发言者识别数据来确定该后续口述命令源自具有一个或多个附加特权的新注册人;以及响应于该口述命令而执行由该一个或多个附加特权准许的一个或多个操作中的操作。在本文公开的该实现或任何其他实现中,该口述命令形成源自注册人的口述短语的一部分,该口述短语进一步包括该新注册人的人标识符;以及该方法进一步包括将该人标识符与该新注册人的人员简档相关联。在本文公开的该实现或任何其他实现中,该口述命令形成源自注册人的口述短语的一部分,该口述短语进一步包括标识与该人员简档相关联的一个或多个附加特权的特权标识符;其中该一个或多个附加特权中的每个特权准许将由智能助理计算机响应于源自该新注册人的命令而执行的先前在注册之前未被准许的一个或多个操作。在本文公开的该实现或任何其他实现中,该特权标识符指示新注册人是否被准许注册其他最初未注册人。
根据本公开的另一示例实现,计算系统包括捕捉图像数据的一个或多个相机;捕捉音频数据的一个或多个话筒;实现智能助理服务的一个或多个计算设备,其被配置成:获得经由该一个或多个相机捕捉的描绘最初未注册人的一个或多个图像帧;从该一个或多个图像帧中提取该最初未注册人的面部识别数据;经由该一个或多个话筒接收注册该最初未注册人的口述命令;确定该口述命令源自具有预建立的注册特权的注册人;在确定该口述命令源自具有预建立的注册特权的注册人之际,通过在存储在该一个或多个计算设备的数据存储系统中的新注册人的人员简档中将一个或多个附加特权与该面部识别数据相关联来将该最初未注册人注册为新注册人。在本文公开的该实现或任何其他实现中,在接收到注册该最初未注册人的口述命令之后捕捉一个或多个图像帧;以及其中该智能助理服务被进一步配置成:响应于接收到注册该最初未注册人的口述命令而引导该最初未注册人将其面部定位在一个或多个相机的视野内以捕捉用于面部识别的一个或多个图像帧。在本文公开的该实现或任何其他实现中,该智能助理服务被进一步配置成:响应于接收到注册该最初未注册人的口述命令而引导该最初未注册人说出一个或多个单词或短语;获得经由一个或多个话筒捕捉的包括由该最初未注册人说出的一个或多个单词或短语的一个或多个音频片段;从该一个或多个音频片段中提取该最初未注册人的发言者识别数据;以及将该发言者识别数据与该新注册人的人员简档相关联。在本文公开的该实现或任何其他实现中,在经由一个或多个相机捕捉到一个或多个图像帧之后接收注册该最初未注册人的口述命令;以及其中该智能助理服务被进一步配置成:在接收注册该最初未注册人的口述命令之前在数据存储系统中存储该一个或多个图像帧;在该最初未注册人离开该一个或多个相机的视野之后从该数据存储系统中检索该一个或多个图像帧;经由图形显示设备呈现该一个或多个图像帧以供注册人审查;以及其中该口述命令在呈现该一个或多个图像帧期间或之后被接收。在本文公开的该实现或任何其他实现中,该智能助理服务被进一步配置成:经由一个或多个话筒接收执行一个或多个操作的后续口述命令;确定该后续口述命令源自具有一个或多个附加特权的新注册人;以及响应于该口述命令而执行由该一个或多个附加特权准许的操作。
根据本公开的另一示例实现,由计算系统执行的用于向智能助理计算机注册一个人的方法包括:获得经由一个或多个相机捕捉的描绘最初未注册人的一个或多个图像帧;从该一个或多个图像帧中提取该最初未注册人的面部识别数据;获得经由一个或多个话筒捕捉的包括由该最初未注册人说出的一个或多个单词或短语的一个或多个音频片段;从该一个或多个音频片段中提取该最初未注册人的发言者识别数据;经由一个或多个话筒接收注册该最初未注册人的口述命令;确定该口述命令源自具有预建立的注册特权的注册人;在确定该口述命令源自具有预建立的注册特权的注册人之际,通过在新注册人的人员简档中将一个或多个附加特权与该面部识别数据和该发言者识别数据相关联来将该最初未注册人注册为新注册人;在该新注册人的注册之后,经由该一个或多个话筒接收执行一个或多个操作的后续口述命令;基于该发言者识别数据来确定该后续口述命令源自具有一个或多个附加特权的新注册人;以及响应于该口述命令而执行由该一个或多个附加特权准许的操作。
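作为对以上注册流程的说明,下面给出一个最小Python草图(并非专利定义的API;所有类名与字段均为说明用途的假设):

```python
# 假设性草图:仅当发起注册的口述命令源自具有注册特权的注册人时,
# 才创建新注册人的人员简档,并将面部识别数据、发言者识别数据与
# 一个或多个附加特权相关联。
class PersonProfile:
    def __init__(self, person_id):
        self.person_id = person_id
        self.face_data = None       # 从图像帧提取的面部识别数据
        self.speaker_data = None    # 从音频片段提取的发言者识别数据
        self.privileges = set()     # 与简档相关联的特权

class AssistantRegistry:
    def __init__(self):
        self.profiles = {}

    def register(self, requester, new_id, face_data,
                 privileges, speaker_data=None):
        # 确定命令源自具有预建立注册特权的注册人,否则拒绝注册
        if "register_others" not in requester.privileges:
            return None
        profile = PersonProfile(new_id)
        profile.face_data = face_data
        profile.speaker_data = speaker_data
        profile.privileges = set(privileges)
        self.profiles[new_id] = profile
        return profile
```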
将理解,本文中所描述的配置和/或办法本质上是示例性的,并且这些具体实施例或示例不应被视为具有限制意义,因为许多变体是可能的。本文中所描述的具体例程或方法可表示任何数目的处理策略中的一个或多个。由此,所解说和/或所描述的各种动作可以以所解说和/或所描述的顺序执行、以其他顺序执行、并行地执行,或者被省略。同样,以上所描述的过程的次序可被改变。
本公开的主题包括各种过程、系统和配置以及此处公开的其他特征、功能、动作和/或属性、以及它们的任一和全部等价物的所有新颖且非显而易见的组合和子组合。
Claims (15)
1.一种由计算系统执行的用于向智能助理计算机注册人的方法,所述方法包括:
获得经由一个或多个相机捕捉的描绘最初未注册人的一个或多个图像帧;
从所述一个或多个图像帧中提取所述最初未注册人的面部识别数据;
经由一个或多个话筒接收注册所述最初未注册人的口述命令;
确定所述口述命令源自具有预建立的注册特权的注册人;以及
在确定所述口述命令源自具有所述预建立的注册特权的所述注册人之际,通过在新注册人的人员简档中将一个或多个附加特权与所述面部识别数据相关联来将所述最初未注册人注册为新注册人。
2.根据权利要求1所述的方法,其特征在于,在接收到注册所述最初未注册人的所述口述命令之后捕捉所述一个或多个图像帧。
3.根据权利要求2所述的方法,其特征在于,进一步包括:
响应于接收到注册所述最初未注册人的所述口述命令而引导所述最初未注册人将其面部定位在所述一个或多个相机的视野内以捕捉用于面部识别的所述一个或多个图像帧。
4.根据权利要求3所述的方法,其特征在于,引导所述最初未注册人包括经由音频扬声器输出听觉引导和/或经由图形显示设备输出视觉引导中的一者或多者。
5.根据权利要求1所述的方法,其特征在于,进一步包括:
响应于接收到注册所述最初未注册人的所述口述命令而引导所述最初未注册人说出一个或多个单词或短语;
获得经由一个或多个话筒捕捉的包括由所述最初未注册人说出的所述一个或多个单词或短语的一个或多个音频片段;
从所述一个或多个音频片段中提取所述最初未注册人的发言者识别数据;以及
将所述发言者识别数据与所述新注册人的人员简档相关联。
6.根据权利要求1所述的方法,其特征在于,在经由所述一个或多个相机捕捉到所述一个或多个图像帧之后接收注册所述最初未注册人的所述口述命令。
7.根据权利要求6所述的方法,其特征在于,进一步包括:
在接收注册所述最初未注册人的所述口述命令之前在数据存储系统中存储所述一个或多个图像帧;
从所述数据存储系统中检索所述一个或多个图像帧;
经由图形显示设备呈现所述一个或多个图像帧以供所述注册人审查;以及
其中所述口述命令在呈现所述一个或多个图像帧期间或之后被接收。
8.根据权利要求7所述的方法,其特征在于,在所述最初未注册人离开所述一个或多个相机的视野之后呈现所述一个或多个图像帧。
9.根据权利要求7所述的方法,其特征在于,响应于由所述注册人发起的另一命令而呈现所述一个或多个图像帧。
10.根据权利要求1所述的方法,其特征在于,所述一个或多个图像帧形成经由所述一个或多个相机捕捉的一个或多个视频片段的一部分;以及
其中所述方法进一步包括:
标识所述最初未注册人在所述一个或多个视频片段内的发言活动;
获得经由一个或多个话筒捕捉的与所述一个或多个视频片段在时间上匹配的一个或多个音频片段;
基于与所述最初未注册人的发言活动相对应的一个或多个口述单词或短语来从所述一个或多个音频片段中提取所述最初未注册人的发言者识别数据;以及
将所述发言者识别数据与所述人员简档相关联。
11.根据权利要求10所述的方法,其特征在于,进一步包括:
经由一个或多个话筒接收执行一个或多个操作的后续口述命令;
基于所述发言者识别数据来确定所述后续口述命令源自具有所述一个或多个附加特权的所述新注册人;以及
响应于所述口述命令而执行由所述一个或多个附加特权准许的所述一个或多个操作中的操作。
12.根据权利要求1所述的方法,其特征在于,所述口述命令形成源自所述注册人的口述短语的一部分,所述口述短语进一步包括所述新注册人的人标识符;以及
其中所述方法进一步包括将所述人标识符与所述新注册人的人员简档相关联。
13.根据权利要求1所述的方法,其特征在于,所述口述命令形成源自所述注册人的口述短语的一部分,所述口述短语进一步包括所述新注册人的标识与所述人员简档相关联的所述一个或多个附加特权的特权标识符;
其中所述一个或多个附加特权中的每个特权准许将由所述智能助理计算机响应于源自所述新注册人的命令而执行的先前在注册之前未被准许的一个或多个操作。
14.根据权利要求13所述的方法,其特征在于,所述特权标识符指示所述新注册人是否被准许注册其他最初未注册人。
15.一种计算系统,包括:
捕捉图像数据的一个或多个相机;
捕捉音频数据的一个或多个话筒;
实现智能助理服务的一个或多个计算设备,所述一个或多个计算设备被配置成:
获得经由所述一个或多个相机捕捉的描绘最初未注册人的一个或多个图像帧;
从所述一个或多个图像帧中提取所述最初未注册人的面部识别数据;
经由所述一个或多个话筒接收注册所述最初未注册人的口述命令;
确定所述口述命令源自具有预建立的注册特权的注册人;以及
在确定所述口述命令源自具有所述预建立的注册特权的所述注册人之际,通过在存储在所述一个或多个计算设备的数据存储系统中的新注册人的人员简档中将一个或多个附加特权与所述面部识别数据相关联来将所述最初未注册人注册为新注册人。
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201762459020P | 2017-02-14 | 2017-02-14 | |
US62/459,020 | 2017-02-14 | ||
US201762482165P | 2017-04-05 | 2017-04-05 | |
US62/482,165 | 2017-04-05 | ||
US15/682,425 US10579912B2 (en) | 2017-02-14 | 2017-08-21 | User registration for intelligent assistant computer |
US15/682,425 | 2017-08-21 | ||
PCT/US2018/017511 WO2018152010A1 (en) | 2017-02-14 | 2018-02-09 | User registration for intelligent assistant computer |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110313152A true CN110313152A (zh) | 2019-10-08 |
CN110313152B CN110313152B (zh) | 2021-10-22 |
Family
ID=63104544
Family Applications (11)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201880011578.3A Active CN110291760B (zh) | 2017-02-14 | 2018-02-07 | 用于导出用户意图的解析器 |
CN201880011716.8A Active CN110291489B (zh) | 2017-02-14 | 2018-02-07 | 计算上高效的人类标识智能助理计算机 |
CN201880011967.6A Active CN110301118B (zh) | 2017-02-14 | 2018-02-09 | 用于智能助理计算设备的位置校准 |
CN201880011910.6A Active CN110300946B (zh) | 2017-02-14 | 2018-02-09 | 智能助理 |
CN201880011970.8A Active CN110313153B (zh) | 2017-02-14 | 2018-02-09 | 智能数字助理系统 |
CN201880012028.3A Active CN110313154B (zh) | 2017-02-14 | 2018-02-09 | 具有基于意图的信息辨析的智能助理 |
CN201880011946.4A Active CN110313152B (zh) | 2017-02-14 | 2018-02-09 | 用于智能助理计算机的用户注册 |
CN201880011917.8A Pending CN110383235A (zh) | 2017-02-14 | 2018-02-09 | 多用户智能辅助 |
CN201880011885.1A Withdrawn CN110326261A (zh) | 2017-02-14 | 2018-02-09 | 确定音频输入中的说话者改变 |
CN202111348785.8A Active CN113986016B (zh) | 2017-02-14 | 2018-02-09 | 智能助理 |
CN201880011965.7A Active CN110326041B (zh) | 2017-02-14 | 2018-02-09 | 用于智能助理的自然语言交互 |
Family Applications Before (6)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201880011578.3A Active CN110291760B (zh) | 2017-02-14 | 2018-02-07 | 用于导出用户意图的解析器 |
CN201880011716.8A Active CN110291489B (zh) | 2017-02-14 | 2018-02-07 | 计算上高效的人类标识智能助理计算机 |
CN201880011967.6A Active CN110301118B (zh) | 2017-02-14 | 2018-02-09 | 用于智能助理计算设备的位置校准 |
CN201880011910.6A Active CN110300946B (zh) | 2017-02-14 | 2018-02-09 | 智能助理 |
CN201880011970.8A Active CN110313153B (zh) | 2017-02-14 | 2018-02-09 | 智能数字助理系统 |
CN201880012028.3A Active CN110313154B (zh) | 2017-02-14 | 2018-02-09 | 具有基于意图的信息辨析的智能助理 |
Family Applications After (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201880011917.8A Pending CN110383235A (zh) | 2017-02-14 | 2018-02-09 | 多用户智能辅助 |
CN201880011885.1A Withdrawn CN110326261A (zh) | 2017-02-14 | 2018-02-09 | 确定音频输入中的说话者改变 |
CN202111348785.8A Active CN113986016B (zh) | 2017-02-14 | 2018-02-09 | 智能助理 |
CN201880011965.7A Active CN110326041B (zh) | 2017-02-14 | 2018-02-09 | 用于智能助理的自然语言交互 |
Country Status (4)
Country | Link |
---|---|
US (17) | US10467509B2 (zh) |
EP (9) | EP3583485B1 (zh) |
CN (11) | CN110291760B (zh) |
WO (12) | WO2018151979A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI792693B (zh) * | 2021-11-18 | 2023-02-11 | 瑞昱半導體股份有限公司 | 用於進行人物重辨識的方法與裝置 |
Families Citing this family (595)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102006004197A1 (de) * | 2006-01-26 | 2007-08-09 | Klett, Rolf, Dr.Dr. | Verfahren und Vorrichtung zur Aufzeichnung von Körperbewegungen |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US8600120B2 (en) | 2008-01-03 | 2013-12-03 | Apple Inc. | Personal computing device control using face detection and recognition |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US9679255B1 (en) | 2009-02-20 | 2017-06-13 | Oneevent Technologies, Inc. | Event condition detection |
US20120309363A1 (en) | 2011-06-03 | 2012-12-06 | Apple Inc. | Triggering notifications associated with tasks items that represent tasks to perform |
US9032565B2 (en) | 2009-12-16 | 2015-05-19 | Kohler Co. | Touchless faucet assembly and method of operation |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
WO2012154262A2 (en) * | 2011-02-21 | 2012-11-15 | TransRobotics, Inc. | System and method for sensing distance and/or movement |
US9002322B2 (en) | 2011-09-29 | 2015-04-07 | Apple Inc. | Authentication with secondary approver |
US10417037B2 (en) | 2012-05-15 | 2019-09-17 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US10206610B2 (en) | 2012-10-05 | 2019-02-19 | TransRobotics, Inc. | Systems and methods for high resolution distance sensing and applications |
DE212014000045U1 (de) | 2013-02-07 | 2015-09-24 | Apple Inc. | Sprach-Trigger für einen digitalen Assistenten |
US10585568B1 (en) | 2013-02-22 | 2020-03-10 | The Directv Group, Inc. | Method and system of bookmarking content in a mobile device |
US10652394B2 (en) | 2013-03-14 | 2020-05-12 | Apple Inc. | System and method for processing voicemail |
US10748529B1 (en) | 2013-03-15 | 2020-08-18 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
AU2014278592B2 (en) | 2013-06-09 | 2017-09-07 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
WO2015020942A1 (en) | 2013-08-06 | 2015-02-12 | Apple Inc. | Auto-activating smart responses based on activities from remote devices |
US9633650B2 (en) * | 2013-08-28 | 2017-04-25 | Verint Systems Ltd. | System and method of automated model adaptation |
US9898642B2 (en) | 2013-09-09 | 2018-02-20 | Apple Inc. | Device, method, and graphical user interface for manipulating user interfaces based on fingerprint sensor inputs |
US10482461B2 (en) | 2014-05-29 | 2019-11-19 | Apple Inc. | User interface for payments |
AU2015266863B2 (en) | 2014-05-30 | 2018-03-15 | Apple Inc. | Multi-command single utterance input method |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US20190286713A1 (en) * | 2015-01-23 | 2019-09-19 | Conversica, Inc. | Systems and methods for enhanced natural language processing for machine learning conversations |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US10460227B2 (en) | 2015-05-15 | 2019-10-29 | Apple Inc. | Virtual assistant in a communication session |
US10200824B2 (en) | 2015-05-27 | 2019-02-05 | Apple Inc. | Systems and methods for proactively identifying and surfacing relevant content on a touch-sensitive device |
US20160378747A1 (en) | 2015-06-29 | 2016-12-29 | Apple Inc. | Virtual assistant for media playback |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US10740384B2 (en) | 2015-09-08 | 2020-08-11 | Apple Inc. | Intelligent automated assistant for media search and playback |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10331312B2 (en) | 2015-09-08 | 2019-06-25 | Apple Inc. | Intelligent automated assistant in a media environment |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10956666B2 (en) | 2015-11-09 | 2021-03-23 | Apple Inc. | Unconventional virtual assistant interactions |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US11540009B2 (en) | 2016-01-06 | 2022-12-27 | Tvision Insights, Inc. | Systems and methods for assessing viewer engagement |
WO2017120469A1 (en) | 2016-01-06 | 2017-07-13 | Tvision Insights, Inc. | Systems and methods for assessing viewer engagement |
US10509626B2 (en) | 2016-02-22 | 2019-12-17 | Sonos, Inc | Handling of loss of pairing between networked devices |
US10264030B2 (en) | 2016-02-22 | 2019-04-16 | Sonos, Inc. | Networked microphone device control |
US9820039B2 (en) | 2016-02-22 | 2017-11-14 | Sonos, Inc. | Default playback devices |
US10095470B2 (en) | 2016-02-22 | 2018-10-09 | Sonos, Inc. | Audio response playback |
US9965247B2 (en) | 2016-02-22 | 2018-05-08 | Sonos, Inc. | Voice controlled media playback system based on user profile |
US9947316B2 (en) | 2016-02-22 | 2018-04-17 | Sonos, Inc. | Voice control of a media playback system |
US9978390B2 (en) | 2016-06-09 | 2018-05-22 | Sonos, Inc. | Dynamic player selection for audio signal processing |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
AU2017100670C4 (en) | 2016-06-12 | 2019-11-21 | Apple Inc. | User interfaces for retrieving contextually relevant media content |
US10853761B1 (en) | 2016-06-24 | 2020-12-01 | Amazon Technologies, Inc. | Speech-based inventory management system and method |
US11315071B1 (en) * | 2016-06-24 | 2022-04-26 | Amazon Technologies, Inc. | Speech-based storage tracking |
US10134399B2 (en) | 2016-07-15 | 2018-11-20 | Sonos, Inc. | Contextualization of voice inputs |
US10152969B2 (en) | 2016-07-15 | 2018-12-11 | Sonos, Inc. | Voice detection by multiple devices |
US10115400B2 (en) | 2016-08-05 | 2018-10-30 | Sonos, Inc. | Multiple voice services |
WO2018034169A1 (ja) * | 2016-08-17 | 2018-02-22 | ソニー株式会社 | 対話制御装置および方法 |
DK179978B1 (en) | 2016-09-23 | 2019-11-27 | Apple Inc. | IMAGE DATA FOR ENHANCED USER INTERACTIONS |
US9942678B1 (en) | 2016-09-27 | 2018-04-10 | Sonos, Inc. | Audio playback settings for voice interaction |
US9743204B1 (en) | 2016-09-30 | 2017-08-22 | Sonos, Inc. | Multi-orientation playback device microphones |
US10181323B2 (en) | 2016-10-19 | 2019-01-15 | Sonos, Inc. | Arbitration-based voice recognition |
US10430685B2 (en) * | 2016-11-16 | 2019-10-01 | Facebook, Inc. | Deep multi-scale video prediction |
US10249292B2 (en) * | 2016-12-14 | 2019-04-02 | International Business Machines Corporation | Using long short-term memory recurrent neural network for speaker diarization segmentation |
US10546575B2 (en) | 2016-12-14 | 2020-01-28 | International Business Machines Corporation | Using recurrent neural network for partitioning of audio data into segments that each correspond to a speech feature cluster identifier |
EP3352432A1 (en) * | 2017-01-20 | 2018-07-25 | Sentiance NV | Method and system for classifying an activity of a user |
EP3561643B1 (en) * | 2017-01-20 | 2023-07-19 | Huawei Technologies Co., Ltd. | Method and terminal for implementing voice control |
US10521448B2 (en) * | 2017-02-10 | 2019-12-31 | Microsoft Technology Licensing, Llc | Application of actionable task structures to disparate data sets for transforming data in the disparate data sets |
US10514827B2 (en) * | 2017-02-10 | 2019-12-24 | Microsoft Technology Licensing, Llc | Resequencing actionable task structures for transforming data |
US10481766B2 (en) * | 2017-02-10 | 2019-11-19 | Microsoft Technology Licensing, Llc | Interfaces and methods for generating and applying actionable task structures |
US11010601B2 (en) | 2017-02-14 | 2021-05-18 | Microsoft Technology Licensing, Llc | Intelligent assistant device communicating non-verbal cues |
US11100384B2 (en) | 2017-02-14 | 2021-08-24 | Microsoft Technology Licensing, Llc | Intelligent device user interactions |
US10467509B2 (en) | 2017-02-14 | 2019-11-05 | Microsoft Technology Licensing, Llc | Computationally-efficient human-identifying smart assistant computer |
US10657838B2 (en) * | 2017-03-15 | 2020-05-19 | International Business Machines Corporation | System and method to teach and evaluate image grading performance using prior learned expert knowledge base |
EP3599604A4 (en) * | 2017-03-24 | 2020-03-18 | Sony Corporation | INFORMATION PROCESSING DEVICE AND INFORMATION PROCESSING METHOD |
US11183181B2 (en) | 2017-03-27 | 2021-11-23 | Sonos, Inc. | Systems and methods of multiple voice services |
US11165723B2 (en) * | 2017-03-27 | 2021-11-02 | Seniorlink Inc. | Methods and systems for a bimodal auto-response mechanism for messaging applications |
US10839017B2 (en) * | 2017-04-06 | 2020-11-17 | AIBrain Corporation | Adaptive, interactive, and cognitive reasoner of an autonomous robotic system utilizing an advanced memory graph structure |
US10810371B2 (en) * | 2017-04-06 | 2020-10-20 | AIBrain Corporation | Adaptive, interactive, and cognitive reasoner of an autonomous robotic system |
US10929759B2 (en) | 2017-04-06 | 2021-02-23 | AIBrain Corporation | Intelligent robot software platform |
US11151992B2 (en) | 2017-04-06 | 2021-10-19 | AIBrain Corporation | Context aware interactive robot |
US10963493B1 (en) | 2017-04-06 | 2021-03-30 | AIBrain Corporation | Interactive game with robot system |
WO2018195391A1 (en) * | 2017-04-20 | 2018-10-25 | Tvision Insights, Inc. | Methods and apparatus for multi-television measurements |
US10887423B2 (en) * | 2017-05-09 | 2021-01-05 | Microsoft Technology Licensing, Llc | Personalization of virtual assistant skills based on user profile information |
DK180048B1 (en) | 2017-05-11 | 2020-02-04 | Apple Inc. | MAINTAINING THE DATA PROTECTION OF PERSONAL INFORMATION |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
US10769844B1 (en) * | 2017-05-12 | 2020-09-08 | Alarm.Com Incorporated | Marker aided three-dimensional mapping and object labeling |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
DK201770428A1 (en) | 2017-05-12 | 2019-02-18 | Apple Inc. | LOW-LATENCY INTELLIGENT AUTOMATED ASSISTANT |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | USER-SPECIFIC Acoustic Models |
DK201770411A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | MULTI-MODAL INTERFACES |
US20180336892A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Detecting a trigger of a digital assistant |
US10303715B2 (en) | 2017-05-16 | 2019-05-28 | Apple Inc. | Intelligent automated assistant for media exploration |
CN107239139B (zh) * | 2017-05-18 | 2018-03-16 | 刘国华 | 基于正视的人机交互方法与系统 |
US11178280B2 (en) * | 2017-06-20 | 2021-11-16 | Lenovo (Singapore) Pte. Ltd. | Input during conversational session |
GB201710093D0 (en) * | 2017-06-23 | 2017-08-09 | Nokia Technologies Oy | Audio distance estimation for spatial audio processing |
WO2019002831A1 (en) | 2017-06-27 | 2019-01-03 | Cirrus Logic International Semiconductor Limited | REPRODUCTIVE ATTACK DETECTION |
GB2563953A (en) | 2017-06-28 | 2019-01-02 | Cirrus Logic Int Semiconductor Ltd | Detection of replay attack |
GB201713697D0 (en) | 2017-06-28 | 2017-10-11 | Cirrus Logic Int Semiconductor Ltd | Magnetic detection of replay attack |
US11170179B2 (en) * | 2017-06-30 | 2021-11-09 | Jpmorgan Chase Bank, N.A. | Systems and methods for natural language processing of structured documents |
GB201801530D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Methods, apparatus and systems for authentication |
GB201801527D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Method, apparatus and systems for biometric processes |
GB201801528D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Method, apparatus and systems for biometric processes |
GB201801526D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Methods, apparatus and systems for authentication |
GB201801532D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Methods, apparatus and systems for audio playback |
US11430437B2 (en) * | 2017-08-01 | 2022-08-30 | Sony Corporation | Information processor and information processing method |
US10475449B2 (en) | 2017-08-07 | 2019-11-12 | Sonos, Inc. | Wake-word detection suppression |
GB2565315B (en) * | 2017-08-09 | 2022-05-04 | Emotech Ltd | Robots, methods, computer programs, computer-readable media, arrays of microphones and controllers |
KR102389041B1 (ko) * | 2017-08-11 | 2022-04-21 | 엘지전자 주식회사 | 이동단말기 및 머신 러닝을 이용한 이동 단말기의 제어방법 |
US10339922B2 (en) * | 2017-08-23 | 2019-07-02 | Sap Se | Thematic segmentation of long content using deep learning and contextual cues |
EP3678385B1 (en) * | 2017-08-30 | 2023-01-04 | Panasonic Intellectual Property Management Co., Ltd. | Sound pickup device, sound pickup method, and program |
EP3451175A1 (en) * | 2017-08-31 | 2019-03-06 | Entit Software LLC | Chatbot version comparision |
US10515625B1 (en) * | 2017-08-31 | 2019-12-24 | Amazon Technologies, Inc. | Multi-modal natural language processing |
US10224033B1 (en) * | 2017-09-05 | 2019-03-05 | Motorola Solutions, Inc. | Associating a user voice query with head direction |
US11074911B2 (en) * | 2017-09-05 | 2021-07-27 | First Advantage Corporation | Digital assistant |
US10537244B1 (en) * | 2017-09-05 | 2020-01-21 | Amazon Technologies, Inc. | Using eye tracking to label computer vision datasets |
US10623199B2 (en) * | 2017-09-07 | 2020-04-14 | Lenovo (Singapore) Pte Ltd | Outputting audio based on user location |
US11475254B1 (en) * | 2017-09-08 | 2022-10-18 | Snap Inc. | Multimodal entity identification |
US10048930B1 (en) | 2017-09-08 | 2018-08-14 | Sonos, Inc. | Dynamic computation of system response volume |
US10438594B2 (en) * | 2017-09-08 | 2019-10-08 | Amazon Technologies, Inc. | Administration of privileges by speech for voice assistant system |
EP4156129A1 (en) | 2017-09-09 | 2023-03-29 | Apple Inc. | Implementation of biometric enrollment |
US11037554B1 (en) * | 2017-09-12 | 2021-06-15 | Wells Fargo Bank, N.A. | Network of domain knowledge based conversational agents |
US10083006B1 (en) * | 2017-09-12 | 2018-09-25 | Google Llc | Intercom-style communication using multiple computing devices |
US11170208B2 (en) * | 2017-09-14 | 2021-11-09 | Nec Corporation Of America | Physical activity authentication systems and methods |
US10531157B1 (en) * | 2017-09-21 | 2020-01-07 | Amazon Technologies, Inc. | Presentation and management of audio and visual content across devices |
US11238855B1 (en) * | 2017-09-26 | 2022-02-01 | Amazon Technologies, Inc. | Voice user interface entity resolution |
US10446165B2 (en) | 2017-09-27 | 2019-10-15 | Sonos, Inc. | Robust short-time fourier transform acoustic echo cancellation during audio playback |
US10621981B2 (en) | 2017-09-28 | 2020-04-14 | Sonos, Inc. | Tone interference cancellation |
US10482868B2 (en) | 2017-09-28 | 2019-11-19 | Sonos, Inc. | Multi-channel acoustic echo cancellation |
US10051366B1 (en) | 2017-09-28 | 2018-08-14 | Sonos, Inc. | Three-dimensional beam forming with a microphone array |
US10466962B2 (en) | 2017-09-29 | 2019-11-05 | Sonos, Inc. | Media playback system with voice assistance |
WO2019069529A1 (ja) * | 2017-10-02 | 2019-04-11 | ソニー株式会社 | 情報処理装置、情報処理方法、および、プログラム |
US20190103111A1 (en) * | 2017-10-03 | 2019-04-04 | Rupert Labs Inc. ( DBA Passage AI) | Natural Language Processing Systems and Methods |
US10542072B1 (en) * | 2017-10-04 | 2020-01-21 | Parallels International Gmbh | Utilities toolbox for remote session and client architecture |
EP3695392B1 (en) * | 2017-10-11 | 2024-02-28 | OneEvent Technologies, Inc. | Fire detection system |
GB201803570D0 (en) | 2017-10-13 | 2018-04-18 | Cirrus Logic Int Semiconductor Ltd | Detection of replay attack |
GB201804843D0 (en) | 2017-11-14 | 2018-05-09 | Cirrus Logic Int Semiconductor Ltd | Detection of replay attack |
JP2019072787A (ja) * | 2017-10-13 | 2019-05-16 | シャープ株式会社 | 制御装置、ロボット、制御方法、および制御プログラム |
GB201801661D0 (en) * | 2017-10-13 | 2018-03-21 | Cirrus Logic International Uk Ltd | Detection of liveness |
GB2567503A (en) | 2017-10-13 | 2019-04-17 | Cirrus Logic Int Semiconductor Ltd | Analysing speech signals |
GB201801874D0 (en) | 2017-10-13 | 2018-03-21 | Cirrus Logic Int Semiconductor Ltd | Improving robustness of speech processing system against ultrasound and dolphin attacks |
GB201801664D0 (en) | 2017-10-13 | 2018-03-21 | Cirrus Logic Int Semiconductor Ltd | Detection of liveness |
GB201801663D0 (en) * | 2017-10-13 | 2018-03-21 | Cirrus Logic Int Semiconductor Ltd | Detection of liveness |
KR102421255B1 (ko) * | 2017-10-17 | 2022-07-18 | 삼성전자주식회사 | 음성 신호를 제어하기 위한 전자 장치 및 방법 |
US10884597B2 (en) * | 2017-10-17 | 2021-01-05 | Paypal, Inc. | User interface customization based on facial recognition |
CN108305615B (zh) * | 2017-10-23 | 2020-06-16 | 腾讯科技(深圳)有限公司 | 一种对象识别方法及其设备、存储介质、终端 |
US10715604B1 (en) | 2017-10-26 | 2020-07-14 | Amazon Technologies, Inc. | Remote system processing based on a previously identified user |
US10567515B1 (en) * | 2017-10-26 | 2020-02-18 | Amazon Technologies, Inc. | Speech processing performed with respect to first and second user profiles in a dialog session |
WO2019087811A1 (ja) * | 2017-11-02 | 2019-05-09 | ソニー株式会社 | 情報処理装置、及び情報処理方法 |
KR101932263B1 (ko) * | 2017-11-03 | 2018-12-26 | 주식회사 머니브레인 | 적시에 실질적 답변을 제공함으로써 자연어 대화를 제공하는 방법, 컴퓨터 장치 및 컴퓨터 판독가능 기록 매체 |
US10546003B2 (en) | 2017-11-09 | 2020-01-28 | Adobe Inc. | Intelligent analytics interface |
CN107833264B (zh) * | 2017-11-13 | 2019-02-01 | 百度在线网络技术(北京)有限公司 | 一种图片处理方法、装置、设备和计算机可读存储介质 |
GB201801659D0 (en) | 2017-11-14 | 2018-03-21 | Cirrus Logic Int Semiconductor Ltd | Detection of loudspeaker playback |
GB201802309D0 (en) * | 2017-11-14 | 2018-03-28 | Cirrus Logic Int Semiconductor Ltd | Enrolment in speaker recognition system |
CN107886948A (zh) * | 2017-11-16 | 2018-04-06 | 百度在线网络技术(北京)有限公司 | 语音交互方法及装置,终端,服务器及可读存储介质 |
US11663182B2 (en) | 2017-11-21 | 2023-05-30 | Maria Emma | Artificial intelligence platform with improved conversational ability and personality development |
US10747968B2 (en) | 2017-11-22 | 2020-08-18 | Jeffrey S. Melcher | Wireless device and selective user control and management of a wireless device and data |
KR20190061706A (ko) * | 2017-11-28 | 2019-06-05 | Hyundai Motor Company | Speech recognition system and method for analyzing commands containing multiple intents |
US10832683B2 (en) * | 2017-11-29 | 2020-11-10 | ILLUMA Labs LLC. | System and method for efficient processing of universal background models for speaker recognition |
US10950244B2 (en) * | 2017-11-29 | 2021-03-16 | ILLUMA Labs LLC. | System and method for speaker authentication and identification |
US10950243B2 (en) * | 2017-11-29 | 2021-03-16 | ILLUMA Labs Inc. | Method for reduced computation of t-matrix training for speaker recognition |
CN109887494B (zh) * | 2017-12-01 | 2022-08-16 | Tencent Technology (Shenzhen) Co., Ltd. | Method and apparatus for reconstructing a speech signal |
US10091554B1 (en) * | 2017-12-06 | 2018-10-02 | Echostar Technologies L.L.C. | Apparatus, systems and methods for generating an emotional-based content recommendation list |
US10475451B1 (en) * | 2017-12-06 | 2019-11-12 | Amazon Technologies, Inc. | Universal and user-specific command processing |
KR102518543B1 (ko) * | 2017-12-07 | 2023-04-07 | Hyundai Motor Company | Apparatus and method for correcting a user's speech errors |
US11182122B2 (en) | 2017-12-08 | 2021-11-23 | Amazon Technologies, Inc. | Voice control of computing devices |
US10503468B2 (en) | 2017-12-08 | 2019-12-10 | Amazon Technologies, Inc. | Voice enabling applications |
US10880650B2 (en) | 2017-12-10 | 2020-12-29 | Sonos, Inc. | Network microphone devices with automatic do not disturb actuation capabilities |
US10818290B2 (en) | 2017-12-11 | 2020-10-27 | Sonos, Inc. | Home graph |
KR102008267B1 (ko) * | 2017-12-12 | 2019-08-07 | LG Electronics Inc. | Lighting device and performance system including the same |
US10867129B1 (en) | 2017-12-12 | 2020-12-15 | Verisign, Inc. | Domain-name based operating environment for digital assistants and responders |
US10665230B1 (en) * | 2017-12-12 | 2020-05-26 | Verisign, Inc. | Alias-based access of entity information over voice-enabled digital assistants |
US10783013B2 (en) | 2017-12-15 | 2020-09-22 | Google Llc | Task-related sorting, application discovery, and unified bookmarking for application managers |
US11568003B2 (en) | 2017-12-15 | 2023-01-31 | Google Llc | Refined search with machine learning |
US10402986B2 (en) * | 2017-12-20 | 2019-09-03 | Facebook, Inc. | Unsupervised video segmentation |
US10846109B2 (en) | 2017-12-20 | 2020-11-24 | Google Llc | Suggesting actions based on machine learning |
WO2019129511A1 (en) * | 2017-12-26 | 2019-07-04 | Robert Bosch Gmbh | Speaker identification with ultra-short speech segments for far and near field voice assistance applications |
CN108346107B (zh) * | 2017-12-28 | 2020-11-10 | Advanced New Technologies Co., Ltd. | Social content risk identification method, apparatus, and device |
US11507172B2 (en) * | 2017-12-29 | 2022-11-22 | Google Llc | Smart context subsampling on-device system |
US10555024B2 (en) * | 2017-12-29 | 2020-02-04 | Facebook, Inc. | Generating a feed of content for presentation by a client device to users identified in video data captured by the client device |
KR102385263B1 (ko) * | 2018-01-04 | 2022-04-12 | Samsung Electronics Co., Ltd. | Mobile home robot and control method thereof |
US10878808B1 (en) * | 2018-01-09 | 2020-12-29 | Amazon Technologies, Inc. | Speech processing dialog management |
KR20190084789A (ko) * | 2018-01-09 | 2019-07-17 | LG Electronics Inc. | Electronic device and control method therefor |
US10845937B2 (en) * | 2018-01-11 | 2020-11-24 | International Business Machines Corporation | Semantic representation and realization for conversational systems |
US20190213284A1 (en) | 2018-01-11 | 2019-07-11 | International Business Machines Corporation | Semantic representation and realization for conversational systems |
US10795332B2 (en) * | 2018-01-16 | 2020-10-06 | Resilience Magnum IP, LLC | Facilitating automating home control |
EP3514564B1 (en) * | 2018-01-19 | 2023-05-31 | Centre National D'etudes Spatiales | Indoor positioning system |
US11475899B2 (en) | 2018-01-23 | 2022-10-18 | Cirrus Logic, Inc. | Speaker identification |
US11264037B2 (en) | 2018-01-23 | 2022-03-01 | Cirrus Logic, Inc. | Speaker identification |
US11735189B2 (en) | 2018-01-23 | 2023-08-22 | Cirrus Logic, Inc. | Speaker identification |
US20190235831A1 (en) * | 2018-01-31 | 2019-08-01 | Amazon Technologies, Inc. | User input processing restriction in a speech processing system |
US10991369B1 (en) * | 2018-01-31 | 2021-04-27 | Progress Software Corporation | Cognitive flow |
US11941114B1 (en) * | 2018-01-31 | 2024-03-26 | Vivint, Inc. | Deterrence techniques for security and automation systems |
WO2019152722A1 (en) | 2018-01-31 | 2019-08-08 | Sonos, Inc. | Device designation of playback and network microphone device arrangements |
US10431207B2 (en) * | 2018-02-06 | 2019-10-01 | Robert Bosch Gmbh | Methods and systems for intent detection and slot filling in spoken dialogue systems |
US20190251961A1 (en) * | 2018-02-15 | 2019-08-15 | Lenovo (Singapore) Pte. Ltd. | Transcription of audio communication to identify command to device |
US20190259500A1 (en) * | 2018-02-20 | 2019-08-22 | International Business Machines Corporation | Health Behavior Change for Intelligent Personal Assistants |
JP2019144790A (ja) * | 2018-02-20 | 2019-08-29 | Fuji Xerox Co., Ltd. | Information processing device and program |
US10878824B2 (en) * | 2018-02-21 | 2020-12-29 | Valyant Al, Inc. | Speech-to-text generation using video-speech matching from a primary speaker |
EP3723084A1 (en) | 2018-03-07 | 2020-10-14 | Google LLC | Facilitating end-to-end communications with automated assistants in multiple languages |
US11567182B2 (en) | 2018-03-09 | 2023-01-31 | Innovusion, Inc. | LiDAR safety systems and methods |
US10777203B1 (en) * | 2018-03-23 | 2020-09-15 | Amazon Technologies, Inc. | Speech interface device with caching component |
US10600408B1 (en) * | 2018-03-23 | 2020-03-24 | Amazon Technologies, Inc. | Content output management based on speech quality |
US20190295541A1 (en) * | 2018-03-23 | 2019-09-26 | Polycom, Inc. | Modifying spoken commands |
US10818288B2 (en) * | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
US11132504B1 (en) * | 2018-03-27 | 2021-09-28 | Soundhound, Inc. | Framework for understanding complex natural language queries in a dialog context |
US11115630B1 (en) * | 2018-03-28 | 2021-09-07 | Amazon Technologies, Inc. | Custom and automated audio prompts for devices |
US10733996B2 (en) * | 2018-03-30 | 2020-08-04 | Qualcomm Incorporated | User authentication |
US20190311713A1 (en) * | 2018-04-05 | 2019-10-10 | GM Global Technology Operations LLC | System and method to fulfill a speech request |
US10720166B2 (en) * | 2018-04-09 | 2020-07-21 | Synaptics Incorporated | Voice biometrics systems and methods |
US10943606B2 (en) * | 2018-04-12 | 2021-03-09 | Qualcomm Incorporated | Context-based detection of end-point of utterance |
KR102443052B1 (ko) * | 2018-04-13 | 2022-09-14 | Samsung Electronics Co., Ltd. | Air conditioner and control method thereof |
US11676220B2 (en) | 2018-04-20 | 2023-06-13 | Meta Platforms, Inc. | Processing multimodal user input for assistant systems |
US11386342B2 (en) * | 2018-04-20 | 2022-07-12 | H2O.Ai Inc. | Model interpretation |
US11922283B2 (en) | 2018-04-20 | 2024-03-05 | H2O.Ai Inc. | Model interpretation |
US11010436B1 (en) | 2018-04-20 | 2021-05-18 | Facebook, Inc. | Engaging users by personalized composing-content recommendation |
US11715042B1 (en) | 2018-04-20 | 2023-08-01 | Meta Platforms Technologies, Llc | Interpretability of deep reinforcement learning models in assistant systems |
US11886473B2 (en) | 2018-04-20 | 2024-01-30 | Meta Platforms, Inc. | Intent identification for agent matching by assistant systems |
EP3562090B1 (en) * | 2018-04-25 | 2020-07-01 | Siemens Aktiengesellschaft | Data processing device for processing a radio signal |
USD960177S1 (en) | 2018-05-03 | 2022-08-09 | CACI, Inc.—Federal | Display screen or portion thereof with graphical user interface |
US11256548B2 (en) | 2018-05-03 | 2022-02-22 | LGS Innovations LLC | Systems and methods for cloud computing data processing |
US10890969B2 (en) * | 2018-05-04 | 2021-01-12 | Google Llc | Invoking automated assistant function(s) based on detected gesture and gaze |
EP3859494B1 (en) | 2018-05-04 | 2023-12-27 | Google LLC | Adapting automated assistant based on detected mouth movement and/or gaze |
CN112639718B (zh) | 2018-05-04 | 2024-09-03 | Google LLC | Hot-word-free adaptation of automated assistant functions |
CN112204500B (zh) * | 2018-05-04 | 2024-09-10 | Google LLC | Generating and/or adapting automated assistant content according to the distance between the user and the automated assistant interface |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US11487720B2 (en) * | 2018-05-08 | 2022-11-01 | Palantir Technologies Inc. | Unified data model and interface for databases storing disparate types of data |
US11308950B2 (en) * | 2018-05-09 | 2022-04-19 | 4PLAN Corporation | Personal location system for virtual assistant |
US11175880B2 (en) | 2018-05-10 | 2021-11-16 | Sonos, Inc. | Systems and methods for voice-assisted media content selection |
US10847178B2 (en) | 2018-05-18 | 2020-11-24 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection |
CN108877791B (zh) * | 2018-05-23 | 2021-10-08 | Baidu Online Network Technology (Beijing) Co., Ltd. | View-based voice interaction method, apparatus, server, terminal, and medium |
US11704533B2 (en) * | 2018-05-23 | 2023-07-18 | Ford Global Technologies, Llc | Always listening and active voice assistant and vehicle operation |
US10959029B2 (en) | 2018-05-25 | 2021-03-23 | Sonos, Inc. | Determining and adapting to changes in microphone performance of playback devices |
US20190362318A1 (en) * | 2018-05-28 | 2019-11-28 | Open Invention Network Llc | Audio-based notifications |
US11556897B2 (en) | 2018-05-31 | 2023-01-17 | Microsoft Technology Licensing, Llc | Job-post budget recommendation based on performance |
JP7151181B2 (ja) * | 2018-05-31 | 2022-10-12 | Toyota Motor Corporation | Voice dialogue system, processing method therefor, and program |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
DK179822B1 (da) | 2018-06-01 | 2019-07-12 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
DK201870355A1 (en) | 2018-06-01 | 2019-12-16 | Apple Inc. | VIRTUAL ASSISTANT OPERATION IN MULTI-DEVICE ENVIRONMENTS |
DK180639B1 (en) | 2018-06-01 | 2021-11-04 | Apple Inc | DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT |
US11170085B2 (en) | 2018-06-03 | 2021-11-09 | Apple Inc. | Implementation of biometric authentication |
US10979242B2 (en) * | 2018-06-05 | 2021-04-13 | Sap Se | Intelligent personal assistant controller where a voice command specifies a target appliance based on a confidence score without requiring uttering of a wake-word |
WO2019244455A1 (ja) * | 2018-06-21 | 2019-12-26 | Sony Corporation | Information processing device and information processing method |
US10818296B2 (en) * | 2018-06-21 | 2020-10-27 | Intel Corporation | Method and system of robust speaker recognition activation |
JP7326707B2 (ja) | 2018-06-21 | 2023-08-16 | Casio Computer Co., Ltd. | Robot, robot control method, and program |
US11048782B2 (en) * | 2018-06-26 | 2021-06-29 | Lenovo (Singapore) Pte. Ltd. | User identification notification for non-personal device |
US11658926B2 (en) | 2018-06-27 | 2023-05-23 | Microsoft Technology Licensing, Llc | Generating smart replies involving image files |
US10777196B2 (en) * | 2018-06-27 | 2020-09-15 | The Travelers Indemnity Company | Systems and methods for cooperatively-overlapped and artificial intelligence managed interfaces |
US11062084B2 (en) * | 2018-06-27 | 2021-07-13 | Microsoft Technology Licensing, Llc | Generating diverse smart replies using synonym hierarchy |
US11188194B2 (en) | 2018-06-27 | 2021-11-30 | Microsoft Technology Licensing, Llc | Personalization and synonym hierarchy for smart replies |
US10681460B2 (en) | 2018-06-28 | 2020-06-09 | Sonos, Inc. | Systems and methods for associating playback devices with voice assistant services |
DE112019003383T5 (de) * | 2018-07-03 | 2021-04-08 | Sony Corporation | Information processing device and information processing method |
CN109101801B (zh) * | 2018-07-12 | 2021-04-27 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method, apparatus, device, and computer-readable storage medium for identity authentication |
US11099753B2 (en) * | 2018-07-27 | 2021-08-24 | EMC IP Holding Company LLC | Method and apparatus for dynamic flow control in distributed storage systems |
US10692490B2 (en) | 2018-07-31 | 2020-06-23 | Cirrus Logic, Inc. | Detection of replay attack |
US11164037B2 (en) * | 2018-08-01 | 2021-11-02 | International Business Machines Corporation | Object instance ambiguity resolution |
CN109243435B (zh) * | 2018-08-07 | 2022-01-11 | Beijing Yunji Technology Co., Ltd. | Voice command execution method and system |
US20240095544A1 (en) * | 2018-08-07 | 2024-03-21 | Meta Platforms, Inc. | Augmenting Conversational Response with Volatility Information for Assistant Systems |
US11614526B1 (en) | 2018-08-24 | 2023-03-28 | Innovusion, Inc. | Virtual windows for LIDAR safety systems and methods |
US20200065513A1 (en) * | 2018-08-24 | 2020-02-27 | International Business Machines Corporation | Controlling content and content sources according to situational context |
MX2021002187A (es) * | 2018-08-24 | 2021-08-11 | Lutron Tech Co Llc | Device for detecting occupants |
TWI682292B (zh) * | 2018-08-24 | 2020-01-11 | 內秋應智能科技股份有限公司 | Intelligent voice device for recursively integrated dialogue |
CN109242090B (zh) * | 2018-08-28 | 2020-06-26 | University of Electronic Science and Technology of China | Video description and description-consistency discrimination method based on a GAN network |
US10461710B1 (en) | 2018-08-28 | 2019-10-29 | Sonos, Inc. | Media playback system with maximum volume setting |
US11076035B2 (en) | 2018-08-28 | 2021-07-27 | Sonos, Inc. | Do not disturb feature for audio notifications |
US11402499B1 (en) | 2018-08-29 | 2022-08-02 | Amazon Technologies, Inc. | Processing audio signals for presence detection |
US10795018B1 (en) * | 2018-08-29 | 2020-10-06 | Amazon Technologies, Inc. | Presence detection using ultrasonic signals |
TWI676136B (zh) * | 2018-08-31 | 2019-11-01 | 雲云科技股份有限公司 | Image detection method and image detection device using dual analysis |
US10915614B2 (en) | 2018-08-31 | 2021-02-09 | Cirrus Logic, Inc. | Biometric authentication |
US11037574B2 (en) | 2018-09-05 | 2021-06-15 | Cirrus Logic, Inc. | Speaker recognition and speaker change detection |
EP3620909B1 (en) * | 2018-09-06 | 2022-11-02 | Infineon Technologies AG | Method for a virtual assistant, data processing system hosting a virtual assistant for a user and agent device for enabling a user to interact with a virtual assistant |
CN109255181B (zh) * | 2018-09-07 | 2019-12-24 | Baidu Online Network Technology (Beijing) Co., Ltd. | Multi-model-based obstacle distribution simulation method, apparatus, and terminal |
US10757207B1 (en) * | 2018-09-07 | 2020-08-25 | Amazon Technologies, Inc. | Presence detection |
US10891949B2 (en) * | 2018-09-10 | 2021-01-12 | Ford Global Technologies, Llc | Vehicle language processing |
US11163981B2 (en) * | 2018-09-11 | 2021-11-02 | Apple Inc. | Periocular facial recognition switching |
US10878811B2 (en) | 2018-09-14 | 2020-12-29 | Sonos, Inc. | Networked devices, systems, and methods for intelligently deactivating wake-word engines |
US10587430B1 (en) | 2018-09-14 | 2020-03-10 | Sonos, Inc. | Networked devices, systems, and methods for associating playback devices based on sound codes |
CN110908289A (zh) * | 2018-09-17 | 2020-03-24 | Gree Electric Appliances, Inc. of Zhuhai | Smart home control method and apparatus |
US11040441B2 (en) * | 2018-09-20 | 2021-06-22 | Sony Group Corporation | Situation-aware robot |
US11024331B2 (en) | 2018-09-21 | 2021-06-01 | Sonos, Inc. | Voice detection optimization using sound metadata |
AU2018443902B2 (en) | 2018-09-24 | 2021-05-13 | Google Llc | Controlling a device based on processing of image data that captures the device and/or an installation environment of the device |
US11049501B2 (en) | 2018-09-25 | 2021-06-29 | International Business Machines Corporation | Speech-to-text transcription with multiple languages |
US11308939B1 (en) * | 2018-09-25 | 2022-04-19 | Amazon Technologies, Inc. | Wakeword detection using multi-word model |
US10811015B2 (en) | 2018-09-25 | 2020-10-20 | Sonos, Inc. | Voice detection optimization based on selected voice assistant service |
US11100349B2 (en) | 2018-09-28 | 2021-08-24 | Apple Inc. | Audio assisted enrollment |
US11100923B2 (en) | 2018-09-28 | 2021-08-24 | Sonos, Inc. | Systems and methods for selective wake word detection using neural network models |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US11210756B2 (en) * | 2018-09-28 | 2021-12-28 | Ford Global Technologies, Llc | Ride request interactions |
US10902208B2 (en) | 2018-09-28 | 2021-01-26 | International Business Machines Corporation | Personalized interactive semantic parsing using a graph-to-sequence model |
US10860096B2 (en) * | 2018-09-28 | 2020-12-08 | Apple Inc. | Device control using gaze information |
US10692518B2 (en) | 2018-09-29 | 2020-06-23 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection via multiple network microphone devices |
US10846105B2 (en) * | 2018-09-29 | 2020-11-24 | ILAN Yehuda Granot | User interface advisor |
US11238294B2 (en) * | 2018-10-08 | 2022-02-01 | Google Llc | Enrollment with an automated assistant |
CN112313741A (zh) * | 2018-10-08 | 2021-02-02 | Google LLC | Selective enrollment with an automated assistant |
US11409961B2 (en) * | 2018-10-10 | 2022-08-09 | Verint Americas Inc. | System for minimizing repetition in intelligent virtual assistant conversations |
US11899519B2 (en) | 2018-10-23 | 2024-02-13 | Sonos, Inc. | Multiple stage network microphone device with reduced power consumption and processing load |
CN109376669A (zh) * | 2018-10-30 | 2019-02-22 | Nanchang Nubia Technology Co., Ltd. | Intelligent assistant control method, mobile terminal, and computer-readable storage medium |
EP3647910A1 (en) * | 2018-10-30 | 2020-05-06 | Infineon Technologies AG | An improved apparatus for user interaction |
US10594837B1 (en) * | 2018-11-02 | 2020-03-17 | International Business Machines Corporation | Predictive service scaling for conversational computing |
US11004454B1 (en) * | 2018-11-06 | 2021-05-11 | Amazon Technologies, Inc. | Voice profile updating |
US11200884B1 (en) * | 2018-11-06 | 2021-12-14 | Amazon Technologies, Inc. | Voice profile updating |
US11138374B1 (en) * | 2018-11-08 | 2021-10-05 | Amazon Technologies, Inc. | Slot type authoring |
US11308281B1 (en) * | 2018-11-08 | 2022-04-19 | Amazon Technologies, Inc. | Slot type resolution process |
US11281857B1 (en) * | 2018-11-08 | 2022-03-22 | Amazon Technologies, Inc. | Composite slot type resolution |
US10896034B2 (en) * | 2018-11-14 | 2021-01-19 | Babu Vinod | Methods and systems for automated screen display generation and configuration |
US11288733B2 (en) * | 2018-11-14 | 2022-03-29 | Mastercard International Incorporated | Interactive 3D image projection systems and methods |
EP3654249A1 (en) | 2018-11-15 | 2020-05-20 | Snips | Dilated convolutions and gating for efficient keyword spotting |
US11037576B2 (en) * | 2018-11-15 | 2021-06-15 | International Business Machines Corporation | Distributed machine-learned emphatic communication for machine-to-human and machine-to-machine interactions |
US11423073B2 (en) | 2018-11-16 | 2022-08-23 | Microsoft Technology Licensing, Llc | System and management of semantic indicators during document presentations |
US10657968B1 (en) * | 2018-11-19 | 2020-05-19 | Google Llc | Controlling device output according to a determined condition of a user |
GB201819429D0 (en) * | 2018-11-29 | 2019-01-16 | Holovis International Ltd | Apparatus and method |
CN118609546A (zh) * | 2018-12-03 | 2024-09-06 | Google LLC | Text-independent speaker recognition |
US10839167B2 (en) * | 2018-12-04 | 2020-11-17 | Verizon Patent And Licensing Inc. | Systems and methods for dynamically expanding natural language processing agent capacity |
US10720150B2 (en) * | 2018-12-05 | 2020-07-21 | Bank Of America Corporation | Augmented intent and entity extraction using pattern recognition interstitial regular expressions |
JP7194897B2 (ja) * | 2018-12-06 | 2022-12-23 | Panasonic Intellectual Property Management Co., Ltd. | Signal processing device and signal processing method |
US11183183B2 (en) | 2018-12-07 | 2021-11-23 | Sonos, Inc. | Systems and methods of operating media playback systems having multiple voice assistant services |
US10783901B2 (en) * | 2018-12-10 | 2020-09-22 | Amazon Technologies, Inc. | Alternate response generation |
US10853576B2 (en) * | 2018-12-13 | 2020-12-01 | Hong Kong Applied Science and Technology Research Institute Company Limited | Efficient and accurate named entity recognition method and apparatus |
US11132989B2 (en) | 2018-12-13 | 2021-09-28 | Sonos, Inc. | Networked microphone devices, systems, and methods of localized arbitration |
US10891336B2 (en) * | 2018-12-13 | 2021-01-12 | International Business Machines Corporation | Collaborative learned scoping to extend data reach for a search request |
US10602268B1 (en) | 2018-12-20 | 2020-03-24 | Sonos, Inc. | Optimization of network microphone devices using noise classification |
US11417236B2 (en) * | 2018-12-28 | 2022-08-16 | Intel Corporation | Real-time language learning within a smart space |
US11615793B1 (en) | 2019-01-02 | 2023-03-28 | Centene Corporation | Voice assistant configured to leverage information from sensing devices |
US11604832B2 (en) * | 2019-01-03 | 2023-03-14 | Lucomm Technologies, Inc. | System for physical-virtual environment fusion |
US11613010B2 (en) * | 2019-01-03 | 2023-03-28 | Lucomm Technologies, Inc. | Flux sensing system |
US11562565B2 (en) * | 2019-01-03 | 2023-01-24 | Lucomm Technologies, Inc. | System for physical-virtual environment fusion |
CN109800294B (zh) * | 2019-01-08 | 2020-10-13 | Institute of Automation, Chinese Academy of Sciences | Autonomously evolving intelligent dialogue method, system, and device based on physical-environment games |
US11164562B2 (en) * | 2019-01-10 | 2021-11-02 | International Business Machines Corporation | Entity-level clarification in conversation services |
US10860864B2 (en) * | 2019-01-16 | 2020-12-08 | Charter Communications Operating, Llc | Surveillance and image analysis in a monitored environment |
US10867447B2 (en) * | 2019-01-21 | 2020-12-15 | Capital One Services, Llc | Overlaying 3D augmented reality content on real-world objects using image segmentation |
DE102019200733A1 (de) * | 2019-01-22 | 2020-07-23 | Carl Zeiss Industrielle Messtechnik GmbH | Method and device for determining at least one spatial position and orientation of at least one tracked measuring device |
US11069081B1 (en) | 2019-01-25 | 2021-07-20 | Google Llc | Location discovery |
US10867604B2 (en) | 2019-02-08 | 2020-12-15 | Sonos, Inc. | Devices, systems, and methods for distributed voice processing |
US11315556B2 (en) | 2019-02-08 | 2022-04-26 | Sonos, Inc. | Devices, systems, and methods for distributed voice processing by transmitting sound data associated with a wake word to an appropriate device for identification |
JP6851565B2 (ja) * | 2019-02-12 | 2021-03-31 | Mitsubishi Electric Corporation | Device control apparatus, device control system, device control method, and device control program |
CN109767769B (zh) * | 2019-02-21 | 2020-12-22 | Gree Electric Appliances, Inc. of Zhuhai | Speech recognition method, apparatus, storage medium, and air conditioner |
KR20210134741A (ko) * | 2019-03-01 | 2021-11-10 | Google LLC | Methods, systems, and media for dynamically adapting assistant responses |
US11488063B2 (en) * | 2019-03-05 | 2022-11-01 | Honeywell International Inc. | Systems and methods for cognitive services of a connected FMS or avionics SaaS platform |
US11455987B1 (en) * | 2019-03-06 | 2022-09-27 | Amazon Technologies, Inc. | Multiple skills processing |
US20220129905A1 (en) * | 2019-03-08 | 2022-04-28 | [24]7.ai, Inc. | Agent console for facilitating assisted customer engagement |
CN110060389A (zh) * | 2019-03-13 | 2019-07-26 | Foshan Viomi Electrical Technology Co., Ltd. | Method for a smart door lock to identify family members |
CN110012266A (zh) * | 2019-03-14 | 2019-07-12 | 中电海康集团有限公司 | System and method for standardizing law-enforcement management at police stations |
US11346938B2 (en) | 2019-03-15 | 2022-05-31 | Msa Technology, Llc | Safety device for providing output to an individual associated with a hazardous environment |
EP3709194A1 (en) | 2019-03-15 | 2020-09-16 | Spotify AB | Ensemble-based data comparison |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
US20200304375A1 (en) * | 2019-03-19 | 2020-09-24 | Microsoft Technology Licensing, Llc | Generation of digital twins of physical environments |
US10984783B2 (en) * | 2019-03-27 | 2021-04-20 | Intel Corporation | Spoken keyword detection based utterance-level wake on intent system |
US11698440B2 (en) * | 2019-04-02 | 2023-07-11 | Universal City Studios Llc | Tracking aggregation and alignment |
EP3719532B1 (en) | 2019-04-04 | 2022-12-28 | Transrobotics, Inc. | Technologies for acting based on object tracking |
DE102019205040A1 (de) * | 2019-04-09 | 2020-10-15 | Sivantos Pte. Ltd. | Hearing aid and method for operating such a hearing aid |
US11222625B2 (en) * | 2019-04-15 | 2022-01-11 | Ademco Inc. | Systems and methods for training devices to recognize sound patterns |
WO2020213996A1 (en) * | 2019-04-17 | 2020-10-22 | Samsung Electronics Co., Ltd. | Method and apparatus for interrupt detection |
US11069346B2 (en) * | 2019-04-22 | 2021-07-20 | International Business Machines Corporation | Intent recognition model creation from randomized intent vector proximities |
CN110096191B (zh) * | 2019-04-24 | 2021-06-29 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Human-machine dialogue method, apparatus, and electronic device |
CN111951782B (zh) * | 2019-04-30 | 2024-09-10 | BOE Technology Group Co., Ltd. | Voice question-answering method and apparatus, computer-readable storage medium, and electronic device |
CN110111787B (zh) * | 2019-04-30 | 2021-07-09 | Huawei Technologies Co., Ltd. | Semantic parsing method and server |
US11120794B2 (en) | 2019-05-03 | 2021-09-14 | Sonos, Inc. | Voice assistant persistence across multiple network microphone devices |
US11281862B2 (en) * | 2019-05-03 | 2022-03-22 | Sap Se | Significant correlation framework for command translation |
DK201970509A1 (en) | 2019-05-06 | 2021-01-15 | Apple Inc | Spoken notifications |
US11240560B2 (en) | 2019-05-06 | 2022-02-01 | Google Llc | Assigning priority for an automated assistant according to a dynamic user queue and/or multi-modality presence detection |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
CN113785354A (zh) * | 2019-05-06 | 2021-12-10 | Google LLC | Selectively activating on-device speech recognition, and using recognized text in selectively activated on-device NLU and/or on-device fulfillment |
GB2583742B (en) * | 2019-05-08 | 2023-10-25 | Jaguar Land Rover Ltd | Activity identification method and apparatus |
CN110082723B (zh) * | 2019-05-16 | 2022-03-15 | Zhejiang Dahua Technology Co., Ltd. | Sound source localization method, apparatus, device, and storage medium |
CN111984766B (zh) * | 2019-05-21 | 2023-02-24 | Huawei Technologies Co., Ltd. | Missing-semantics completion method and apparatus |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
CN110176024B (zh) * | 2019-05-21 | 2023-06-02 | Tencent Technology (Shenzhen) Co., Ltd. | Method, apparatus, device, and storage medium for detecting a target in video |
US11272171B1 (en) * | 2019-05-24 | 2022-03-08 | Facebook Technologies, Llc | Systems and methods for fallback tracking based on real-time tracking performance |
US11482210B2 (en) * | 2019-05-29 | 2022-10-25 | Lg Electronics Inc. | Artificial intelligence device capable of controlling other devices based on device information |
CN110191320B (zh) * | 2019-05-29 | 2021-03-16 | Hefei University | Video jitter and freeze detection method and device based on pixel-level temporal motion analysis |
US10728384B1 (en) * | 2019-05-29 | 2020-07-28 | Intuit Inc. | System and method for redaction of sensitive audio events of call recordings |
DK180129B1 (en) | 2019-05-31 | 2020-06-02 | Apple Inc. | USER ACTIVITY SHORTCUT SUGGESTIONS |
DK201970510A1 (en) | 2019-05-31 | 2021-02-11 | Apple Inc | Voice identification in digital assistant systems |
JP6648876B1 (ja) * | 2019-05-31 | 2020-02-14 | Fujitsu Limited | Conversation control program, conversation control method, and information processing device |
US11468890B2 (en) | 2019-06-01 | 2022-10-11 | Apple Inc. | Methods and user interfaces for voice-based control of electronic devices |
US11256868B2 (en) * | 2019-06-03 | 2022-02-22 | Microsoft Technology Licensing, Llc | Architecture for resolving ambiguous user utterance |
US11302330B2 (en) * | 2019-06-03 | 2022-04-12 | Microsoft Technology Licensing, Llc | Clarifying questions for rewriting ambiguous user utterance |
WO2020246640A1 (ko) * | 2019-06-05 | 2020-12-10 | LG Electronics Inc. | Artificial intelligence device for determining a user's location, and method therefor |
US11996098B2 (en) * | 2019-06-05 | 2024-05-28 | Hewlett-Packard Development Company, L.P. | Missed utterance resolutions |
WO2020246975A1 (en) * | 2019-06-05 | 2020-12-10 | Google Llc | Action validation for digital assistant-based applications |
US20200388280A1 (en) | 2019-06-05 | 2020-12-10 | Google Llc | Action validation for digital assistant-based applications |
US11200894B2 (en) | 2019-06-12 | 2021-12-14 | Sonos, Inc. | Network microphone device with command keyword eventing |
JP7279205B2 (ja) | 2019-06-12 | 2023-05-22 | LivePerson, Inc. | Systems and methods for external system integration |
US10586540B1 (en) * | 2019-06-12 | 2020-03-10 | Sonos, Inc. | Network microphone device with command keyword conditioning |
US11361756B2 (en) | 2019-06-12 | 2022-06-14 | Sonos, Inc. | Conditional wake word eventing based on environment |
KR20210001082A (ko) * | 2019-06-26 | 2021-01-06 | Samsung Electronics Co., Ltd. | Electronic device for processing user utterances and operating method therefor |
US11281727B2 (en) | 2019-07-03 | 2022-03-22 | International Business Machines Corporation | Methods and systems for managing virtual assistants in multiple device environments based on user movements |
US20220284920A1 (en) * | 2019-07-05 | 2022-09-08 | Gn Audio A/S | A method and a noise indicator system for identifying one or more noisy persons |
KR20220027935A (ko) | 2019-07-08 | 2022-03-08 | Samsung Electronics Co., Ltd. | Method and system for processing a conversation between an electronic device and a user |
WO2021012263A1 (en) * | 2019-07-25 | 2021-01-28 | Baidu.Com Times Technology (Beijing) Co., Ltd. | Systems and methods for end-to-end deep reinforcement learning based coreference resolution |
CN110196914B (zh) * | 2019-07-29 | 2019-12-27 | 上海肇观电子科技有限公司 | Method and device for entering facial information into a database |
US11138975B2 (en) | 2019-07-31 | 2021-10-05 | Sonos, Inc. | Locally distributed keyword detection |
US11269872B1 (en) | 2019-07-31 | 2022-03-08 | Splunk Inc. | Intent-based natural language processing system |
US10871943B1 (en) | 2019-07-31 | 2020-12-22 | Sonos, Inc. | Noise classification for event detection |
US11138969B2 (en) | 2019-07-31 | 2021-10-05 | Sonos, Inc. | Locally distributed keyword detection |
US12008119B1 (en) * | 2019-07-31 | 2024-06-11 | United Services Automobile Association (Usaa) | Intelligent voice assistant privacy regulating system |
US20220245520A1 (en) * | 2019-08-02 | 2022-08-04 | Google Llc | Systems and Methods for Generating and Providing Suggested Actions |
GB2586242B (en) * | 2019-08-13 | 2022-07-06 | Innovative Tech Ltd | A method of enrolling a new member to a facial image database |
KR20210024861A (ko) * | 2019-08-26 | 2021-03-08 | Samsung Electronics Co., Ltd. | Method and electronic device for providing a conversation service |
US11184298B2 (en) * | 2019-08-28 | 2021-11-23 | International Business Machines Corporation | Methods and systems for improving chatbot intent training by correlating user feedback provided subsequent to a failed response to an initial user intent |
US11094319B2 (en) | 2019-08-30 | 2021-08-17 | Spotify Ab | Systems and methods for generating a cleaned version of ambient sound |
WO2021045181A1 (ja) * | 2019-09-04 | 2021-03-11 | Panasonic Intellectual Property Corporation of America | Communication device and communication method |
US12086541B2 (en) | 2019-09-04 | 2024-09-10 | Brain Technologies, Inc. | Natural query completion for a real-time morphing interface |
WO2021046449A1 (en) * | 2019-09-04 | 2021-03-11 | Brain Technologies, Inc. | Real-time morphing interface for display on a computer screen |
US11514911B2 (en) * | 2019-09-12 | 2022-11-29 | Oracle International Corporation | Reduced training for dialog systems using a database |
US11694032B2 (en) * | 2019-09-12 | 2023-07-04 | Oracle International Corporation | Template-based intent classification for chatbots |
CN112542157A (zh) * | 2019-09-23 | 2021-03-23 | Beijing SoundAI Technology Co., Ltd. | Speech processing method and apparatus, electronic device, and computer-readable storage medium |
CN110798506B (zh) * | 2019-09-27 | 2023-03-10 | Huawei Technologies Co., Ltd. | Method, apparatus, and device for executing a command |
US11070721B2 (en) * | 2019-09-27 | 2021-07-20 | Gm Cruise Holdings Llc | Intent-based dynamic change of compute resources of vehicle perception system |
US11037000B2 (en) | 2019-09-27 | 2021-06-15 | Gm Cruise Holdings Llc | Intent-based dynamic change of resolution and region of interest of vehicle perception system |
US11238863B2 (en) * | 2019-09-30 | 2022-02-01 | Lenovo (Singapore) Pte. Ltd. | Query disambiguation using environmental audio |
US11223922B2 (en) * | 2019-10-17 | 2022-01-11 | Gulfstream Aerospace Corporation | Directional sound system for a vehicle |
US11308284B2 (en) | 2019-10-18 | 2022-04-19 | Facebook Technologies, Llc. | Smart cameras enabled by assistant systems |
US11567788B1 (en) | 2019-10-18 | 2023-01-31 | Meta Platforms, Inc. | Generating proactive reminders for assistant systems |
US11189286B2 (en) | 2019-10-22 | 2021-11-30 | Sonos, Inc. | VAS toggle based on device orientation |
TWI705383B (zh) * | 2019-10-25 | 2020-09-21 | Wistron Corporation | Person tracking system and person tracking method
US11289086B2 (en) * | 2019-11-01 | 2022-03-29 | Microsoft Technology Licensing, Llc | Selective response rendering for virtual assistants |
EP4055779A4 (en) * | 2019-11-05 | 2023-08-23 | Qualcomm Incorporated | SENSOR PERFORMANCE INDICATION |
US11227583B2 (en) * | 2019-11-05 | 2022-01-18 | International Business Machines Corporation | Artificial intelligence voice response system having variable modes for interaction with user |
KR20210055347 (ko) | 2019-11-07 | 2021-05-17 | LG Electronics Inc. | Artificial intelligence device
JP7418563B2 (ja) * | 2019-11-08 | 2024-01-19 | Google LLC | Using corrections of automated assistant functionality for training on-device machine learning models
US11687802B2 (en) * | 2019-11-13 | 2023-06-27 | Walmart Apollo, Llc | Systems and methods for proactively predicting user intents in personal agents |
US12026357B2 (en) * | 2019-11-13 | 2024-07-02 | Walmart Apollo, Llc | Personalizing user interface displays in real-time |
CN110928993B (zh) * | 2019-11-26 | 2023-06-30 | Chongqing University of Posts and Telecommunications | User location prediction method and system based on a deep recurrent neural network
KR102650488B1 (ko) | 2019-11-29 | 2024-03-25 | Samsung Electronics Co., Ltd. | Electronic device and control method thereof
CN112988986B (zh) * | 2019-12-02 | 2024-05-31 | Alibaba Group Holding Limited | Human-computer interaction method, apparatus, and device
CN113261056B (zh) * | 2019-12-04 | 2024-08-02 | Google LLC | Speaker awareness using speaker-dependent speech models
US11676586B2 (en) * | 2019-12-10 | 2023-06-13 | Rovi Guides, Inc. | Systems and methods for providing voice command recommendations |
US11095578B2 (en) * | 2019-12-11 | 2021-08-17 | International Business Machines Corporation | Technology for chat bot translation |
US20230169959A1 (en) * | 2019-12-11 | 2023-06-01 | Google Llc | Processing concurrently received utterances from multiple users |
US11586677B2 (en) | 2019-12-12 | 2023-02-21 | International Business Machines Corporation | Resolving user expression having dependent intents |
US11481442B2 (en) | 2019-12-12 | 2022-10-25 | International Business Machines Corporation | Leveraging intent resolvers to determine multiple intents |
US11444893B1 (en) | 2019-12-13 | 2022-09-13 | Wells Fargo Bank, N.A. | Enhanced chatbot responses during conversations with unknown users based on maturity metrics determined from history of chatbot interactions |
EP3839802A1 (en) | 2019-12-16 | 2021-06-23 | Jetpack | Anonymized multi-sensor people tracking |
US11200900B2 (en) | 2019-12-20 | 2021-12-14 | Sonos, Inc. | Offline voice control |
CN111160002B (zh) * | 2019-12-27 | 2022-03-01 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and apparatus for outputting parsing-anomaly information in spoken language understanding
CN111161746B (zh) * | 2019-12-31 | 2022-04-15 | AISpeech Co., Ltd. | Voiceprint registration method and system
US11687778B2 (en) | 2020-01-06 | 2023-06-27 | The Research Foundation For The State University Of New York | Fakecatcher: detection of synthetic portrait videos using biological signals |
US20230037085A1 (en) * | 2020-01-07 | 2023-02-02 | Google Llc | Preventing non-transient storage of assistant interaction data and/or wiping of stored assistant interaction data |
CN111274368B (zh) * | 2020-01-07 | 2024-04-16 | Beijing SoundAI Technology Co., Ltd. | Slot filling method and apparatus
US11562740B2 (en) | 2020-01-07 | 2023-01-24 | Sonos, Inc. | Voice verification for media playback |
CN111274815B (zh) * | 2020-01-15 | 2024-04-12 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and apparatus for mining entity points of interest in text
US20220310089A1 (en) * | 2020-01-17 | 2022-09-29 | Google Llc | Selectively invoking an automated assistant based on detected environmental conditions without necessitating voice-based invocation of the automated assistant |
CN113138557B (zh) * | 2020-01-17 | 2023-11-28 | Beijing Xiaomi Mobile Software Co., Ltd. | Home device control method, apparatus, and storage medium
US20210234823A1 (en) * | 2020-01-27 | 2021-07-29 | Antitoxin Technologies Inc. | Detecting and identifying toxic and offensive social interactions in digital communications |
EP3855348A1 (en) * | 2020-01-27 | 2021-07-28 | Microsoft Technology Licensing, LLC | Error management |
US11556307B2 (en) | 2020-01-31 | 2023-01-17 | Sonos, Inc. | Local voice data processing |
US11308958B2 (en) | 2020-02-07 | 2022-04-19 | Sonos, Inc. | Localized wakeword verification |
US11308959B2 (en) | 2020-02-11 | 2022-04-19 | Spotify Ab | Dynamic adjustment of wake word acceptance tolerance thresholds in voice-controlled devices |
US11328722B2 (en) * | 2020-02-11 | 2022-05-10 | Spotify Ab | Systems and methods for generating a singular voice audio stream |
WO2021162489A1 (en) * | 2020-02-12 | 2021-08-19 | Samsung Electronics Co., Ltd. | Method and voice assistance apparatus for providing an intelligence response |
US11816942B2 (en) * | 2020-02-19 | 2023-11-14 | TruU, Inc. | Detecting intent of a user requesting access to a secured asset |
CN111368046B (zh) * | 2020-02-24 | 2021-07-16 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Human-machine dialogue method and apparatus, electronic device, and storage medium
CN111281358A (zh) * | 2020-02-24 | 2020-06-16 | Xiangtan University | Real-time infant-monitoring robot system
US11423232B2 (en) * | 2020-03-17 | 2022-08-23 | NFL Enterprises LLC | Systems and methods for deploying computerized conversational agents |
US11551685B2 (en) * | 2020-03-18 | 2023-01-10 | Amazon Technologies, Inc. | Device-directed utterance detection |
US11645563B2 (en) * | 2020-03-26 | 2023-05-09 | International Business Machines Corporation | Data filtering with fuzzy attribute association |
CN113448829B (zh) * | 2020-03-27 | 2024-06-04 | Laiye Technology (Beijing) Co., Ltd. | Dialogue robot testing method, apparatus, device, and storage medium
CN111540350B (zh) * | 2020-03-31 | 2024-03-01 | Beijing Xiaomi Mobile Software Co., Ltd. | Control method, apparatus, and storage medium for an intelligent voice control device
KR102693431B1 (ko) * | 2020-04-01 | 2024-08-09 | Samsung Electronics Co., Ltd. | Electronic device and method for controlling its audio output
CN111449637B (zh) * | 2020-04-07 | 2023-08-18 | Jiangxi Jemincare Group Co., Ltd. | Evaluation system and method for arteriovenous fistula vessels
CN111488443B (zh) * | 2020-04-08 | 2022-07-12 | AISpeech Co., Ltd. | Skill selection method and apparatus
US11548158B2 (en) * | 2020-04-17 | 2023-01-10 | Abb Schweiz Ag | Automatic sensor conflict resolution for sensor fusion system |
CN113488035A (zh) * | 2020-04-28 | 2021-10-08 | Hisense Group Co., Ltd. | Voice information processing method, apparatus, device, and medium
CN113658596A (zh) * | 2020-04-29 | 2021-11-16 | ALi Corporation | Semantic recognition method and semantic recognition device
US11617035B2 (en) * | 2020-05-04 | 2023-03-28 | Shure Acquisition Holdings, Inc. | Intelligent audio system using multiple sensor modalities |
US11590929B2 (en) * | 2020-05-05 | 2023-02-28 | Nvidia Corporation | Systems and methods for performing commands in a vehicle using speech and image recognition |
US11823082B2 (en) | 2020-05-06 | 2023-11-21 | Kore.Ai, Inc. | Methods for orchestrating an automated conversation in one or more networks and devices thereof |
US11038934B1 (en) | 2020-05-11 | 2021-06-15 | Apple Inc. | Digital assistant hardware abstraction |
US11238217B2 (en) * | 2020-05-11 | 2022-02-01 | International Business Machines Corporation | Task based self exploration of cognitive systems |
US11061543B1 (en) | 2020-05-11 | 2021-07-13 | Apple Inc. | Providing relevant data items based on context |
US11755276B2 (en) | 2020-05-12 | 2023-09-12 | Apple Inc. | Reducing description length based on confidence |
US11736767B2 (en) | 2020-05-13 | 2023-08-22 | Roku, Inc. | Providing energy-efficient features using human presence detection |
US11395232B2 (en) * | 2020-05-13 | 2022-07-19 | Roku, Inc. | Providing safety and environmental features using human presence detection |
CN111640436B (zh) * | 2020-05-15 | 2024-04-19 | Beijing Qingniu Technology Co., Ltd. | Method for providing an agent with a dynamic customer profile of a call party
US11727919B2 (en) | 2020-05-20 | 2023-08-15 | Sonos, Inc. | Memory allocation for keyword spotting engines |
US11482224B2 (en) | 2020-05-20 | 2022-10-25 | Sonos, Inc. | Command keywords with input detection windowing |
US11308962B2 (en) | 2020-05-20 | 2022-04-19 | Sonos, Inc. | Input detection windowing |
CN111641875A (zh) * | 2020-05-21 | 2020-09-08 | Guangzhou Huanwang Technology Co., Ltd. | Method, apparatus, and system for a smart TV to analyze family members
CN111816173B (zh) * | 2020-06-01 | 2024-06-07 | Gree Electric Appliances, Inc. of Zhuhai | Dialogue data processing method and apparatus, storage medium, and computer device
US12033258B1 (en) | 2020-06-05 | 2024-07-09 | Meta Platforms Technologies, Llc | Automated conversation content items from natural language |
US11715326B2 (en) * | 2020-06-17 | 2023-08-01 | Microsoft Technology Licensing, Llc | Skin tone correction for body temperature estimation |
EP3925521A1 (en) * | 2020-06-18 | 2021-12-22 | Rockwell Collins, Inc. | Contact-less passenger screening and identification system |
US20210393148A1 (en) * | 2020-06-18 | 2021-12-23 | Rockwell Collins, Inc. | Physiological state screening system |
US12008985B2 (en) * | 2020-06-22 | 2024-06-11 | Amazon Technologies, Inc. | Natural language processing of declarative statements |
US11256484B2 (en) * | 2020-06-22 | 2022-02-22 | Accenture Global Solutions Limited | Utilizing natural language understanding and machine learning to generate an application |
US11289089B1 (en) * | 2020-06-23 | 2022-03-29 | Amazon Technologies, Inc. | Audio based projector control |
US11676368B2 (en) | 2020-06-30 | 2023-06-13 | Optum Services (Ireland) Limited | Identifying anomalous activity from thermal images |
CN111737417B (zh) * | 2020-07-03 | 2020-11-17 | Alipay (Hangzhou) Information Technology Co., Ltd. | Method and apparatus for correcting natural language generation results
US11574640B2 (en) * | 2020-07-13 | 2023-02-07 | Google Llc | User-assigned custom assistant responses to queries being submitted by another user |
US11490204B2 (en) | 2020-07-20 | 2022-11-01 | Apple Inc. | Multi-device audio adjustment coordination |
US11438683B2 (en) | 2020-07-21 | 2022-09-06 | Apple Inc. | User identification using headphones |
CN111951787A (zh) * | 2020-07-31 | 2020-11-17 | Beijing Xiaomi Pinecone Electronics Co., Ltd. | Voice output method, apparatus, storage medium, and electronic device
US11971957B2 (en) | 2020-08-08 | 2024-04-30 | Analog Devices International Unlimited Company | Aggregating sensor profiles of objects |
US12051239B2 (en) | 2020-08-11 | 2024-07-30 | Disney Enterprises, Inc. | Item location tracking via image analysis and projection |
CN111967273B (zh) * | 2020-08-16 | 2023-11-21 | Unisound AI Technology Co., Ltd. | Dialogue management system, method, and rule engine device
US11698771B2 (en) | 2020-08-25 | 2023-07-11 | Sonos, Inc. | Vocal guidance engines for playback devices |
CA3191100A1 (en) * | 2020-08-27 | 2022-03-03 | Dorian J. Cougias | Automatically identifying multi-word expressions |
US11562028B2 (en) * | 2020-08-28 | 2023-01-24 | International Business Machines Corporation | Concept prediction to create new intents and assign examples automatically in dialog systems |
CN111985249A (zh) * | 2020-09-03 | 2020-11-24 | Beike Technology Co., Ltd. | Semantic analysis method and apparatus, computer-readable storage medium, and electronic device
US11797610B1 (en) * | 2020-09-15 | 2023-10-24 | Elemental Cognition Inc. | Knowledge acquisition tool |
US20230325520A1 (en) * | 2020-09-21 | 2023-10-12 | Visa International Service Association | Alias directory |
US11568135B1 (en) | 2020-09-23 | 2023-01-31 | Amazon Technologies, Inc. | Identifying chat correction pairs for training models to automatically correct chat inputs |
US11922123B2 (en) * | 2020-09-30 | 2024-03-05 | Oracle International Corporation | Automatic out of scope transition for chatbot |
GB2594536B (en) * | 2020-10-12 | 2022-05-18 | Insphere Ltd | Photogrammetry system |
US11468900B2 (en) * | 2020-10-15 | 2022-10-11 | Google Llc | Speaker identification accuracy |
US11564036B1 (en) | 2020-10-21 | 2023-01-24 | Amazon Technologies, Inc. | Presence detection using ultrasonic signals with concurrent audio playback |
EP3989218A1 (de) * | 2020-10-21 | 2022-04-27 | Deutsche Telekom AG | User-friendly virtual voice assistant
US12062361B2 (en) * | 2020-11-02 | 2024-08-13 | Aondevices, Inc. | Wake word method to prolong the conversational state between human and a machine in edge devices |
US11984123B2 (en) | 2020-11-12 | 2024-05-14 | Sonos, Inc. | Network device interaction by range |
US11558546B2 (en) * | 2020-11-24 | 2023-01-17 | Google Llc | Conditional camera control via automated assistant commands |
CN112581955B (zh) * | 2020-11-30 | 2024-03-08 | Guangzhou Chengxing Zhidong Automobile Technology Co., Ltd. | Voice control method, server, voice control system, and readable storage medium
US11503090B2 (en) | 2020-11-30 | 2022-11-15 | At&T Intellectual Property I, L.P. | Remote audience feedback mechanism |
US11645465B2 (en) | 2020-12-10 | 2023-05-09 | International Business Machines Corporation | Anaphora resolution for enhanced context switching |
CN112417894B (zh) * | 2020-12-10 | 2023-04-07 | Shanghai Fangli Digital Technology Co., Ltd. | Dialogue intent recognition method and system based on multi-task learning
WO2022130011A1 (en) * | 2020-12-15 | 2022-06-23 | Orcam Technologies Ltd. | Wearable apparatus and methods |
US11816437B2 (en) * | 2020-12-15 | 2023-11-14 | International Business Machines Corporation | Automatical process application generation |
US12019720B2 (en) * | 2020-12-16 | 2024-06-25 | International Business Machines Corporation | Spatiotemporal deep learning for behavioral biometrics |
WO2022133125A1 (en) | 2020-12-16 | 2022-06-23 | Truleo, Inc. | Audio analysis of body worn camera |
CN112537582B (zh) * | 2020-12-18 | 2021-07-02 | Jiangsu Huayi Advertising Equipment Technology Co., Ltd. | Intelligent video-based waste-sorting device
US11741400B1 (en) | 2020-12-18 | 2023-08-29 | Beijing Didi Infinity Technology And Development Co., Ltd. | Machine learning-based real-time guest rider identification |
US11250855B1 (en) | 2020-12-23 | 2022-02-15 | Nuance Communications, Inc. | Ambient cooperative intelligence system and method |
CN114697713B (zh) * | 2020-12-29 | 2024-02-06 | Shenzhen TCL New Technology Co., Ltd. | Voice assistant control method, apparatus, storage medium, and smart TV
US11431766B2 (en) * | 2021-01-04 | 2022-08-30 | International Business Machines Corporation | Setting timers based on processing of group communications using natural language processing |
CN112802118B (zh) * | 2021-01-05 | 2022-04-08 | Hubei University of Technology | In-orbit time-sharing geometric calibration method for optical satellite sensors
US20220217442A1 (en) * | 2021-01-06 | 2022-07-07 | Lenovo (Singapore) Pte. Ltd. | Method and device to generate suggested actions based on passive audio |
CN112863511B (zh) * | 2021-01-15 | 2024-06-04 | Beijing Xiaomi Pinecone Electronics Co., Ltd. | Signal processing method, apparatus, and storage medium
US11893985B2 (en) * | 2021-01-15 | 2024-02-06 | Harman International Industries, Incorporated | Systems and methods for voice exchange beacon devices |
US20220230631A1 (en) * | 2021-01-18 | 2022-07-21 | PM Labs, Inc. | System and method for conversation using spoken language understanding |
US11551700B2 (en) | 2021-01-25 | 2023-01-10 | Sonos, Inc. | Systems and methods for power-efficient keyword detection |
US11922141B2 (en) * | 2021-01-29 | 2024-03-05 | Walmart Apollo, Llc | Voice and chatbot conversation builder |
US20220245489A1 (en) * | 2021-01-29 | 2022-08-04 | Salesforce.Com, Inc. | Automatic intent generation within a virtual agent platform |
CN112463945B (zh) * | 2021-02-02 | 2021-04-23 | Beike Zhaofang (Beijing) Technology Co., Ltd. | Conversation context segmentation method and system, and interaction method and system
US20220272124A1 (en) * | 2021-02-19 | 2022-08-25 | Intuit Inc. | Using machine learning for detecting solicitation of personally identifiable information (pii) |
WO2022182933A1 (en) | 2021-02-25 | 2022-09-01 | Nagpal Sumit Kumar | Technologies for tracking objects within defined areas |
US11715469B2 (en) | 2021-02-26 | 2023-08-01 | Walmart Apollo, Llc | Methods and apparatus for improving search retrieval using inter-utterance context |
US11580100B2 (en) * | 2021-03-05 | 2023-02-14 | Comcast Cable Communications, Llc | Systems and methods for advanced query generation |
US20220293096A1 (en) * | 2021-03-09 | 2022-09-15 | Sony Group Corporation | User-oriented actions based on audio conversation |
US20220293128A1 (en) * | 2021-03-10 | 2022-09-15 | Comcast Cable Communications, Llc | Systems and methods for improved speech and command detection |
US11727726B2 (en) * | 2021-03-11 | 2023-08-15 | Kemtai Ltd. | Evaluating movements of a person |
US11921831B2 (en) * | 2021-03-12 | 2024-03-05 | Intellivision Technologies Corp | Enrollment system with continuous learning and confirmation |
US12028178B2 (en) | 2021-03-19 | 2024-07-02 | Shure Acquisition Holdings, Inc. | Conferencing session facilitation systems and methods using virtual assistant systems and artificial intelligence algorithms |
US11493995B2 (en) * | 2021-03-24 | 2022-11-08 | International Business Machines Corporation | Tactile user interactions for personalized interactions |
US11790568B2 (en) | 2021-03-29 | 2023-10-17 | Kyndryl, Inc | Image entity extraction and granular interactivity articulation |
US11710479B1 (en) * | 2021-03-31 | 2023-07-25 | Amazon Technologies, Inc. | Contextual biasing of neural language models using metadata from a natural language understanding component and embedded recent history |
WO2022211737A1 (en) * | 2021-03-31 | 2022-10-06 | Emo Technologies Pte. Ltd. | Automatic detection of intention of natural language input text |
JP2022161078A (ja) * | 2021-04-08 | 2022-10-21 | Kyocera Document Solutions Inc. | Information processing device, information processing method, and information processing program
US11617952B1 (en) | 2021-04-13 | 2023-04-04 | Electronic Arts Inc. | Emotion based music style change using deep learning |
TWI758162B (zh) * | 2021-04-15 | 2022-03-11 | Pegatron Corporation | Tracking system and method for biological forms
US12079884B2 (en) | 2021-04-19 | 2024-09-03 | Meta Platforms Technologies, Llc | Automated memory creation and retrieval from moment content items |
US11698780B2 (en) * | 2021-04-21 | 2023-07-11 | Hewlett Packard Enterprise Development Lp | Deployment and configuration of an edge site based on declarative intents indicative of a use case |
US11934787B2 (en) | 2021-04-29 | 2024-03-19 | International Business Machines Corporation | Intent determination in a messaging dialog manager system |
CN113380240B (zh) * | 2021-05-07 | 2022-04-12 | Honor Device Co., Ltd. | Voice interaction method and electronic device
US12050261B2 (en) * | 2021-05-12 | 2024-07-30 | California State University Fresno Foundation | System and method for human and animal detection in low visibility |
US20220382819A1 (en) * | 2021-05-28 | 2022-12-01 | Google Llc | Search Results Based Triggering For Understanding User Intent On Assistant |
US11663024B2 (en) * | 2021-06-07 | 2023-05-30 | International Business Machines Corporation | Efficient collaboration using a virtual assistant |
US20220398428A1 (en) * | 2021-06-11 | 2022-12-15 | Disney Enterprises, Inc. | Situationally Aware Social Agent |
US11907273B2 (en) | 2021-06-18 | 2024-02-20 | International Business Machines Corporation | Augmenting user responses to queries |
US11908463B1 (en) * | 2021-06-29 | 2024-02-20 | Amazon Technologies, Inc. | Multi-session context |
US20230035941A1 (en) * | 2021-07-15 | 2023-02-02 | Apple Inc. | Speech interpretation based on environmental context |
US20230053267A1 (en) * | 2021-08-11 | 2023-02-16 | Rovi Guides, Inc. | Systems and methods for multi-agent conversations |
US11875792B2 (en) * | 2021-08-17 | 2024-01-16 | International Business Machines Corporation | Holographic interface for voice commands |
US20230077283A1 (en) * | 2021-09-07 | 2023-03-09 | Qualcomm Incorporated | Automatic mute and unmute for audio conferencing |
EP4338039A4 (en) * | 2021-09-21 | 2024-08-21 | Samsung Electronics Co Ltd | METHOD OF PROVIDING A PERSONALIZED RESPONSE FOR AN ELECTRONIC DEVICE |
US20230087896A1 (en) * | 2021-09-23 | 2023-03-23 | International Business Machines Corporation | Leveraging knowledge records for chatbot local search |
US12092753B2 (en) | 2021-09-24 | 2024-09-17 | International Business Machines Corporation | Measuring distance between two devices |
US12033162B2 (en) | 2021-10-13 | 2024-07-09 | Fmr Llc | Automated analysis of customer interaction text to generate customer intent information and hierarchy of customer issues |
KR20230054182A (ko) * | 2021-10-15 | 2023-04-24 | Alchera Inc. | Person re-identification method using an artificial neural network, and computing device for performing the same
US20230138741A1 (en) * | 2021-10-29 | 2023-05-04 | Kyndryl, Inc. | Social network adapted response |
CN114255557A (zh) * | 2021-11-30 | 2022-03-29 | GoerTek Technology Co., Ltd. | Intelligent security control method, intelligent security device, and controller
US20230169540A1 (en) * | 2021-12-01 | 2023-06-01 | Walmart Apollo, Llc | Systems and methods of providing enhanced contextual intelligent information |
CN114385178B (zh) * | 2021-12-14 | 2024-07-23 | Xiamen University | Code generation method enhanced with abstract syntax tree structural information
CN114282530B (zh) * | 2021-12-24 | 2024-06-07 | Xiamen University | Sentiment analysis method for complex sentences based on syntactic structure and connective-information triggering
US12044810B2 (en) | 2021-12-28 | 2024-07-23 | Samsung Electronics Co., Ltd. | On-device user presence detection using low power acoustics in the presence of multi-path sound propagation |
US11929845B2 (en) | 2022-01-07 | 2024-03-12 | International Business Machines Corporation | AI-based virtual proxy nodes for intent resolution in smart audio devices |
US12020704B2 (en) | 2022-01-19 | 2024-06-25 | Google Llc | Dynamic adaptation of parameter set used in hot word free adaptation of automated assistant |
US20230274091A1 (en) * | 2022-02-25 | 2023-08-31 | Robert Bosch Gmbh | Dialogue system with slot-filling strategies |
CN114712835B (zh) * | 2022-03-25 | 2022-10-14 | China University of Geosciences (Wuhan) | Assisted training system based on binocular human pose recognition
US20230316001A1 (en) * | 2022-03-29 | 2023-10-05 | Robert Bosch Gmbh | System and method with entity type clarification for fine-grained factual knowledge retrieval |
TWI837629B (zh) * | 2022-03-31 | 2024-04-01 | 謝基生 | Multispectral imaging evaluation and analysis system for gait disorders, and hydrostatic hydrotherapy exercise device
US11464573B1 (en) * | 2022-04-27 | 2022-10-11 | Ix Innovation Llc | Methods and systems for real-time robotic surgical assistance in an operating room |
US20230351099A1 (en) * | 2022-05-02 | 2023-11-02 | Optum, Inc. | Supervised and unsupervised machine learning techniques for communication summarization |
US11546323B1 (en) | 2022-08-17 | 2023-01-03 | strongDM, Inc. | Credential management for distributed services |
US11736531B1 (en) | 2022-08-31 | 2023-08-22 | strongDM, Inc. | Managing and monitoring endpoint activity in secured networks |
US11765159B1 (en) | 2022-09-28 | 2023-09-19 | strongDM, Inc. | Connection revocation in overlay networks |
US20240112383A1 (en) * | 2022-10-04 | 2024-04-04 | Snap Inc. | Generating user interfaces in augmented reality environments |
US11916885B1 (en) | 2023-01-09 | 2024-02-27 | strongDM, Inc. | Tunnelling with support for dynamic naming resolution |
US11907673B1 (en) * | 2023-02-28 | 2024-02-20 | Fmr, Llc | Enhancing chatbot recognition of user intent through graph analysis |
US11765207B1 (en) * | 2023-03-17 | 2023-09-19 | strongDM, Inc. | Declaring network policies using natural language |
CN116311477B (zh) * | 2023-05-15 | 2023-08-01 | Huazhong University of Science and Technology | Method for constructing a facial action unit detection model oriented to cross-identity consistency
CN116962196B (zh) * | 2023-06-08 | 2024-07-30 | National University of Defense Technology | Intelligent planning method and system for mobile communication networks based on relational reasoning
CN117883076B (zh) * | 2024-01-23 | 2024-06-18 | Beijing Bangni Yingce Technology Co., Ltd. | Big-data-based system and method for monitoring human motion energy expenditure
CN117789740B (zh) * | 2024-02-23 | 2024-04-19 | Tencent Technology (Shenzhen) Co., Ltd. | Audio data processing method, apparatus, medium, device, and program product
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102510426A (zh) * | 2011-11-29 | 2012-06-20 | Anhui USTC iFlytek Co., Ltd. | Personal assistant application access method and system
CN104423563A (zh) * | 2013-09-10 | 2015-03-18 | Zhigao Industrial Co., Ltd. | Contactless real-time interaction method and system therefor
CN104462175A (zh) * | 2013-09-20 | 2015-03-25 | International Business Machines Corporation | Method and system for creating an integrated user interface using linked data
US20150172285A1 (en) * | 2013-12-17 | 2015-06-18 | Mei Ling LO | Method for Accessing E-Mail System |
Family Cites Families (295)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6073101A (en) | 1996-02-02 | 2000-06-06 | International Business Machines Corporation | Text independent speaker recognition for transparent command ambiguity resolution and continuous access control |
WO1999004286A1 (en) | 1997-07-18 | 1999-01-28 | Kohler Company | Bathroom fixture using radar detector having leaky transmission line to control fluid flow |
US6119088A (en) | 1998-03-03 | 2000-09-12 | Ciluffo; Gary | Appliance control programmer using voice recognition |
US6574601B1 (en) | 1999-01-13 | 2003-06-03 | Lucent Technologies Inc. | Acoustic speech recognizer system and method |
US6442524B1 (en) | 1999-01-29 | 2002-08-27 | Sony Corporation | Analyzing inflectional morphology in a spoken language translation system |
US6332122B1 (en) | 1999-06-23 | 2001-12-18 | International Business Machines Corporation | Transcription system for multiple speakers, using and establishing identification |
US7050110B1 (en) | 1999-10-29 | 2006-05-23 | Intel Corporation | Method and system for generating annotations video |
US6727925B1 (en) | 1999-12-20 | 2004-04-27 | Michelle Lyn Bourdelais | Browser-based room designer |
GB9930731D0 (en) | 1999-12-22 | 2000-02-16 | Ibm | Voice processing apparatus |
JP2001188784A (ja) * | 1999-12-28 | 2001-07-10 | Sony Corp | Conversation processing device and method, and recording medium
US8374875B2 (en) | 2000-01-31 | 2013-02-12 | Intel Corporation | Providing programming information in response to spoken requests |
US6757362B1 (en) * | 2000-03-06 | 2004-06-29 | Avaya Technology Corp. | Personal virtual assistant |
US6873953B1 (en) | 2000-05-22 | 2005-03-29 | Nuance Communications | Prosody based endpoint detection |
GB0023181D0 (en) | 2000-09-20 | 2000-11-01 | Koninkl Philips Electronics Nv | Message handover for networked beacons |
US6728679B1 (en) | 2000-10-30 | 2004-04-27 | Koninklijke Philips Electronics N.V. | Self-updating user interface/entertainment device that simulates personal interaction |
US7257537B2 (en) | 2001-01-12 | 2007-08-14 | International Business Machines Corporation | Method and apparatus for performing dialog management in a computer conversational interface |
US7610365B1 (en) | 2001-02-14 | 2009-10-27 | International Business Machines Corporation | Automatic relevance-based preloading of relevant information in portable devices |
US7171365B2 (en) | 2001-02-16 | 2007-01-30 | International Business Machines Corporation | Tracking time using portable recorders and speech recognition |
US7130446B2 (en) | 2001-12-03 | 2006-10-31 | Microsoft Corporation | Automatic detection and tracking of multiple individuals using multiple cues |
MXPA04006312 (es) | 2001-12-28 | 2004-11-10 | Simdesk Technologies Inc | Instant messaging system
US7019749B2 (en) | 2001-12-28 | 2006-03-28 | Microsoft Corporation | Conversational interface agent |
US8374879B2 (en) | 2002-02-04 | 2013-02-12 | Microsoft Corporation | Systems and methods for managing interactions from multiple speech-enabled applications |
EP1376999A1 (en) | 2002-06-21 | 2004-01-02 | BRITISH TELECOMMUNICATIONS public limited company | Spoken alpha-numeric sequence entry system with repair mode |
US7803050B2 (en) | 2002-07-27 | 2010-09-28 | Sony Computer Entertainment Inc. | Tracking device with sound emitter for use in obtaining information for controlling game program execution |
US7783486B2 (en) | 2002-11-22 | 2010-08-24 | Roy Jonathan Rosser | Response generator for mimicking human-computer natural language conversation |
US7330566B2 (en) | 2003-05-15 | 2008-02-12 | Microsoft Corporation | Video-based gait recognition |
US7475010B2 (en) | 2003-09-03 | 2009-01-06 | Lingospot, Inc. | Adaptive and scalable method for resolving natural language ambiguities |
WO2005050621A2 (en) | 2003-11-21 | 2005-06-02 | Philips Intellectual Property & Standards Gmbh | Topic specific models for text formatting and speech recognition |
JP2005202014A (ja) | 2004-01-14 | 2005-07-28 | Sony Corp | Audio signal processing device, audio signal processing method, and audio signal processing program
US7460052B2 (en) | 2004-01-20 | 2008-12-02 | Bae Systems Information And Electronic Systems Integration Inc. | Multiple frequency through-the-wall motion detection and ranging using a difference-based estimation technique |
US8965460B1 (en) | 2004-01-30 | 2015-02-24 | Ip Holdings, Inc. | Image and augmented reality based networks using mobile devices and intelligent electronic glasses |
US7061366B2 (en) | 2004-04-12 | 2006-06-13 | Microsoft Corporation | Finding location and ranging explorer |
US7071867B2 (en) | 2004-06-25 | 2006-07-04 | The Boeing Company | Method, apparatus, and computer program product for radar detection of moving target |
US7929017B2 (en) | 2004-07-28 | 2011-04-19 | Sri International | Method and apparatus for stereo, multi-camera tracking and RF and video track fusion |
WO2006034135A2 (en) | 2004-09-17 | 2006-03-30 | Proximex | Adaptive multi-modal integrated biometric identification detection and surveillance system |
US7716056B2 (en) | 2004-09-27 | 2010-05-11 | Robert Bosch Corporation | Method and system for interactive conversational dialogue for cognitively overloaded device users |
US20060067536A1 (en) | 2004-09-27 | 2006-03-30 | Michael Culbert | Method and system for time synchronizing multiple loudspeakers |
US8494855B1 (en) | 2004-10-06 | 2013-07-23 | West Interactive Corporation Ii | Method, system, and computer readable medium for comparing phonetic similarity of return words to resolve ambiguities during voice recognition |
US8170875B2 (en) | 2005-06-15 | 2012-05-01 | Qnx Software Systems Limited | Speech end-pointer |
KR20070016280 (ko) | 2005-08-02 | 2007-02-08 | Pantech Co., Ltd. | Camera device of a mobile communication terminal and illuminance control method therefor
EP1920432A4 (en) | 2005-08-09 | 2011-03-16 | Mobile Voice Control Llc | LANGUAGE-CONTROLLED WIRELESS COMMUNICATION DEVICE SYSTEM |
US7319908B2 (en) | 2005-10-28 | 2008-01-15 | Microsoft Corporation | Multi-modal device power/mode management |
US20070152157A1 (en) | 2005-11-04 | 2007-07-05 | Raydon Corporation | Simulation arena entity tracking system |
JP2007220045A (ja) | 2006-02-20 | 2007-08-30 | Toshiba Corp | Communication support device, communication support method, and communication support program
US20080119716A1 (en) | 2006-05-17 | 2008-05-22 | Olga Boric-Lubecke | Determining presence and/or physiological motion of one or more subjects with quadrature doppler radar receiver systems |
US7916897B2 (en) | 2006-08-11 | 2011-03-29 | Tessera Technologies Ireland Limited | Face tracking for controlling imaging parameters |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8214219B2 (en) | 2006-09-15 | 2012-07-03 | Volkswagen Of America, Inc. | Speech communications system for a vehicle and method of operating a speech communications system for a vehicle |
US7822605B2 (en) | 2006-10-19 | 2010-10-26 | Nice Systems Ltd. | Method and apparatus for large population speaker identification in telephone interactions |
US8139945B1 (en) | 2007-01-20 | 2012-03-20 | Centrak, Inc. | Methods and systems for synchronized infrared real time location |
AU2008209307B2 (en) | 2007-01-22 | 2010-12-02 | Auraya Pty Ltd | Voice recognition system and methods |
WO2008106655A1 (en) | 2007-03-01 | 2008-09-04 | Apapx, Inc. | System and method for dynamic learning |
US7518502B2 (en) | 2007-05-24 | 2009-04-14 | Smith & Nephew, Inc. | System and method for tracking surgical assets |
US8180029B2 (en) | 2007-06-28 | 2012-05-15 | Voxer Ip Llc | Telecommunication and multimedia management method and apparatus |
US8165087B2 (en) | 2007-06-30 | 2012-04-24 | Microsoft Corporation | Location context service handoff |
US8712758B2 (en) | 2007-08-31 | 2014-04-29 | Microsoft Corporation | Coreference resolution in an ambiguity-sensitive natural language processing system |
US8644842B2 (en) | 2007-09-04 | 2014-02-04 | Nokia Corporation | Personal augmented reality advertising |
US8902227B2 (en) | 2007-09-10 | 2014-12-02 | Sony Computer Entertainment America Llc | Selective interactive mapping of real-world objects to create interactive virtual-world objects |
US8140335B2 (en) | 2007-12-11 | 2012-03-20 | Voicebox Technologies, Inc. | System and method for providing a natural language voice user interface in an integrated voice navigation services environment |
CN101216885A (zh) | 2008-01-04 | 2008-07-09 | Sun Yat-sen University | Video-based pedestrian face detection and tracking algorithm
JP5075664B2 (ja) | 2008-02-15 | 2012-11-21 | Toshiba Corporation | Spoken dialogue device and support method
US8265252B2 (en) | 2008-04-11 | 2012-09-11 | Palo Alto Research Center Incorporated | System and method for facilitating cognitive processing of simultaneous remote voice conversations |
US20090319269A1 (en) | 2008-06-24 | 2009-12-24 | Hagai Aronowitz | Method of Trainable Speaker Diarization |
US8213689B2 (en) | 2008-07-14 | 2012-07-03 | Google Inc. | Method and system for automated annotation of persons in video content |
US8639666B2 (en) | 2008-09-05 | 2014-01-28 | Cast Group Of Companies Inc. | System and method for real-time environment tracking and coordination |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US20100100851A1 (en) | 2008-10-16 | 2010-04-22 | International Business Machines Corporation | Mapping a real-world object in a personal virtual world |
US20100195906A1 (en) | 2009-02-03 | 2010-08-05 | Aricent Inc. | Automatic image enhancement |
US9031216B1 (en) | 2009-03-05 | 2015-05-12 | Google Inc. | In-conversation search |
US20100226487A1 (en) | 2009-03-09 | 2010-09-09 | Polycom, Inc. | Method & apparatus for controlling the state of a communication system |
US8700776B2 (en) | 2009-03-23 | 2014-04-15 | Google Inc. | System and method for editing a conversation in a hosted conversation system |
US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
US20120265535A1 (en) | 2009-09-07 | 2012-10-18 | Donald Ray Bryant-Rich | Personal voice operated reminder system |
US8554562B2 (en) | 2009-11-15 | 2013-10-08 | Nuance Communications, Inc. | Method and system for speaker diarization |
US8779965B2 (en) | 2009-12-18 | 2014-07-15 | L-3 Communications Cyterra Corporation | Moving-entity detection |
US8676581B2 (en) | 2010-01-22 | 2014-03-18 | Microsoft Corporation | Speech recognition analysis via identification information |
US8683387B2 (en) | 2010-03-03 | 2014-03-25 | Cast Group Of Companies Inc. | System and method for visualizing virtual objects on a mobile device |
KR101135186B1 (ko) | 2010-03-03 | 2012-04-16 | 광주과학기술원 | Interactive real-time augmented reality system and method, and recording medium storing a program implementing the method |
US8543402B1 (en) | 2010-04-30 | 2013-09-24 | The Intellisis Corporation | Speaker segmentation in noisy conversational speech |
FR2960986A1 (fr) | 2010-06-04 | 2011-12-09 | Thomson Licensing | Method for selecting an object in a virtual environment |
US9113190B2 (en) | 2010-06-04 | 2015-08-18 | Microsoft Technology Licensing, Llc | Controlling power levels of electronic devices through user interaction |
CN101894553A (zh) * | 2010-07-23 | 2010-11-24 | 四川长虹电器股份有限公司 | Method for implementing voice control of a television |
US9134399B2 (en) | 2010-07-28 | 2015-09-15 | International Business Machines Corporation | Attribute-based person tracking across multiple cameras |
US8532994B2 (en) | 2010-08-27 | 2013-09-10 | Cisco Technology, Inc. | Speech recognition using a personal vocabulary and language model |
US8762150B2 (en) | 2010-09-16 | 2014-06-24 | Nuance Communications, Inc. | Using codec parameters for endpoint detection in speech recognition |
US8786625B2 (en) | 2010-09-30 | 2014-07-22 | Apple Inc. | System and method for processing image data using an image signal processor having back-end processing logic |
GB201020138D0 (en) | 2010-11-29 | 2011-01-12 | Third Sight Ltd | A memory aid device |
US9842299B2 (en) | 2011-01-25 | 2017-12-12 | Telepathy Labs, Inc. | Distributed, predictive, dichotomous decision engine for an electronic personal assistant |
US8903128B2 (en) | 2011-02-16 | 2014-12-02 | Siemens Aktiengesellschaft | Object recognition for security screening and long range video surveillance |
EP2684059B1 (en) * | 2011-03-10 | 2015-08-26 | Shockwatch, Inc. | Impact indicator |
US9842168B2 (en) | 2011-03-31 | 2017-12-12 | Microsoft Technology Licensing, Llc | Task driven user intents |
PL394570A1 (pl) | 2011-04-15 | 2012-10-22 | Robotics Inventions Spólka Z Ograniczona Odpowiedzialnoscia | Robot for raised floors and method for servicing raised floors |
US9440144B2 (en) | 2011-04-21 | 2016-09-13 | Sony Interactive Entertainment Inc. | User identified to a controller |
US20120268604A1 (en) | 2011-04-25 | 2012-10-25 | Evan Tree | Dummy security device that mimics an active security device |
US8453402B2 (en) | 2011-04-29 | 2013-06-04 | Rong-Jun Huang | Frame unit of a curtain wall |
US8885882B1 (en) | 2011-07-14 | 2014-11-11 | The Research Foundation For The State University Of New York | Real time eye tracking for human computer interaction |
US9009142B2 (en) | 2011-07-27 | 2015-04-14 | Google Inc. | Index entries configured to support both conversation and message based searching |
US10387536B2 (en) | 2011-09-19 | 2019-08-20 | Personetics Technologies Ltd. | Computerized data-aware agent systems for retrieving data to serve a dialog between human user and computerized system |
US20130073293A1 (en) * | 2011-09-20 | 2013-03-21 | Lg Electronics Inc. | Electronic device and method for controlling the same |
US8401569B1 (en) | 2011-09-23 | 2013-03-19 | Sonic Notify, Inc. | System effective to demodulate a modulated code and provide content to a user |
US8762156B2 (en) | 2011-09-28 | 2014-06-24 | Apple Inc. | Speech recognition repair using contextual information |
US9268406B2 (en) | 2011-09-30 | 2016-02-23 | Microsoft Technology Licensing, Llc | Virtual spectator experience with a personal audio/visual apparatus |
US8340975B1 (en) | 2011-10-04 | 2012-12-25 | Theodore Alfred Rosenberger | Interactive speech recognition device and system for hands-free building control |
WO2013061268A2 (en) | 2011-10-26 | 2013-05-02 | Ariel-University Research And Development Company, Ltd. | Method and device for accurate location determination in a specified area |
CN104011788B (zh) | 2011-10-28 | 2016-11-16 | 奇跃公司 | System and method for augmented and virtual reality |
US8358903B1 (en) | 2011-10-31 | 2013-01-22 | iQuest, Inc. | Systems and methods for recording information on a mobile computing device |
US9214157B2 (en) | 2011-12-06 | 2015-12-15 | At&T Intellectual Property I, L.P. | System and method for machine-mediated human-human conversation |
US9389681B2 (en) | 2011-12-19 | 2016-07-12 | Microsoft Technology Licensing, Llc | Sensor fusion interface for multiple sensor input |
US8752145B1 (en) | 2011-12-30 | 2014-06-10 | Emc Corporation | Biometric authentication with smart mobile device |
WO2013101157A1 (en) | 2011-12-30 | 2013-07-04 | Intel Corporation | Range based user identification and profile determination |
CN103209030B (zh) | 2012-01-12 | 2015-05-13 | 宏碁股份有限公司 | Electronic device and data transmission method thereof |
US8693731B2 (en) | 2012-01-17 | 2014-04-08 | Leap Motion, Inc. | Enhanced contrast for object detection and characterization by optical imaging |
US8913103B1 (en) | 2012-02-01 | 2014-12-16 | Google Inc. | Method and apparatus for focus-of-attention control |
AU2012368731A1 (en) | 2012-02-03 | 2014-08-21 | Nec Corporation | Communication draw-in system, communication draw-in method, and communication draw-in program |
US20130212501A1 (en) | 2012-02-10 | 2013-08-15 | Glen J. Anderson | Perceptual computing with conversational agent |
US9500377B2 (en) | 2012-04-01 | 2016-11-22 | Mahesh Viswanathan | Extensible networked multi-modal environment conditioning system |
US9342143B1 (en) | 2012-04-17 | 2016-05-17 | Imdb.Com, Inc. | Determining display orientations for portable devices |
US9204095B2 (en) | 2012-05-04 | 2015-12-01 | Hong Jiang | Instant communications system having established communication channels between communication devices |
US9008688B2 (en) | 2012-05-07 | 2015-04-14 | Qualcomm Incorporated | Calendar matching of inferred contexts and label propagation |
US9423870B2 (en) | 2012-05-08 | 2016-08-23 | Google Inc. | Input determination method |
US20130342568A1 (en) | 2012-06-20 | 2013-12-26 | Tony Ambrus | Low light scene augmentation |
WO2014000129A1 (en) | 2012-06-30 | 2014-01-03 | Intel Corporation | 3d graphical user interface |
CN102760434A (zh) | 2012-07-09 | 2012-10-31 | 华为终端有限公司 | Voiceprint feature model updating method and terminal |
US9424233B2 (en) | 2012-07-20 | 2016-08-23 | Veveo, Inc. | Method of and system for inferring user intent in search input in a conversational interaction system |
US9669296B1 (en) | 2012-07-31 | 2017-06-06 | Niantic, Inc. | Linking real world activities with a parallel reality game |
US8953757B2 (en) | 2012-08-06 | 2015-02-10 | Angel.Com Incorporated | Preloading contextual information for applications using a conversation assistant |
AU2013221923A1 (en) | 2012-08-28 | 2014-03-20 | Solink Corporation | Transaction verification system |
US9424840B1 (en) | 2012-08-31 | 2016-08-23 | Amazon Technologies, Inc. | Speech recognition platforms |
US9576574B2 (en) | 2012-09-10 | 2017-02-21 | Apple Inc. | Context-sensitive handling of interruptions by intelligent digital assistant |
WO2014040124A1 (en) | 2012-09-11 | 2014-03-20 | Auraya Pty Ltd | Voice authentication system and method |
US8983383B1 (en) * | 2012-09-25 | 2015-03-17 | Rawles Llc | Providing hands-free service to multiple devices |
US10096316B2 (en) | 2013-11-27 | 2018-10-09 | Sri International | Sharing intents to provide virtual assistance in a multi-person dialog |
US9449343B2 (en) | 2012-10-05 | 2016-09-20 | Sap Se | Augmented-reality shopping using a networked mobile device |
US11099652B2 (en) | 2012-10-05 | 2021-08-24 | Microsoft Technology Licensing, Llc | Data and user interaction based on device proximity |
JP6066471B2 (ja) | 2012-10-12 | 2017-01-25 | 本田技研工業株式会社 | Dialogue system and method for discriminating utterances directed to the dialogue system |
US9031293B2 (en) | 2012-10-19 | 2015-05-12 | Sony Computer Entertainment Inc. | Multi-modal sensor based emotion recognition and emotional interface |
US9245497B2 (en) | 2012-11-01 | 2016-01-26 | Google Technology Holdings LLC | Systems and methods for configuring the display resolution of an electronic device based on distance and user presbyopia |
KR101709187B1 (ko) | 2012-11-14 | 2017-02-23 | 한국전자통신연구원 | Voice dialogue system based on dual dialogue management using a hierarchical dialogue task library |
US9085303B2 (en) | 2012-11-15 | 2015-07-21 | Sri International | Vehicle personal assistant |
US9633652B2 (en) | 2012-11-30 | 2017-04-25 | Stmicroelectronics Asia Pacific Pte Ltd. | Methods, systems, and circuits for speaker dependent voice recognition with a single lexicon |
CN103076095B (zh) | 2012-12-11 | 2015-09-09 | 广州飒特红外股份有限公司 | Night-driving assistance system for a motor vehicle in which an infrared thermal imager is wirelessly controlled by a tablet computer |
US9271111B2 (en) * | 2012-12-14 | 2016-02-23 | Amazon Technologies, Inc. | Response endpoint selection |
CN104782121A (zh) | 2012-12-18 | 2015-07-15 | 英特尔公司 | Multi-region video conference encoding |
US9098467B1 (en) * | 2012-12-19 | 2015-08-04 | Rawles Llc | Accepting voice commands based on user identity |
US9070366B1 (en) | 2012-12-19 | 2015-06-30 | Amazon Technologies, Inc. | Architecture for multi-domain utterance processing |
US20140180629A1 (en) | 2012-12-22 | 2014-06-26 | Ecole Polytechnique Federale De Lausanne Epfl | Method and a system for determining the geometry and/or the localization of an object |
US9466286B1 (en) | 2013-01-16 | 2016-10-11 | Amazon Technologies, Inc. | Transitioning an electronic device between device states |
DE102013001219B4 (de) | 2013-01-25 | 2019-08-29 | Inodyn Newmedia Gmbh | Method and system for voice activation of a software agent from standby mode |
US9761247B2 (en) | 2013-01-31 | 2017-09-12 | Microsoft Technology Licensing, Llc | Prosodic and lexical addressee detection |
US9292492B2 (en) | 2013-02-04 | 2016-03-22 | Microsoft Technology Licensing, Llc | Scaling statistical language understanding systems across domains and intents |
US9159116B2 (en) | 2013-02-13 | 2015-10-13 | Google Inc. | Adaptive screen interfaces based on viewing distance |
US10585568B1 (en) | 2013-02-22 | 2020-03-10 | The Directv Group, Inc. | Method and system of bookmarking content in a mobile device |
US9460715B2 (en) * | 2013-03-04 | 2016-10-04 | Amazon Technologies, Inc. | Identification using audio signatures and additional characteristics |
US9171542B2 (en) | 2013-03-11 | 2015-10-27 | Nuance Communications, Inc. | Anaphora resolution using linguistic cues, dialogue context, and general knowledge |
US9368114B2 (en) | 2013-03-14 | 2016-06-14 | Apple Inc. | Context-sensitive handling of interruptions |
AU2014227586C1 (en) | 2013-03-15 | 2020-01-30 | Apple Inc. | User training by intelligent digital assistant |
WO2014169287A1 (en) | 2013-04-12 | 2014-10-16 | Sciometrics Llc | The identity caddy: a tool for real-time determination of identity in the mobile environment |
US20160086018A1 (en) | 2013-04-26 | 2016-03-24 | West Virginia High Technology Consortium Foundation, Inc. | Facial recognition method and apparatus |
US9123330B1 (en) | 2013-05-01 | 2015-09-01 | Google Inc. | Large-scale speaker identification |
US9472205B2 (en) | 2013-05-06 | 2016-10-18 | Honeywell International Inc. | Device voice recognition systems and methods |
US9207772B2 (en) | 2013-05-13 | 2015-12-08 | Ohio University | Motion-based identity authentication of an individual with a communications device |
CN105122353B (zh) | 2013-05-20 | 2019-07-09 | 英特尔公司 | Computing device for speech recognition and method for speech recognition on a computing device |
NZ719940A (en) | 2013-06-03 | 2017-03-31 | Machine Zone Inc | Systems and methods for multi-user multi-lingual communications |
WO2014197336A1 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for detecting errors in interactions with a voice-based digital assistant |
US10262462B2 (en) | 2014-04-18 | 2019-04-16 | Magic Leap, Inc. | Systems and methods for augmented and virtual reality |
US9307355B2 (en) | 2013-06-27 | 2016-04-05 | Bluecats Australia Pty Limited | Location enabled service for enhancement of smart device and enterprise software applications |
US9871865B2 (en) | 2013-07-11 | 2018-01-16 | Neura, Inc. | Physical environment profiling through internet of things integration platform |
WO2015008162A2 (en) | 2013-07-15 | 2015-01-22 | Vocavu Solutions Ltd. | Systems and methods for textual content creation from sources of audio that contain speech |
WO2015009748A1 (en) | 2013-07-15 | 2015-01-22 | Dts, Inc. | Spatial calibration of surround sound systems including listener position estimation |
US9460722B2 (en) | 2013-07-17 | 2016-10-04 | Verint Systems Ltd. | Blind diarization of recorded calls with arbitrary number of speakers |
US9431014B2 (en) | 2013-07-25 | 2016-08-30 | Haier Us Appliance Solutions, Inc. | Intelligent placement of appliance response to voice command |
KR102158208B1 (ko) | 2013-07-26 | 2020-10-23 | 엘지전자 주식회사 | Electronic device and control method thereof |
US9558749B1 (en) | 2013-08-01 | 2017-01-31 | Amazon Technologies, Inc. | Automatic speaker identification using speech recognition features |
JP6468725B2 (ja) | 2013-08-05 | 2019-02-13 | キヤノン株式会社 | Image processing apparatus, image processing method, and computer program |
CN104423537B (zh) | 2013-08-19 | 2017-11-24 | 联想(北京)有限公司 | Information processing method and electronic device |
US9668052B2 (en) * | 2013-09-25 | 2017-05-30 | Google Technology Holdings LLC | Audio routing system for routing audio data to and from a mobile device |
KR20150041972A (ko) | 2013-10-10 | 2015-04-20 | 삼성전자주식회사 | Display device and power-saving processing method thereof |
US20150134547A1 (en) | 2013-11-09 | 2015-05-14 | Artases OIKONOMIDIS | Belongings visualization and record system |
US9892723B2 (en) | 2013-11-25 | 2018-02-13 | Rovi Guides, Inc. | Systems and methods for presenting social network communications in audible form based on user engagement with a user device |
US20150162000A1 (en) | 2013-12-10 | 2015-06-11 | Harman International Industries, Incorporated | Context aware, proactive digital assistant |
CN103761505A (zh) * | 2013-12-18 | 2014-04-30 | 微软公司 | Object tracking |
EP3084714A4 (en) | 2013-12-20 | 2017-08-02 | Robert Bosch GmbH | System and method for dialog-enabled context-dependent and user-centric content presentation |
EP3089158B1 (en) | 2013-12-26 | 2018-08-08 | Panasonic Intellectual Property Management Co., Ltd. | Speech recognition processing |
US9451377B2 (en) | 2014-01-07 | 2016-09-20 | Howard Massey | Device, method and software for measuring distance to a sound generator by using an audible impulse signal |
US10360907B2 (en) | 2014-01-14 | 2019-07-23 | Toyota Motor Engineering & Manufacturing North America, Inc. | Smart necklace with stereo vision and onboard processing |
US9606977B2 (en) | 2014-01-22 | 2017-03-28 | Google Inc. | Identifying tasks in messages |
US9311932B2 (en) | 2014-01-23 | 2016-04-12 | International Business Machines Corporation | Adaptive pause detection in speech recognition |
IN2014DE00332A (zh) | 2014-02-05 | 2015-08-07 | Nitin Vats | |
GB2522922A (en) | 2014-02-11 | 2015-08-12 | High Mead Developments Ltd | Electronic guard systems |
US9318112B2 (en) | 2014-02-14 | 2016-04-19 | Google Inc. | Recognizing speech in the presence of additional audio |
KR20150101088A (ko) | 2014-02-26 | 2015-09-03 | (주) 에핀 | Method for acquiring and providing three-dimensional images |
JP2015184563A (ja) * | 2014-03-25 | 2015-10-22 | シャープ株式会社 | Interactive home appliance system, server device, interactive home appliance, method for a home appliance system to conduct a dialogue, and program for causing a computer to implement the method |
US9293141B2 (en) | 2014-03-27 | 2016-03-22 | Storz Endoskop Produktions Gmbh | Multi-user voice control system for medical devices |
US9710546B2 (en) | 2014-03-28 | 2017-07-18 | Microsoft Technology Licensing, Llc | Explicit signals personalized search |
US9372851B2 (en) | 2014-04-01 | 2016-06-21 | Microsoft Technology Licensing, Llc | Creating a calendar event using context |
WO2015162458A1 (en) | 2014-04-24 | 2015-10-29 | Singapore Telecommunications Limited | Knowledge model for personalization and location services |
US10235567B2 (en) | 2014-05-15 | 2019-03-19 | Fenwal, Inc. | Head mounted display device for use in a medical facility |
KR102216048B1 (ko) | 2014-05-20 | 2021-02-15 | 삼성전자주식회사 | Apparatus and method for recognizing voice commands |
US10726831B2 (en) | 2014-05-20 | 2020-07-28 | Amazon Technologies, Inc. | Context interpretation in natural language processing using previous dialog acts |
WO2015183014A1 (en) | 2014-05-28 | 2015-12-03 | Samsung Electronics Co., Ltd. | Apparatus and method for controlling internet of things devices |
US9384738B2 (en) | 2014-06-24 | 2016-07-05 | Google Inc. | Dynamic threshold for speaker verification |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US9257120B1 (en) | 2014-07-18 | 2016-02-09 | Google Inc. | Speaker verification using co-location information |
EP3170062B1 (en) | 2014-07-18 | 2019-08-21 | Apple Inc. | Raise gesture detection in a device |
US20170194000A1 (en) | 2014-07-23 | 2017-07-06 | Mitsubishi Electric Corporation | Speech recognition device and speech recognition method |
US9916520B2 (en) * | 2014-09-03 | 2018-03-13 | Sri International | Automated food recognition and nutritional estimation with a personal mobile electronic device |
US9508341B1 (en) | 2014-09-03 | 2016-11-29 | Amazon Technologies, Inc. | Active learning for lexical annotations |
US10789041B2 (en) * | 2014-09-12 | 2020-09-29 | Apple Inc. | Dynamic thresholds for always listening speech trigger |
WO2016043005A1 (ja) | 2014-09-17 | 2016-03-24 | 富士フイルム株式会社 | Pattern forming method, method for manufacturing an electronic device, electronic device, block copolymer, and method for producing a block copolymer |
US10216996B2 (en) | 2014-09-29 | 2019-02-26 | Sony Interactive Entertainment Inc. | Schemes for retrieving and associating content items with real-world objects using augmented reality and object recognition |
US9378740B1 (en) | 2014-09-30 | 2016-06-28 | Amazon Technologies, Inc. | Command suggestions during automatic speech recognition |
US9812128B2 (en) | 2014-10-09 | 2017-11-07 | Google Inc. | Device leadership negotiation among voice interface devices |
US9318107B1 (en) * | 2014-10-09 | 2016-04-19 | Google Inc. | Hotword detection on multiple devices |
EP3207467A4 (en) | 2014-10-15 | 2018-05-23 | VoiceBox Technologies Corporation | System and method for providing follow-up responses to prior natural language inputs of a user |
CN105575392A (zh) * | 2014-10-28 | 2016-05-11 | 福特全球技术公司 | System and method for user interaction |
US9507977B1 (en) | 2014-11-03 | 2016-11-29 | Vsn Technologies, Inc. | Enabling proximate host assisted location tracking of a short range wireless low power locator tag |
EP3021178B1 (en) | 2014-11-14 | 2020-02-19 | Caterpillar Inc. | System using radar apparatus for assisting a user of a machine of a kind comprising a body and an implement |
KR102332752B1 (ko) | 2014-11-24 | 2021-11-30 | 삼성전자주식회사 | Electronic device and method for providing a map service |
WO2016084071A1 (en) | 2014-11-24 | 2016-06-02 | Isityou Ltd. | Systems and methods for recognition of faces e.g. from mobile-device-generated images of faces |
US9812126B2 (en) | 2014-11-28 | 2017-11-07 | Microsoft Technology Licensing, Llc | Device arbitration for listening devices |
US9626352B2 (en) | 2014-12-02 | 2017-04-18 | International Business Machines Corporation | Inter thread anaphora resolution |
US10091015B2 (en) | 2014-12-16 | 2018-10-02 | Microsoft Technology Licensing, Llc | 3D mapping of internet of things devices |
US9690361B2 (en) | 2014-12-24 | 2017-06-27 | Intel Corporation | Low-power context-aware control for analog frontend |
US9959129B2 (en) | 2015-01-09 | 2018-05-01 | Microsoft Technology Licensing, Llc | Headless task completion within digital personal assistants |
US20160202957A1 (en) | 2015-01-13 | 2016-07-14 | Microsoft Technology Licensing, Llc | Reactive agent development environment |
US10169535B2 (en) | 2015-01-16 | 2019-01-01 | The University Of Maryland, Baltimore County | Annotation of endoscopic video using gesture and voice commands |
US10142484B2 (en) | 2015-02-09 | 2018-11-27 | Dolby Laboratories Licensing Corporation | Nearby talker obscuring, duplicate dialogue amelioration and automatic muting of acoustically proximate participants |
US9691391B2 (en) | 2015-02-10 | 2017-06-27 | Knuedge Incorporated | Clustering of audio files using graphs |
US9769564B2 (en) | 2015-02-11 | 2017-09-19 | Google Inc. | Methods, systems, and media for ambient background noise modification based on mood and/or behavior information |
WO2016132729A1 (ja) | 2015-02-17 | 2016-08-25 | 日本電気株式会社 | Robot control device, robot, robot control method, and program recording medium |
US10152299B2 (en) | 2015-03-06 | 2018-12-11 | Apple Inc. | Reducing response latency of intelligent automated assistants |
US10133538B2 (en) | 2015-03-27 | 2018-11-20 | Sri International | Semi-supervised speaker diarization |
JP6669162B2 (ja) | 2015-03-31 | 2020-03-18 | ソニー株式会社 | Information processing apparatus, control method, and program |
GB201505864D0 (en) | 2015-04-07 | 2015-05-20 | Ipv Ltd | Live markers |
US9300925B1 (en) * | 2015-05-04 | 2016-03-29 | Jack Ke Zhang | Managing multi-user access to controlled locations in a facility |
US10097973B2 (en) | 2015-05-27 | 2018-10-09 | Apple Inc. | Systems and methods for proactively identifying and surfacing relevant content on a touch-sensitive device |
US10186254B2 (en) | 2015-06-07 | 2019-01-22 | Apple Inc. | Context-based endpoint detection |
CN107851311B (zh) | 2015-06-15 | 2023-01-13 | 前视红外系统股份公司 | Contrast-enhanced combined image generation system and method |
US10178301B1 (en) | 2015-06-25 | 2019-01-08 | Amazon Technologies, Inc. | User identification based on voice and face |
CN105070288B (zh) | 2015-07-02 | 2018-08-07 | 百度在线网络技术(北京)有限公司 | In-vehicle voice command recognition method and device |
US10206068B2 (en) | 2015-07-09 | 2019-02-12 | OneMarket Network LLC | Systems and methods to determine a location of a mobile device |
US10867256B2 (en) | 2015-07-17 | 2020-12-15 | Knoema Corporation | Method and system to provide related data |
US20170032021A1 (en) | 2015-07-27 | 2017-02-02 | Investor's Forum | Chat room for managing multiple conversation streams |
US10331312B2 (en) | 2015-09-08 | 2019-06-25 | Apple Inc. | Intelligent automated assistant in a media environment |
US10026399B2 (en) * | 2015-09-11 | 2018-07-17 | Amazon Technologies, Inc. | Arbitration between voice-enabled devices |
US9875081B2 (en) * | 2015-09-21 | 2018-01-23 | Amazon Technologies, Inc. | Device selection for providing a response |
US9653075B1 (en) | 2015-11-06 | 2017-05-16 | Google Inc. | Voice commands across devices |
US9940934B2 (en) | 2015-11-18 | 2018-04-10 | Uniphone Software Systems | Adaptive voice authentication system and method |
US11144964B2 (en) * | 2015-11-20 | 2021-10-12 | Voicemonk Inc. | System for assisting in marketing |
US20170078573A1 (en) | 2015-11-27 | 2017-03-16 | Mediatek Inc. | Adaptive Power Saving For Multi-Frame Processing |
CN105389307A (zh) | 2015-12-02 | 2016-03-09 | 上海智臻智能网络科技股份有限公司 | Method and device for recognizing sentence intent categories |
CN105611500A (zh) | 2015-12-07 | 2016-05-25 | 苏州触达信息技术有限公司 | Positioning system and method within a predetermined space |
WO2017112813A1 (en) | 2015-12-22 | 2017-06-29 | Sri International | Multi-lingual virtual personal assistant |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
TWI571833B (zh) | 2015-12-23 | 2017-02-21 | 群暉科技股份有限公司 | Monitoring service device, computer program product, method for providing services through image monitoring, and method for enabling services through image monitoring |
US10599390B1 (en) | 2015-12-28 | 2020-03-24 | Amazon Technologies, Inc. | Methods and systems for providing multi-user recommendations |
KR102392113B1 (ko) * | 2016-01-20 | 2022-04-29 | 삼성전자주식회사 | Electronic device and method for processing voice commands of the electronic device |
US9912977B2 (en) | 2016-02-04 | 2018-03-06 | The Directv Group, Inc. | Method and system for controlling a user receiving device using voice commands |
US9898250B1 (en) | 2016-02-12 | 2018-02-20 | Amazon Technologies, Inc. | Controlling distributed audio outputs to enable voice output |
US9858927B2 (en) | 2016-02-12 | 2018-01-02 | Amazon Technologies, Inc | Processing spoken commands to control distributed audio outputs |
US10095470B2 (en) | 2016-02-22 | 2018-10-09 | Sonos, Inc. | Audio response playback |
US9965247B2 (en) | 2016-02-22 | 2018-05-08 | Sonos, Inc. | Voice controlled media playback system based on user profile |
US20170249309A1 (en) | 2016-02-29 | 2017-08-31 | Microsoft Technology Licensing, Llc | Interpreting and Resolving Conditional Natural Language Queries |
US20190057703A1 (en) | 2016-02-29 | 2019-02-21 | Faraday&Future Inc. | Voice assistance system for devices of an ecosystem |
US20170255450A1 (en) | 2016-03-04 | 2017-09-07 | Daqri, Llc | Spatial cooperative programming language |
US10133612B2 (en) | 2016-03-17 | 2018-11-20 | Nuance Communications, Inc. | Session processing interaction between two or more virtual assistants |
KR102537543B1 (ko) | 2016-03-24 | 2023-05-26 | 삼성전자주식회사 | Intelligent electronic device and operation method thereof |
JP6409206B2 (ja) | 2016-03-28 | 2018-10-24 | Groove X株式会社 | Autonomous behavior robot that performs greeting behavior |
US9972322B2 (en) | 2016-03-29 | 2018-05-15 | Intel Corporation | Speaker recognition using adaptive thresholding |
US9749583B1 (en) | 2016-03-31 | 2017-08-29 | Amazon Technologies, Inc. | Location based device grouping with voice control |
US20170315208A1 (en) | 2016-05-02 | 2017-11-02 | Mojix, Inc. | Joint Entity and Object Tracking Using an RFID and Detection Network |
US10430426B2 (en) | 2016-05-03 | 2019-10-01 | International Business Machines Corporation | Response effectiveness determination in a question/answer system |
CN105810194B (zh) * | 2016-05-11 | 2019-07-05 | 北京奇虎科技有限公司 | Method for acquiring voice control information in standby state and smart terminal |
US11210324B2 (en) | 2016-06-03 | 2021-12-28 | Microsoft Technology Licensing, Llc | Relation extraction across sentence boundaries |
DK179309B1 (en) | 2016-06-09 | 2018-04-23 | Apple Inc | Intelligent automated assistant in a home environment |
US9584946B1 (en) | 2016-06-10 | 2017-02-28 | Philip Scott Lyren | Audio diarization system that segments audio input |
JP2018008489A (ja) * | 2016-07-15 | 2018-01-18 | 富士ゼロックス株式会社 | Information processing apparatus, information processing system, and information processing program |
US10462545B2 (en) | 2016-07-27 | 2019-10-29 | Amazon Technologies, Inc. | Voice activated electronic device |
US10026403B2 (en) | 2016-08-12 | 2018-07-17 | Paypal, Inc. | Location based voice association system |
CN106157952B (zh) | 2016-08-30 | 2019-09-17 | 北京小米移动软件有限公司 | Voice recognition method and device |
CN106340299A (zh) | 2016-09-21 | 2017-01-18 | 成都创慧科达科技有限公司 | Speaker recognition system and method for complex environments |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US10455200B2 (en) | 2016-09-26 | 2019-10-22 | 3 Strike, Llc | Storage container with inventory control |
US10283138B2 (en) | 2016-10-03 | 2019-05-07 | Google Llc | Noise mitigation for a voice interface device |
US10552742B2 (en) | 2016-10-14 | 2020-02-04 | Google Llc | Proactive virtual assistant |
US10482885B1 (en) * | 2016-11-15 | 2019-11-19 | Amazon Technologies, Inc. | Speaker based anaphora resolution |
US10332523B2 (en) * | 2016-11-18 | 2019-06-25 | Google Llc | Virtual assistant identification of nearby computing devices |
US10134396B2 (en) | 2016-12-07 | 2018-11-20 | Google Llc | Preventing of audio attacks |
US10276149B1 (en) | 2016-12-21 | 2019-04-30 | Amazon Technologies, Inc. | Dynamic text-to-speech output |
US10713317B2 (en) | 2017-01-30 | 2020-07-14 | Adobe Inc. | Conversational agent for search |
US20180293221A1 (en) | 2017-02-14 | 2018-10-11 | Microsoft Technology Licensing, Llc | Speech parsing with intelligent assistant |
US11010601B2 (en) | 2017-02-14 | 2021-05-18 | Microsoft Technology Licensing, Llc | Intelligent assistant device communicating non-verbal cues |
US11100384B2 (en) | 2017-02-14 | 2021-08-24 | Microsoft Technology Licensing, Llc | Intelligent device user interactions |
US10467509B2 (en) | 2017-02-14 | 2019-11-05 | Microsoft Technology Licensing, Llc | Computationally-efficient human-identifying smart assistant computer |
US20190236416A1 (en) | 2018-01-31 | 2019-08-01 | Microsoft Technology Licensing, Llc | Artificial intelligence system utilizing microphone array and fisheye camera |
2017
- 2017-06-28 US US15/636,422 patent/US10467509B2/en active Active
- 2017-06-28 US US15/636,559 patent/US10467510B2/en active Active
- 2017-06-30 US US15/640,113 patent/US10957311B2/en active Active
- 2017-06-30 US US15/640,201 patent/US11004446B2/en active Active
- 2017-06-30 US US15/640,251 patent/US10984782B2/en active Active
- 2017-07-11 US US15/646,871 patent/US20180233140A1/en not_active Abandoned
- 2017-07-21 US US15/657,031 patent/US10496905B2/en active Active
- 2017-07-21 US US15/656,994 patent/US10460215B2/en active Active
- 2017-07-24 US US15/657,822 patent/US11194998B2/en active Active
- 2017-08-21 US US15/682,407 patent/US10628714B2/en active Active
- 2017-08-21 US US15/682,425 patent/US10579912B2/en active Active
- 2017-12-05 US US15/832,656 patent/US10817760B2/en active Active
- 2017-12-05 US US15/832,672 patent/US10824921B2/en active Active
2018
- 2018-02-07 CN CN201880011578.3A patent/CN110291760B/zh active Active
- 2018-02-07 EP EP18706104.9A patent/EP3583485B1/en active Active
- 2018-02-07 WO PCT/US2018/017139 patent/WO2018151979A1/en active Application Filing
- 2018-02-07 CN CN201880011716.8A patent/CN110291489B/zh active Active
- 2018-02-07 WO PCT/US2018/017140 patent/WO2018151980A1/en unknown
- 2018-02-09 EP EP18706619.6A patent/EP3583746A1/en not_active Withdrawn
- 2018-02-09 CN CN201880011967.6A patent/CN110301118B/zh active Active
- 2018-02-09 EP EP18708508.9A patent/EP3583749B1/en active Active
- 2018-02-09 EP EP18706621.2A patent/EP3583595A1/en not_active Withdrawn
- 2018-02-09 WO PCT/US2018/017509 patent/WO2018152008A1/en unknown
- 2018-02-09 WO PCT/US2018/017508 patent/WO2018152007A1/en unknown
- 2018-02-09 CN CN201880011910.6A patent/CN110300946B/zh active Active
- 2018-02-09 EP EP18706370.6A patent/EP3583489A1/en not_active Ceased
- 2018-02-09 WO PCT/US2018/017513 patent/WO2018152012A1/en active Application Filing
- 2018-02-09 WO PCT/US2018/017515 patent/WO2018152014A1/en active Application Filing
- 2018-02-09 EP EP18707800.1A patent/EP3583748B1/en active Active
- 2018-02-09 CN CN201880011970.8A patent/CN110313153B/zh active Active
- 2018-02-09 CN CN201880012028.3A patent/CN110313154B/zh active Active
- 2018-02-09 EP EP22153942.2A patent/EP4027234A1/en active Pending
- 2018-02-09 EP EP18707798.7A patent/EP3583497B1/en active Active
- 2018-02-09 CN CN201880011946.4A patent/CN110313152B/zh active Active
- 2018-02-09 WO PCT/US2018/017517 patent/WO2018152016A1/en unknown
- 2018-02-09 WO PCT/US2018/017510 patent/WO2018152009A1/en active Application Filing
- 2018-02-09 CN CN201880011917.8A patent/CN110383235A/zh active Pending
- 2018-02-09 EP EP18706620.4A patent/EP3583747B1/en active Active
- 2018-02-09 CN CN201880011885.1A patent/CN110326261A/zh not_active Withdrawn
- 2018-02-09 CN CN202111348785.8A patent/CN113986016B/zh active Active
- 2018-02-09 WO PCT/US2018/017511 patent/WO2018152010A1/en unknown
- 2018-02-09 WO PCT/US2018/017512 patent/WO2018152011A1/en unknown
- 2018-02-09 CN CN201880011965.7A patent/CN110326041B/zh active Active
- 2018-02-09 WO PCT/US2018/017514 patent/WO2018152013A1/en unknown
- 2018-02-09 WO PCT/US2018/017506 patent/WO2018152006A1/en unknown
2019
- 2019-09-17 US US16/573,677 patent/US10621478B2/en active Active
- 2019-10-11 US US16/599,426 patent/US11126825B2/en active Active
- 2019-12-02 US US16/700,308 patent/US11017765B2/en active Active
2021
- 2021-09-27 US US17/449,054 patent/US20220012470A1/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102510426A (zh) * | 2011-11-29 | 2012-06-20 | 安徽科大讯飞信息科技股份有限公司 | Personal assistant application access method and system |
CN104423563A (zh) * | 2013-09-10 | 2015-03-18 | 智高实业股份有限公司 | Contactless real-time interaction method and system |
CN104462175A (zh) * | 2013-09-20 | 2015-03-25 | 国际商业机器公司 | Method and system for creating an integrated user interface using linked data |
US20150172285A1 (en) * | 2013-12-17 | 2015-06-18 | Mei Ling LO | Method for Accessing E-Mail System |
Non-Patent Citations (1)
Title |
---|
JOHN PATRICK PULLEN: "《Amazon Echo Tip: How to Add Multiple Users》", 《TIME》 *
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI792693B (zh) * | 2021-11-18 | 2023-02-11 | 瑞昱半導體股份有限公司 | Method and device for person re-identification |
Also Published As
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110313152A (zh) | User registration for intelligent assistant computer | |
US11100384B2 (en) | Intelligent device user interactions | |
US20180293221A1 (en) | Speech parsing with intelligent assistant | |
EP3776173A1 (en) | Intelligent device user interactions | |
WO2019118147A1 (en) | Speech parsing with intelligent assistant |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||