US20200302970A1 - Automatic narration of signal segment - Google Patents

Automatic narration of signal segment

Info

Publication number
US20200302970A1
Authority
US
United States
Prior art keywords
signal segment
physical
physical entities
sensed
computing system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US16/895,725
Inventor
Vijay Mital
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing LLC filed Critical Microsoft Technology Licensing LLC
Priority to US16/895,725
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC reassignment MICROSOFT TECHNOLOGY LICENSING, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: MITAL, VIJAY
Publication of US20200302970A1

Classifications

    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • G06K9/00751
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V20/47Detecting features for summarising video content
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09BEDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00Teaching not covered by other main groups of this subclass
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09BEDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00Electrically-operated educational appliances
    • G09B5/04Electrically-operated educational appliances with audible presentation of the material to be studied
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09BEDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B21/00Teaching, or communicating with, the blind, deaf or mute
    • G09B21/001Teaching or communicating with blind persons
    • G09B21/006Teaching or communicating with blind persons using audible presentation of the information

Definitions

  • computing systems and associated networks have greatly revolutionized our world. At first, computing systems were only able to perform simple tasks. However, as processing power has increased and become increasingly available, the complexity of tasks performed by a computing system has greatly increased. Likewise, the hardware complexity and capability of computing systems has greatly increased, as exemplified with cloud computing that is supported by large data centers.
  • computing systems just did essentially what they were told by their instructions or software.
  • software and the employment of hardware is becoming so advanced that computing systems are now, more than ever before, capable of some level of decision making at higher levels.
  • the level of decision making can approach, rival, or even exceed the capability of the human brain to make decisions.
  • computing systems are now capable of employing some level of artificial intelligence.
  • one aspect of artificial intelligence is the recognition of external stimuli from the physical world.
  • voice recognition technology has improved greatly, allowing for a high degree of accuracy in detecting words that are being spoken, and even the identity of the person that is speaking.
  • computer vision allows computing systems to automatically identify objects within a particular picture or frame of video, or recognize human activity across a series of video frames.
  • face recognition technology allows computing systems to recognize faces
  • activity recognition technology allows computing systems to know whether two proximate people are working together.
  • Each of these technologies may employ deep learning (Deep Neural Network-based and reinforcement-based learning mechanisms) and machine learning algorithms to learn from experience what is making a sound, and objects or people that are within an image, thereby improving accuracy of recognition over time.
  • advanced computer vision technology now exceeds the capability of a human being to quickly and accurately recognize objects of interest within a scene.
  • Hardware, such as matrix transformation hardware in conventional graphical processing units (GPUs), may also contribute to the speed of object recognition in the context of deep neural networks.
  • At least some embodiments described herein relate to automatic generation of a narration of what is happening in a signal segment (live or recorded).
  • the signal segment that is to be narrated is accessed from a physical graph.
  • the signal segment evidences the state of one or more physical entities, and thus the system has a semantic understanding of what is depicted in the signal segment.
  • the system then automatically determines how one or more physical entities are acting within the signal segment based on that semantic understanding, and builds a narration of the activities based on the determined actions.
  • the system may determine what is interesting for narration based on a wide variety of criteria, such as whether an action is happening repeatedly, what has not changed in the signal segment, what is occurring constantly in the signal, what portion(s) of the signal segment have been shared in the past (or currently), user instructions, and so forth.
  • the system could use machine learning to determine what will be interesting to narrate.
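  • As an illustrative aside, the following is a minimal Python sketch of how such interest-based selection and narration might be wired together. All names (Action, score_action, narrate) and the scoring weights are hypothetical assumptions for illustration, not the claimed implementation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Action:
    """A hypothetical action detected in a signal segment via the physical graph."""
    entity: str          # e.g., "person-12"
    verb: str            # e.g., "picks up hammer"
    start_s: float
    end_s: float
    repeated: bool = False           # action happens repeatedly in the segment
    changed_state: bool = False      # action changed the state of an entity
    previously_shared: bool = False  # overlapping portion was shared before

def score_action(a: Action) -> float:
    """Toy interest score combining criteria named in the summary above."""
    score = 1.0
    if a.repeated:
        score += 0.5          # repetition suggests salience
    if a.changed_state:
        score += 1.0          # state changes are usually worth narrating
    if a.previously_shared:
        score += 0.75         # previously shared portions interested someone before
    return score

def narrate(actions: List[Action], top_k: int = 3) -> str:
    """Build a simple chronological narration from the most interesting actions."""
    chosen = sorted(actions, key=score_action, reverse=True)[:top_k]
    chosen.sort(key=lambda a: a.start_s)
    return " ".join(f"At {a.start_s:.0f}s, {a.entity} {a.verb}." for a in chosen)

if __name__ == "__main__":
    actions = [
        Action("person-12", "enters the room", 0, 2, changed_state=True),
        Action("person-12", "waves", 3, 4, repeated=True),
        Action("robot-3", "drops a package", 6, 7, changed_state=True, previously_shared=True),
    ]
    print(narrate(actions))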
  • FIG. 1 illustrates an example computer system in which the principles described herein may be employed
  • FIG. 2 illustrates an environment in which the principles described herein may operate, which includes a physical space that includes multiple physical entities and multiple sensors, a recognition component that senses features of physical entities within the physical space, and a feature store that stores sensed features of such physical entities, such that computation and querying may be performed against those features;
  • FIG. 3 illustrates a flowchart of a method for tracking physical entities within a location and may be performed in the environment of FIG. 2 ;
  • FIG. 4 illustrates an entity tracking data structure that may be used to assist in performing the method of FIG. 3 , and which may be used to later perform queries on the tracked physical entities;
  • FIG. 5 illustrates a flowchart of a method for efficiently rendering signal segments of interest;
  • FIG. 6 illustrates a flowchart of a method for controlling creation of or access to information sensed by one or more sensors in a physical space
  • FIG. 7 illustrates a recurring flow showing that in addition to creating a computer-navigable graph of sensed features in the physical space, there may also be pruning of the computer-navigable graph to thereby keep the computer-navigable graph of the real world at a manageable size;
  • FIG. 8 illustrates a flowchart of a method for sharing at least a portion of a signal segment
  • FIG. 9 illustrates a flowchart of a method for automatically generating a narration of what is happening in a signal segment.
  • FIG. 10 illustrates a flowchart of a method for automatically training an actor that is performing or is about to perform an activity.
  • Because the principles described herein operate in the context of a computing system, a computing system will be described with respect to FIG. 1 . Then, the principles of the foundation upon which ambient computing may be performed will be described with respect to FIGS. 2 through 4 . The obtaining of signal segments from the computer-navigable graph will then be described with respect to FIG. 5 . Thereafter, the application of security in the context of ambient computing will be described with respect to FIG. 6 , and the managing of the size of the computer-navigable graph will be described with respect to FIG. 7 . Finally, three implementations that use the semantic understanding provided by the computer-navigable graph (also called herein a “physical graph”) will be described with respect to FIGS. 8 through 10 .
  • Computing systems are now increasingly taking a wide variety of forms.
  • Computing systems may, for example, be handheld devices, appliances, laptop computers, desktop computers, mainframes, distributed computing systems, datacenters, or even devices that have not conventionally been considered a computing system, such as wearables (e.g., glasses, watches, bands, and so forth).
  • the term “computing system” is defined broadly as including any device or system (or combination thereof) that includes at least one physical and tangible processor, and a physical and tangible memory capable of having thereon computer-executable instructions that may be executed by a processor.
  • the memory may take any form and may depend on the nature and form of the computing system.
  • a computing system may be distributed over a network environment and may include multiple constituent computing systems.
  • a computing system 100 typically includes at least one hardware processing unit 102 and memory 104 .
  • the memory 104 may be physical system memory, which may be volatile, non-volatile, or some combination of the two.
  • the term “memory” may also be used herein to refer to non-volatile mass storage such as physical storage media. If the computing system is distributed, the processing, memory and/or storage capability may be distributed as well.
  • the computing system 100 has thereon multiple structures often referred to as an “executable component”.
  • the memory 104 of the computing system 100 is illustrated as including executable component 106 .
  • executable component is the name for a structure that is well understood to one of ordinary skill in the art in the field of computing as being a structure that can be software, hardware, or a combination thereof.
  • the structure of an executable component may include software objects, routines, methods that may be executed on the computing system, whether such an executable component exists in the heap of a computing system, or whether the executable component exists on computer-readable storage media.
  • the structure of the executable component exists on a computer-readable medium such that, when interpreted by one or more processors of a computing system (e.g., by a processor thread), the computing system is caused to perform a function.
  • Such structure may be computer-readable directly by the processors (as is the case if the executable component were binary).
  • the structure may be structured to be interpretable and/or compiled (whether in a single stage or in multiple stages) so as to generate such binary that is directly interpretable by the processors.
  • executable component is also well understood by one of ordinary skill as including structures that are implemented exclusively or near-exclusively in hardware, such as within a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), or any other specialized circuit. Accordingly, the term “executable component” is a term for a structure that is well understood by those of ordinary skill in the art of computing, whether implemented in software, hardware, or a combination. In this description, the term “component” may also be used.
  • embodiments are described with reference to acts that are performed by one or more computing systems. If such acts are implemented in software, one or more processors (of the associated computing system that performs the act) direct the operation of the computing system in response to having executed computer-executable instructions that constitute an executable component.
  • Such computer-executable instructions may be embodied on one or more computer-readable media that form a computer program product.
  • An example of such an operation involves the manipulation of data.
  • the computer-executable instructions may be stored in the memory 104 of the computing system 100 .
  • Computing system 100 may also contain communication channels 108 that allow the computing system 100 to communicate with other computing systems over, for example, network 110 .
  • the computing system 100 includes a user interface 112 for use in interfacing with a user.
  • the user interface 112 may include output mechanisms 112 A as well as input mechanisms 112 B.
  • output mechanisms 112 A might include, for instance, speakers, displays, tactile output, holograms, virtual reality, and so forth.
  • input mechanisms 112 B might include, for instance, microphones, touchscreens, holograms, virtual reality, cameras, keyboards, a mouse or other pointer input, sensors of any type, and so forth.
  • Embodiments described herein may comprise or utilize a special purpose or general-purpose computing system including computer hardware, such as, for example, one or more processors and system memory, as discussed in greater detail below.
  • Embodiments described herein also include physical and other computer-readable media for carrying or storing computer-executable instructions and/or data structures.
  • Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computing system.
  • Computer-readable media that store computer-executable instructions are physical storage media.
  • Computer-readable media that carry computer-executable instructions are transmission media.
  • embodiments can comprise at least two distinctly different kinds of computer-readable media: storage media and transmission media.
  • Computer-readable storage media includes RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other physical and tangible storage medium which can be used to store desired program code means in the form of computer-executable instructions or data structures and which can be accessed by a general purpose or special purpose computing system.
  • a “network” is defined as one or more data links that enable the transport of electronic data between computing systems and/or modules and/or other electronic devices.
  • a network or another communications connection can include a network and/or data links which can be used to carry desired program code means in the form of computer-executable instructions or data structures and which can be accessed by a general purpose or special purpose computing system. Combinations of the above should also be included within the scope of computer-readable media.
  • program code means in the form of computer-executable instructions or data structures can be transferred automatically from transmission media to storage media (or vice versa).
  • computer-executable instructions or data structures received over a network or data link can be buffered in RAM within a network interface module (e.g., a “NIC”), and then eventually transferred to computing system RAM and/or to less volatile storage media at a computing system.
  • computer-readable storage media can be included in computing system components that also (or even primarily) utilize transmission media.
  • Computer-executable instructions comprise, for example, instructions and data which, when executed at a processor, cause a general purpose computing system, special purpose computing system, or special purpose processing device to perform a certain function or group of functions. Alternatively, or in addition, the computer-executable instructions may configure the computing system to perform a certain function or group of functions.
  • the computer executable instructions may be, for example, binaries or even instructions that undergo some translation (such as compilation) before direct execution by the processors, such as intermediate format instructions such as assembly language, or even source code.
  • the invention may be practiced in network computing environments with many types of computing system configurations, including, personal computers, desktop computers, laptop computers, message processors, hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, mobile telephones, PDAs, pagers, routers, switches, datacenters, wearables (such as glasses or watches) and the like.
  • the invention may also be practiced in distributed system environments where local and remote computing systems, which are linked (either by hardwired data links, wireless data links, or by a combination of hardwired and wireless data links) through a network, both perform tasks.
  • program modules may be located in both local and remote memory storage devices.
  • Cloud computing environments may be distributed, although this is not required. When distributed, cloud computing environments may be distributed internationally within an organization and/or have components possessed across multiple organizations.
  • cloud computing is defined as a model for enabling on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services). The definition of “cloud computing” is not limited to any of the other numerous advantages that can be obtained from such a model when properly deployed.
  • cloud computing is currently employed in the marketplace so as to offer ubiquitous and convenient on-demand access to the shared pool of configurable computing resources.
  • the shared pool of configurable computing resources can be rapidly provisioned via virtualization and released with low management effort or service provider interaction, and then scaled accordingly.
  • a cloud computing model can be composed of various characteristics such as on-demand self-service, broad network access, resource pooling, rapid elasticity, measured service, and so forth.
  • a cloud computing model may also come in the form of various service models such as, for example, Software as a Service (“SaaS”), Platform as a Service (“PaaS”), and Infrastructure as a Service (“IaaS”).
  • the cloud computing model may also be deployed using different deployment models such as private cloud, community cloud, public cloud, hybrid cloud, and so forth.
  • a “cloud computing environment” is an environment in which cloud computing is employed.
  • FIG. 2 illustrates an environment 200 in which the principles described herein may operate.
  • the environment 200 includes a physical space 201 that includes multiple physical entities 210 , which may be any extant object, person, or thing that emits or reflects physical signals (such as electromagnetic radiation or acoustics) that have a pattern that may be used to potentially identify one or more physical features (also called herein states) of the respective object, person, or thing.
  • An example of such potentially identifying electromagnetic radiation is visible light that has a light pattern (e.g., a still image or video) from which characteristics of visible entities may be ascertained.
  • Such a light pattern may exist in any temporal, spatial, or even higher-dimensional space.
  • An example of such acoustics may be the voice of a human being, the sound of an object in normal operation or undergoing an activity or event, or a reflected acoustic echo.
  • the environment 200 also includes sensors 220 that receive physical signals from the physical entities 210 .
  • the sensors need not, of course, pick up every physical signal that the physical entity emits or reflects.
  • a visible light camera (still or video), for instance, picks up only visible light.
  • Acoustic sensors likewise have limited dynamic range designed for certain frequency ranges.
  • the sensors 220 provide (as represented by arrow 229 ) resulting sensor signals to a recognition component 230 .
  • the recognition component 230 at least estimates (e.g., estimates or recognizes) one or more features of the physical entities 210 within the location based on patterns detected in the received sensor signals.
  • the recognition component 230 may also generate a confidence level associated with the “at least an estimation” of a feature of the physical entity. If that confidence level is less than 100%, then the “at least an estimation” is just an estimation. If that confidence level is 100%, then the “at least an estimation” is really more than an estimation—it is a recognition.
  • a feature that is “at least estimated” will also be referred to as a “sensed” feature to promote clarity.
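  • For illustration, a minimal sketch (with hypothetical names) of a sensed feature paired with its confidence level, where 100 percent confidence is treated as a recognition and anything less as an estimation:

```python
from dataclasses import dataclass

@dataclass
class SensedFeature:
    """Hypothetical record pairing a feature value with the recognizer's confidence."""
    name: str           # e.g., "is_human" or "identity"
    value: object       # e.g., True, or "John Doe"
    confidence: float   # 0.0 .. 1.0

    @property
    def is_recognition(self) -> bool:
        # Per the text above: 100% confidence is a recognition, anything less is an estimation.
        return self.confidence >= 1.0

f = SensedFeature("identity", "John Doe", 0.3)
print("recognition" if f.is_recognition else "estimation")  # -> estimation
```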
  • the recognition component 230 may employ deep learning (Deep Neural Network-based and reinforcement-based learning mechanisms) and machine learning algorithms to learn from experience what objects or people that are within an image, thereby improving accuracy of recognition over time.
  • the recognition component 230 provides (as represented by arrow 239 ) the sensed features into a sensed feature store 240 , which can store the sensed features (and associated confidence levels) for each physical entity within the location 201 , whether the physical entity is within the physical space for a short time, a long time, or permanently.
  • the computation component 250 may then perform a variety of queries and/or computations on the sensed feature data provided in sensed feature store 240 .
  • the queries and/or computations may be enabled by interactions (represented by arrow 249 ) between the computation component 250 and the sensed feature store 240 .
  • the recognition component 230 senses a sensed feature of a physical entity within the location 201 using sensor signal(s) provided by a sensor
  • the sensor signals are also provided to a store, such as the sensed feature store.
  • the sensed feature store 240 is illustrated as including sensed features 241 as well as the corresponding sensor signals 242 that represent the evidence of the sensed features.
  • At least one signal segment is computer-associated with the sensed feature such that computer-navigation to the sensed feature also allows for computer-navigation to the signal segment.
  • the association of the sensed signal with the associated signal segment may be performed continuously, thus resulting in an expanding graph, and an expanding collection of signal segments. That said, as described further below, garbage collection processes may be used to clean up sensed features and/or signal segments that are outdated or no longer of interest.
  • the signal segment may include multiple pieces of metadata such as, for instance, an identification of the sensor or sensors that generated the signal segment.
  • the signal segment need not include all of the signals that were generated by that sensor, and for brevity, may perhaps include only those portions of the signal that were used to sense the sensed feature of the particular physical entity.
  • the metadata may include a description of the portion of the original signal segment that was stored.
  • the sensed signal may be any type of signal that is generated by a sensor. Examples include video, image, and audio signals. However, the variety of signals is not limited to those that can be sensed by a human being. For instance, the signal segment might represent a transformed version of the signal generated by the sensor to allow for human observation or better human focus. Such transformations might include filtering, such as filtering based on frequency, or quantization. Such transformations might also include amplification, frequency shifting, speed adjustment, magnification, amplitude adjustment, and so forth.
  • in some cases, only a portion of the signal segment is stored. For instance, if the sensor signal was a video signal, perhaps only a portion of the frames of the video are stored, and for any given frame, perhaps only the relevant portion of that frame. Likewise, if the sensor signal was an image, perhaps only the relevant portion of the image is stored.
  • the recognition service that uses the signal segment to sense a feature is aware of which portion of the signal segment was used to sense that feature. Accordingly, a recognition service can specifically carve out the relevant portion of the signal for any given sensed feature.
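  • As a sketch of the bookkeeping just described, the following hypothetical SignalSegment record carries the sensor identification, a description of the retained portion of the original signal, and any transformation applied; the class, fields, and carve helper are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class SignalSegment:
    """Hypothetical stored segment: only the portion used to sense a feature is kept."""
    sensor_id: str                       # which sensor produced the signal
    media_type: str                      # "video", "image", "audio", ...
    time_range_s: Tuple[float, float]    # portion of the original signal retained
    crop: Optional[Tuple[int, int, int, int]] = None  # (x, y, w, h) for image/video frames
    transform: Optional[str] = None      # e.g., "bandpass 200-4000Hz", "2x-speed"
    data: bytes = b""                    # the carved-out signal itself

def carve(sensor_id: str, full_signal: bytes, start_s: float, end_s: float,
          crop: Optional[Tuple[int, int, int, int]] = None) -> SignalSegment:
    """Pretend to extract just the relevant portion of a sensor signal.

    A real recognition service would slice frames/samples; here we only record
    the metadata the description mentions.
    """
    return SignalSegment(sensor_id=sensor_id, media_type="video",
                         time_range_s=(start_s, end_s), crop=crop,
                         data=full_signal)  # placeholder: no real slicing

segment = carve("camera-7", b"...", 12.0, 14.5, crop=(100, 80, 320, 240))
print(segment.sensor_id, segment.time_range_s, segment.crop)
```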
  • the computation component 250 may also have a security component 251 that may determine access to data within the sensed feature store 240 .
  • the security component 251 may control which users may access the sensed feature data 241 and/or the sensor signals 242 .
  • the security component 251 may even control which of the sensed feature data computations are performed over, and/or which users are authorized to perform which types of computations or queries. Thus, security is effectively achieved. More regarding this security will be described below with respect to FIG. 6 .
  • the sensed feature data represents the sensed features of the physical entities within the physical space 201 over time
  • complex computing may be performed on the physical entities within the physical space 201 .
  • the very environment itself is filled with helpful computing power that is getting ready for any computing query or computation regarding that physical space. This will be referred to hereinafter also as “ambient computing”.
  • the evidence supporting the recognition component's sensing of that feature may be reconstructed.
  • the computing component 240 might provide video evidence of when a particular physical entity first entered a particular location. If multiple sensors generated sensor signals that were used by the recognition component to sense that feature, then the sensor signals for any individual sensor or combination of sensors may be reconstructed and evaluated. Thus, for instance, the video evidence of the physical entity first entering a particular location may be reviewed from different angles.
  • the physical space 201 is illustrated in FIG. 2 and is intended just to be an abstract representation of any physical space that has sensors in it. There are infinite examples of such physical spaces, but examples include a room, a house, a neighborhood, a factory, a stadium, a building, a floor, an office, a car, an airplane, a spacecraft, a Petri dish, a pipe or tube, the atmosphere, underground spaces, caves, land, combinations and/or portions thereof.
  • the physical space 201 may be the entirety of the observable universe or any portion thereof so long as there are sensors capable of receiving signals emitted from, affected by (e.g., diffraction, frequency shifting, echoes, etc.), and/or reflected from the physical entities within the location.
  • the physical entities 210 within the physical space 201 are illustrated as including four physical entities 211 , 212 , 213 and 214 by way of example only.
  • the ellipses 215 represent that there may be any number and variety of physical entities having features that are being sensed based on data from the sensors 220 .
  • the ellipses 215 also represent that physical entities may exit and enter the location 201 . Thus, the number and identity of physical entities within the location 201 may change over time.
  • the position of the physical entities may also vary over time. Though the position of the physical entities is shown in the upper portion of the physical space 201 in FIG. 2 , this is simply for purpose of clear labelling. The principles described herein are not dependent on any particular physical entity occupying any particular physical position within the physical space 201 .
  • the physical entities 210 are illustrated as triangles and the sensors 220 are illustrated as circles.
  • the physical entities 210 and the sensors 220 may, of course, have any physical shape or size. Physical entities typically are not triangular in shape, and sensors are typically not circular in shape. Furthermore, sensors 220 may observe physical entities within a physical space 201 without regard for whether or not those sensors 220 are physically located within that physical space 201 .
  • the sensors 220 within the physical space 201 are illustrated as including two sensors 221 and 222 by way of example only.
  • the ellipses 223 represent that there may be any number and variety of sensors that are capable of receiving signals emitted, affected (e.g., via diffraction, frequency shifting, echoes, etc.) and/or reflected by the physical entities within the physical space.
  • the number and capability of operable sensors may change over time as sensors within the physical space are added, removed, upgraded, broken, replaced, and so forth.
  • FIG. 3 illustrates a flowchart of a method 300 for tracking physical entities within a physical space. Since the method 300 may be performed to track the physical entities 210 within the physical space 201 of FIG. 2 , the method 300 of FIG. 3 will now be described with frequent reference to the environment 200 of FIG. 2 .
  • FIG. 4 illustrates an entity tracking data structure 400 that may be used to assist in performing the method 300 , and which may be used to later perform queries on the tracked physical entities, and perhaps also to access and review the sensor signals associated with the tracked physical entities.
  • the entity tracking data structure 400 may be stored in the sensed feature store 240 of FIG. 2 (which is represented as sensed feature data 241 ). Accordingly, the method 300 of FIG. 3 will also be described with frequent reference to the entity tracking data structure 400 of FIG. 4 .
  • a space-time data structure for the physical space is set up (act 301 ).
  • This may be a distributed data structure or a non-distributed data structure.
  • FIG. 4 illustrates an example of an entity tracking data structure 400 that includes a space-time data structure 401 .
  • This entity tracking data structure 400 may be included within the sensed feature store 240 of FIG. 2 as sensed feature data 241 . While the principles described herein are described with respect to tracking physical entities, and their sensed features and activities, the principles described herein may operate to track physical entities (and their sensed features and activities) within more than one location.
  • the space-time data structure 401 is not the root node in the tree represented by the entity tracking data structure 400 (as symbolized by the ellipses 402 A and 402 B). Rather there may be multiple space-time data structures that may be interconnected via a common root node.
  • the content of box 310 A may be performed for each of multiple physical entities (e.g., physical entities 210 ) that are at least temporarily within a physical space (e.g., physical space 201 ).
  • the content of box 310 B is illustrated as being nested within box 310 A, and represents that its content may be performed at each of multiple times for a given physical entity.
  • a complex entity tracking data structure 400 may be created and grown, to thereby record the sensed features of physical entities that are one or more times within the location.
  • the entity tracking data structure 400 may potentially also be used to access the sensed signals that resulted in certain sensed features (or feature changes) being recognized.
  • a physical entity is sensed by one or more sensors (act 311 ).
  • one or more physical signals emitted from, affected by (e.g., via diffraction, frequency shifting, echoes, etc.), and/or reflected from the physical entity is received by one or more of the sensors.
  • referring to FIG. 2 , suppose that physical entity 211 has one or more features that are sensed by both sensors 221 and 222 at a particular time.
  • the recognition component 230 may have a security component 231 that, according to particular settings, may refuse to record sensed features associated with particular physical entities, sensed features of a particular type, and/or that were sensed from sensor signals generated at a particular time, or combinations thereof. For instance, perhaps the recognition component 230 will not record sensed features of any people that are within the location. As a more fine-grained example, perhaps the recognition component 230 will not record sensed features of a set of people, where those sensed features relate to an identity or gender of the person, and where those sensed features resulted from sensor signals that were generated at particular time frames. More regarding this security will again be described below with respect to FIG. 6 .
  • an at least approximation of that particular time at which the physical entity was sensed is represented within an entity data structure that corresponds to the physical entity and that is computing-associated with the space-time data structure (act 312 ).
  • the entity data structure 410 A may correspond to the physical entity 211 and is computing-associated (as represented by line 430 A) with the space-time data structure 401 .
  • one node of a data structure is “computing-associated” with another node of a data structure if a computing system is, by whatever means, able to detect an association between the two nodes.
  • the use of pointers is one mechanism for computing-association.
  • a node of a data structure may also be computing-associated by being included within the other node of the data structure, and by any other mechanism recognized by a computing system as being an association.
  • the time data 411 represents an at least approximation of the time that the physical entity was sensed (at least at this time iteration of the content of box 310 B) within the entity data structure 410 A.
  • the time may be a real time (e.g., expressed with respect to an atomic clock), or may be an artificial time.
  • the artificial time may be a time that is offset from real-time and/or expressed in a different manner than real time (e.g., number of seconds or minutes since the last turn of the millennium).
  • the artificial time may also be a logical time, such as a time that is expressed by a monotonically increasing number that increments at each sensing.
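  • For illustration, a small sketch contrasting a logical clock (a monotonically increasing number that increments at each sensing) with real time; the LogicalClock class is a hypothetical example, not part of the described system.

```python
import itertools
import time

class LogicalClock:
    """Monotonically increasing counter that increments at each sensing event,
    one of the 'artificial time' options mentioned above (hypothetical sketch)."""
    def __init__(self):
        self._counter = itertools.count(1)

    def now(self) -> int:
        return next(self._counter)

clock = LogicalClock()
print(clock.now(), clock.now())   # 1 2  (logical time)
print(time.time())                # real time, e.g., seconds since the epoch
```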
  • the environment senses at least one physical feature (and perhaps multiple) of the particular physical entity in the state in which the particular physical entity exists at the particular time (act 313 ).
  • the recognition component 230 may sense at least one physical feature of the physical entity 211 based on the signals received from the sensors 221 and 222 (e.g., as represented by arrow 229 ).
  • the sensed at least one physical feature of the particular physical entity is then represented in the entity data structure (act 314 ) in a manner computing-associated with the at least approximation of the particular time.
  • the sensed feature data is provided (as represented by arrow 239 ) to the sensed feature store 240 .
  • this sensed feature data may be provided along with the at least approximation of the particular time so as to modify the entity tracking data structure 400 in substantially one act.
  • act 312 and act 314 may be performed at substantially the same time to reduce write operations into the sensed feature store 240 .
  • the sensor signal(s) that the recognition component relied upon to sense the sensed feature are recorded in a manner that is computer-associated with the sensed feature (act 315 ).
  • for instance, the sensor signal(s) may be recorded in a manner that is computer-associated with the sensed feature that is in the sensed feature data 241 (e.g., in the space-time data structure 401 ).
  • the first entity data structure now has sensed feature data 421 that is computing-associated with time 411 .
  • the sensed feature data 421 includes two sensed physical features 421 A and 421 B of the physical entity.
  • the ellipses 421 C represents that there may be any number of sensed features of the physical entity that are stored as part of the sensed feature data 421 within the entity data structure 410 A. For instance, there may be a single sensed feature, or innumerable sensed features, or any number in-between for any given physical entity as detected at any particular time.
  • the sensed feature may be associated with other features. For instance, if the physical entity is a person, the feature might be a name of the person.
  • That specifically identified person might have known characteristics based on features not represented within the entity data structure. For instance, the person might have a certain rank or position within an organization, have certain training, be a certain height, and so forth.
  • the entity data structure may be extended by, when a particular feature is sensed (e.g., a name), pointing to additional features of that physical entity (e.g., rank, position, training, height) so as to even further extend the richness of querying and/or other computation on the data structure.
  • the sensed feature data may also have confidence levels associated with each sensed feature that represent an estimated probability that the physical entity really has the sensed feature at the particular time 411 .
  • confidence level 421 a is associated with sensed feature 421 A and represents a confidence that the physical entity 211 really has the sensed feature 421 A.
  • confidence level 421 b is associated with sensed feature 421 B and represents a confidence that the physical entity 211 really has the sensed feature 421 B.
  • the ellipses 421 c again represents that there may be confidence levels expressed for any number of physical features. Furthermore, there may be some physical features for which there is no confidence level expressed (e.g., in the case where there is certainty or in case where it is not important or desirable to measure confidence of a sensed physical feature).
  • the sensed feature data may also have computing-association (e.g., a pointer) to the sensor signal(s) that were used by the recognition component to sense the sensed feature at that confidence level.
  • sensor signal(s) 421 Aa is computing-associated with sensed feature 421 A and represents the sensor signal(s) that were used to sense the sensed feature 421 A at the time 411 .
  • sensor signal(s) 421 Bb is computing-associated with sensed feature 421 B and represents the sensor signal(s) that were used to sense the sensed feature 421 B at the time 411 .
  • the ellipses 421 Cc again represents that there may be computing-associations of any number of physical features.
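  • The nesting described above (space-time data structure 401 , entity data structures such as 410 A, times such as 411 , and sensed features such as 421 A with confidence 421 a and signal-segment association 421 Aa) might be sketched as follows; this is a hypothetical illustration of the shape of the data, not the patented structure itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class SensedFeatureRecord:
    """e.g., feature 421A with confidence 421a and signal-segment pointer 421Aa."""
    name: str
    value: object
    confidence: Optional[float] = None        # may be omitted when certain/unimportant
    signal_segment_ref: Optional[str] = None  # computing-association (e.g., a pointer/key)

@dataclass
class Detection:
    """Everything sensed about one entity at one (approximate) time, e.g., time 411."""
    time: float
    features: List[SensedFeatureRecord] = field(default_factory=list)

@dataclass
class EntityDataStructure:
    """Corresponds to one physical entity, e.g., 410A for physical entity 211."""
    entity_id: str
    detections: List[Detection] = field(default_factory=list)

@dataclass
class SpaceTimeDataStructure:
    """Root-like node (401) that computing-associates the entity data structures."""
    space_id: str
    entities: Dict[str, EntityDataStructure] = field(default_factory=dict)

    def record(self, entity_id: str, time: float,
               features: List[SensedFeatureRecord]) -> None:
        entity = self.entities.setdefault(entity_id, EntityDataStructure(entity_id))
        entity.detections.append(Detection(time, features))

graph = SpaceTimeDataStructure("room-201")
graph.record("entity-211", 1.0, [
    SensedFeatureRecord("is_human", True, 0.9, signal_segment_ref="camera-221/seg-0001"),
    SensedFeatureRecord("identity", "John Doe", 0.3, signal_segment_ref="camera-222/seg-0002"),
])
```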
  • the security component 231 of the recognition component 230 may also exercise security in deciding whether or not to record sensor signal(s) that were used to sense particular features at particular times.
  • the security component 231 may exercise security in 1) determining whether to record that particular features were sensed, 2) determining whether to record features associated with particular physical entities, 3) determining whether to record features sensed at particular times, 4) determining whether to record the sensor signal(s), and if so which signals, to record as evidence of a sensed feature, and so forth.
  • as an example, suppose the location being tracked is a room, and an image sensor (e.g., a camera) senses something within the room.
  • An example sensed feature is that the “thing” is a human being.
  • Another example sensed feature is that the “thing” is a particular named person.
  • the sensed feature set includes one feature that is a more specific type of another feature.
  • the image data from the camera may be pointed to by the record of the sensed feature of the particular physical entity at the particular time.
  • Another example feature is that the physical entity simply exists within the location, or at a particular position within the location. Another example is that this is the first appearance of the physical entity since a particular time (e.g., in recent times, or even ever). Another example of features is that the item is inanimate (e.g., with 99 percent certainty), a tool (e.g., with 80 percent certainty), and a hammer (e.g., with 60 percent certainty). Another example feature is that the physical entity is no longer present (e.g., is absent) from the location, or has a particular pose, is oriented in a certain way, or has a positional relationship with another physical entity within the location (e.g., “on the table” or “sitting in chair # 5 ”).
  • the number and types of features that can be sensed from the number and types of physical entities within any location is innumerable.
  • the acts within box 310 B may potentially be performed multiple times for any given physical entity.
  • physical entity 211 may be again detected by one or both of sensors 221 and 222 .
  • this detection results in the time of the next detection (or its approximation) being represented within the entity data structure 410 .
  • time 412 is also represented within the entity data structure.
  • sensed features 422 (e.g., perhaps including sensed features 422 A and 422 B, with ellipses 422 C again representing flexibility) may likewise be sensed and recorded for this next detection.
  • those sensed features may also have associated confidence levels (e.g., 422 a , 422 b , ellipses 422 c ).
  • those sensed features may also have associated sensor signals (e.g., 422 Aa, 422 Bb, ellipses 422 Cc).
  • the sensed features sensed at the second time may be the same as or different than the sensed features sensed at the first time.
  • the confidence levels may change over time. As an example, suppose a human being is detected at time # 1 at one side of a large room via an image with 90 percent confidence, and that the human being is specifically sensed as being John Doe with 30 percent confidence. Now, at time # 2 that is 0.1 seconds later, John Doe is sensed 50 feet away at another part of the room with 100 percent confidence, and there remains a human being at the same location where John Doe was speculated to be at time 1 .
  • the ellipses 413 and 423 represent that there is no limit to the number of times that a physical entity may be detected within the location. As subsequent detections are made, more may be learned about the physical entity, and thus sensed features may be added (or removed) as appropriate, with corresponding adjustments to confidence levels for each sensed feature.
  • feature changes in the particular entity may be sensed (act 322 ) based on comparison (act 321 ) of the sensed feature(s) of the particular physical entity at different times.
  • This sensing of changes may be performed by the recognition component 230 or the computation component 250 .
  • those sensed changes may also be recorded (act 323 ).
  • the sensed changes may be recorded in the entity data structure 410 A in a manner that is, or perhaps is not, computing-associated with a particular time.
  • Sensor signals evidencing the feature change may be reconstructed using the sensor signals that evidenced the sensed feature at each time.
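  • A minimal sketch of the comparison (act 321 ) that yields sensed changes (act 322 ); the function name and dictionary representation are assumptions for illustration.

```python
from typing import Dict, Tuple

def diff_features(earlier: Dict[str, object],
                  later: Dict[str, object]) -> Dict[str, Tuple[object, object]]:
    """Hypothetical comparison producing sensed changes: maps each changed feature
    name to its (old, new) values; appearing or disappearing features are reported
    with None on the missing side."""
    changes = {}
    for name in set(earlier) | set(later):
        old, new = earlier.get(name), later.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes

t1 = {"position": "door", "holding": None}
t2 = {"position": "table", "holding": "hammer"}
print(diff_features(t1, t2))  # {'position': ('door', 'table'), 'holding': (None, 'hammer')}
```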
  • this tracking of feature(s) of physical entities may be performed for multiple entities over time.
  • the content of box 310 A may be performed for each of physical entities 211 , 212 , 213 or 214 within the physical space 201 or for other physical entities that enter or exit the physical space 201 .
  • the space-time data structure 401 also is computing-associated (as represented by lines 430 B, 430 C, and 430 D) with a second entity data structure 410 B (perhaps associated with the second physical entity 212 of FIG. 2 ), a third entity data structure 410 C (perhaps associated with the third physical entity 213 of FIG. 2 ); and a fourth entity data structure 410 D (perhaps associated with the fourth physical entity 214 of FIG. 2 ).
  • the space-time data structure 401 may also include one or more triggers that define conditions and actions. When the conditions are met, corresponding actions are to occur.
  • the triggers may be stored at any location in the space-time data structure. For instance, if the conditions and/or actions are with respect to a particular entity data structure, the trigger may be stored in the corresponding entity data structure. If the conditions and/or actions are with respect to a particular feature of a particular entity data structure, the trigger may be stored in the corresponding feature data structure.
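  • For illustration, a trigger might be sketched as a condition/action pair evaluated as new sensed features arrive; the Trigger class below is hypothetical, not the claimed mechanism.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Trigger:
    """Hypothetical trigger: a condition over sensed features plus an action to
    run when the condition becomes true."""
    condition: Callable[[Dict[str, object]], bool]
    action: Callable[[Dict[str, object]], None]

def evaluate_triggers(triggers: List[Trigger], latest_features: Dict[str, object]) -> None:
    for trigger in triggers:
        if trigger.condition(latest_features):
            trigger.action(latest_features)

triggers = [
    Trigger(
        condition=lambda f: f.get("position") == "restricted-area",
        action=lambda f: print("alert: entity entered restricted area"),
    )
]
evaluate_triggers(triggers, {"position": "restricted-area"})
```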
  • the ellipses 410 E represent that the number of entity data structures may change. For instance, if tracking data is kept forever with respect to physical entities that are ever within the physical space, then additional entity data structures may be added each time a new physical entity is detected within the location, and any given entity data structure may be augmented each time a physical entity is detected within the physical space. Recall, however, that garbage collection may be performed (e.g., by clean-up component 260 ) to keep the entity tracking data structure 400 from growing too large to be properly edited, stored and/or navigated.
  • physical relationships between different physical entities may be sensed (act 332 ) based on comparison of the associated entity data structures (act 331 ). Those physical relationships may likewise be recorded in the entity tracking data structure 400 (act 333 ), perhaps within the associated entity data structures that have the sensed physical relationships, and/or perhaps associated with the time that the physical entities are sensed as having the relationship. For instance, by analysis of the entity data structures for different physical entities through time, it might be determined that, at a particular time, a physical entity may be hidden behind another physical entity, or that a physical entity may be obscuring the sensing of another physical entity, or that two physical entities have been joined, or that a physical entity has been detached to create multiple physical entities. Sensor signals evidencing the physical entity relationship may be reconstructed using the sensor signals that evidenced the sensed feature at the appropriate time and for each physical entity.
  • the sensed feature store 240 may now be used as a powerful store upon which to compute complex functions and queries over representations of physical entities over time within a physical space. Such computation and querying may be performed by the computation component 250 .
  • This enables innumerable helpful embodiments, and in fact introduces an entirely new form of computing referred to herein as “ambient computing”.
  • With ambient computing, within the physical space that has sensors, it is as though the very air itself can be used to compute and sense state about the physical world. It is as though a crystal ball has now been created for that physical space, from which it is possible to query and/or compute many things about that location and its history.
  • a user may now query whether an object is right now in a physical space, or where an object was at a particular time within the physical space.
  • the user might also query which person having particular features (e.g., rank or position within a company) is near that object right now, and communicate with that person to bring the object to the user.
  • the user might query as to relationships between physical entities. For instance, the user might query who has possession of an object.
  • the user might query as to the state of an object, whether it is hidden, and what other object is obscuring view of the object.
  • the user might query when a physical entity first appeared within the physical space, when they exited, and so forth.
  • the user might also query when the lights were turned off, when the system became certain of one or more features of a physical entity.
  • the user might also search on feature(s) of an object.
  • the user might also query on activities that have occurred within the location.
  • a user might compute the mean time that a physical entity of a particular type is within the location, anticipate where a physical entity will be at some future time, and so forth. Accordingly, rich computing and querying may be performed on a physical space that has sensors.
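  • As an illustration of such a computation, the following hypothetical query computes the mean dwell time of entities of a particular sensed type within the location; the data layout and function name are assumptions, not part of the described system.

```python
from statistics import mean
from typing import Dict, List, Tuple

# detections: entity_id -> list of (time_s, sensed features) tuples recorded over time
Detections = Dict[str, List[Tuple[float, Dict[str, object]]]]

def mean_dwell_time(detections: Detections, entity_type: str) -> float:
    """Hypothetical ambient-computing query: mean time (last seen - first seen)
    that entities of a given sensed type spent within the location."""
    dwell = []
    for records in detections.values():
        typed = [t for t, feats in records if feats.get("type") == entity_type]
        if typed:
            dwell.append(max(typed) - min(typed))
    return mean(dwell) if dwell else 0.0

data: Detections = {
    "entity-1": [(0, {"type": "person"}), (600, {"type": "person"})],
    "entity-2": [(100, {"type": "person"}), (400, {"type": "person"})],
    "entity-3": [(50, {"type": "forklift"}), (5000, {"type": "forklift"})],
}
print(mean_dwell_time(data, "person"))  # 450.0 seconds
```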
  • FIG. 5 illustrates a flowchart of a method 500 for efficiently rendering signal segments of interest.
  • the computing system navigates the navigable graph of sensed features to reach a particular sensed feature (act 501 ). For instance, this navigation may be performed automatically or in response to user input.
  • the navigation may be the result of a calculation, or may simply involve identifying the sensed feature of interest.
  • the navigation may be the result of a user query.
  • a calculation or query may result in multiple sensed features being navigated to. As an example, suppose that the computing system navigates to sensed feature 222 A in FIG. 2 .
  • the computing system then navigates to the sensed signal computer-associated with the particular sensed feature (act 502 ) using the computer-association between the particular sensed feature and the associated sensor signal. For instance, in FIG. 2 , with the sensed feature being sensed feature 222 A, the computer-association is used to navigate to the signal segment 222 Aa.
  • the signal segment may then be rendered (act 503 ) on an appropriate output device.
  • the appropriate output device might be one or more of output mechanisms 112 A.
  • audio signals may be rendered using speakers, and visual data may be rendered using a display.
  • upon navigating to the sensed signal(s), multiple things could happen. The user might play a particular signal segment, or perhaps choose from multiple signal segments that contributed to the feature. A view could be synthesized from the multiple signal segments.
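  • A minimal sketch of method 500 under the assumption that sensed features and signal segments are keyed stores with string associations (the keys, store names, and layout are hypothetical):

```python
from typing import Dict, Optional

# Hypothetical stores keyed the way the description implies: each sensed feature
# carries a computer-association (here, a string key) to its signal segment.
sensed_features: Dict[str, Dict[str, object]] = {
    "entity-211/t-411/is_human": {"value": True, "segment_ref": "camera-221/seg-0001"},
}
signal_segments: Dict[str, bytes] = {
    "camera-221/seg-0001": b"<video bytes>",
}

def render_segment_for_feature(feature_key: str) -> Optional[bytes]:
    """Sketch of method 500: navigate to the sensed feature (act 501), follow the
    association to its signal segment (act 502), then hand it to an output device
    (act 503) -- here we simply return the bytes a renderer would play."""
    feature = sensed_features.get(feature_key)
    if feature is None:
        return None
    return signal_segments.get(feature["segment_ref"])

print(render_segment_for_feature("entity-211/t-411/is_human"))
```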
  • FIG. 6 illustrates a flowchart of a method 600 for controlling creation of or access to information sensed by one or more sensors in a physical space.
  • the method includes creating (act 601 ) a computer-navigable graph of features of sensed physical entities sensed in a physical space over time.
  • the principles described herein are not limited to the precise structure of such a computer-navigable graph. An example structure and its creation have been described with respect to FIGS. 2 through 4 .
  • the method 600 also includes restricting creation of or access to nodes of the computer-navigable graph based on one or more criteria (act 602 ).
  • security is imposed upon the computer-navigable graph.
  • the arrows 603 and 604 represent that the process of creating the graph and restricting creation of or access to its nodes may be a continual process.
  • the graph may continuously have nodes added to (and perhaps removed from) it.
  • restrictions of creation may be considered whenever there is a possibility of creation of a node.
  • Restrictions of access may be decided when a node of the graph is created, or at any point thereafter. Examples of restrictions might include, for instance, a prospective identity of a sensed physical entity, a sensed feature of a sensed physical entity, and so forth.
  • there may be access criteria for each node.
  • Such access criteria may be explicit or implicit. That is, if there are no explicit access criteria for the node that is to be accessed, then perhaps a default set of access criteria may apply.
  • the access criteria for any given node may be organized in any manner. For instance, in one embodiment, the access criteria for a node may be stored with the node in the computer-navigable graph.
  • the access restrictions might also include restrictions based on the type of access requested. For instance, a computational access means that the node is not directly accessed, but is used in a computation. Direct access to read the content of a node may be restricted, whilst computational access that does not report the exact contents of the node may be allowed.
  • Access restrictions may also be based on the type of node accessed. For instance, there may be a restriction in access to the particular entity data structure node of the computer-navigable graph. For instance, if that particular entity data structure node represents detections of a particular person in the physical space, access might be denied. There may also be restrictions in access to particular signal segment nodes of the computer-navigable graph. As an example, perhaps one may be able to determine that a person was in a location at a given time, but not be able to review video recordings of that person at that location. Access restrictions may also be based on who is the requestor of access.
  • in determining whether to restrict creation of a particular sensed feature node of the computer-navigable graph, a variety of criteria may be considered. For instance, there may be a restriction in creation of a particular signal segment node of a computer-navigable graph.
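  • By way of example only, the following minimal Python sketch (with hypothetical names such as AccessCriteria, GraphNode, and may_access) illustrates how per-node access criteria, a default policy, and restrictions based on requestor and type of access might be evaluated:

```python
# Illustrative sketch (hypothetical names): per-node access criteria with a
# default policy, plus restriction by requestor and by type of access.
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class AccessCriteria:
    allowed_requestors: Optional[Set[str]] = None   # None means "anyone"
    allow_direct_read: bool = True                   # read exact node contents
    allow_computational: bool = True                 # use node in a computation


DEFAULT_CRITERIA = AccessCriteria()   # applies when a node has no explicit criteria


@dataclass
class GraphNode:
    node_type: str                                   # e.g., "entity", "signal-segment"
    criteria: Optional[AccessCriteria] = None


def may_access(node: GraphNode, requestor: str, access_type: str) -> bool:
    criteria = node.criteria or DEFAULT_CRITERIA
    if criteria.allowed_requestors is not None and requestor not in criteria.allowed_requestors:
        return False
    if access_type == "direct":
        return criteria.allow_direct_read
    if access_type == "computational":
        return criteria.allow_computational
    return False


# Example: a signal segment node whose contents may be used in computations
# (e.g., "was a person present?") but not directly reviewed.
segment_node = GraphNode(
    node_type="signal-segment",
    criteria=AccessCriteria(allow_direct_read=False, allow_computational=True),
)
assert may_access(segment_node, "alice", "computational")
assert not may_access(segment_node, "alice", "direct")
```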
  • FIG. 7 illustrates a recurring flow 700 showing that in addition to creating a computer-navigable graph of sensed features in the physical space (act 701 ), there may also be pruning of the computer-navigable graph (act 702 ). These acts may even occur simultaneously and continuously (as represented by the arrows 703 and 704 ) to thereby keep the computer-navigable graph of sensed features at a manageable size. There has been significant description herein about how the computer-navigable graph may be created (represented as act 701 ).
  • any node of the computer-navigable graph may be subject to removal.
  • sensed features of a physical entity data structure may be removed for a specific time or group of times.
  • a sensed feature of a physical entity data structure may also be removed for all times.
  • More than one sensed feature of a physical entity data structure may be removed for any given time, or for any group of times.
  • a physical entity data structure may be entirely removed in some cases.
  • the removal of a node may occur, for instance, when the physical graph represents something that is impossible given the laws of physics. For instance, a given object cannot be at two places at the same time, nor can that object travel significant distances in a short amount of time in an environment in which such travel is infeasible or impossible. Accordingly, if a physical entity is tracked with absolute certainty at one location, any physical entity data structure that represents with lesser confidence that the same physical entity is at an inconsistent location may be deleted.
  • the removal of a node may also occur when more confidence is obtained regarding a sensed feature of a physical entity. For instance, if a sensed feature of a physical entity within a location is determined with 100 percent certainty, then the certainty levels of that sensed feature of that physical entity may be updated to read 100 percent for all prior times also. Furthermore, if a sensed feature has been learned to not be applicable to a physical entity (i.e., its confidence level has been reduced to zero or to a negligible value), that sensed feature may be removed for that physical entity.
  • some information in the computer-navigable graph may simply be too stale to be useful. For instance, if a physical entity has not been observed in the physical space for a substantial period of time, such that the prior recognition of the physical entity is no longer relevant, then the entire physical entity data structure may be removed. Furthermore, detections of a physical entity that have become stale may be removed even though the physical entity data structure remains to reflect more recent detections.
  • cleansing (or pruning) of the computer-navigable graph may be performed via intrinsic analysis and/or via extrinsic information. This pruning intrinsically improves the quality of the information represented in the computer-navigable graph, by removing information of lesser quality, and freeing up space for more relevant information to be stored.
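  • By way of example only, the following minimal Python sketch (with hypothetical names such as Detection and prune, and an assumed staleness threshold) illustrates how detections that are stale, or that are physically inconsistent with a more certain detection, might be pruned:

```python
# Illustrative sketch (hypothetical names): pruning detections that are
# physically inconsistent with a certain detection, or that are too stale.
from dataclasses import dataclass
from typing import List

STALENESS_LIMIT_SECONDS = 90 * 24 * 3600   # assumed threshold for this sketch


@dataclass
class Detection:
    entity_id: str
    location: str
    time: float          # seconds since some epoch
    confidence: float


def prune(detections: List[Detection], now: float) -> List[Detection]:
    kept: List[Detection] = []
    for d in detections:
        # Drop stale detections.
        if now - d.time > STALENESS_LIMIT_SECONDS:
            continue
        # Drop lower-confidence detections that contradict a certain detection
        # of the same entity at the same time in a different location.
        contradicted = any(
            other.entity_id == d.entity_id
            and other.time == d.time
            and other.location != d.location
            and other.confidence >= 1.0 > d.confidence
            for other in detections
        )
        if contradicted:
            continue
        kept.append(d)
    return kept


now = 1_000_000.0
history = [
    Detection("entity-211", "lobby", now - 10.0, 1.0),
    Detection("entity-211", "roof", now - 10.0, 0.4),   # contradicted, pruned
]
print(len(prune(history, now)))   # 1
```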
  • the principles described herein allow for a computer-navigable graph of the physical world.
  • the graph may be searchable and queryable, thereby allowing searching, querying, and other computations to be performed on the real world.
  • Security may further be imposed in such an environment.
  • the graph may be kept to a manageable size through cleansing and pruning.
  • the above-described computer-navigable graph of physical space enables a wide variety of applications and technical achievements.
  • three such achievements will now be described. Each is based on a physical graph that has signal segments that evidence the state of physical entities, and thus the physical graph provides a semantic understanding of what is happening within any given signal segment.
  • portions of signal segments may be shared, with semantic understanding taking part on which portions of the signal segment are extracted for sharing.
  • signal segments may automatically be narrated using semantic understanding of what is happening within a signal segment.
  • actors may be trained in a just-in-time fashion by providing representations of signal segments when the actor is to begin or has begun an activity.
  • FIG. 8 illustrates a flowchart of a method 800 for sharing at least a portion of a signal segment.
  • the signal segment might be, for instance, multiple signal segments that have captured the same physical entity. For instance, if the signal segment is a video signal segment, multiple video segments may have captured the same physical entity or entities from different perspectives and distances. If the signal segment is an audio signal segment, multiple audio segments may have captured the selected physical entity or entities with different acoustic channels intervening between the corresponding acoustic sensors and the selected physical entity or entities (or portions thereof).
  • the signal segment(s) being shared may be a live signal segment that is capturing signals live from one or more physical entities within a location. Alternatively, the signal segment(s) being shared may be a recorded signal segment.
  • the system detects selection of one or more physical entities or portions thereof that is/are rendered within one or more signal segments (act 801 ).
  • sharing may be initiated based on the semantic content of a signal segment.
  • the selected physical entity or entities (or portion(s) thereof) may be the target of work or a source of work.
  • the user might select a target of work such as a physical whiteboard.
  • Another example target of work might be a piece of equipment that is being repaired.
  • sources of work might be, for instance, a person writing on a physical whiteboard, a dancer, a magician, a construction worker, and so forth.
  • the individual that selected the physical entity or entities (or portions thereof) for sharing may be a human being.
  • the user might select the physical entity or entities (or portions thereof) in any manner intuitive to a human user. Examples of such input include gestures. For instance, the user might circle an area that encompasses the physical entity or entities (or portions thereof) within a portion of a video or image signal segment.
  • the selection may be made by a system.
  • the system might select that the portion of signal segments that includes a particular physical entity or entities (or portions thereof) be shared upon detection of a particular condition, and/or in accordance with policy. For instance, as described below with respect to FIG. 10 , the system might detect that a human actor is about to engage in a particular activity that requires training. The system might then select the physical entity or entities that are similar to a target of the activity, or that include an individual as that individual previously performed the activity, to share with the human actor. A narration of the activity may even be automatically generated and provided (as described with respect to FIG. 9 ).
  • the system then extracts portion(s) of the signal segment(s) in which the selected physical entity or selected portion of the physical entity is rendered (act 802 ).
  • the signal segment might be multiple video signal segments.
  • the system might create a signal segment in which the point of view changes from one signal segment (generated by one sensor) to another signal segment (generated by another sensor) upon the occurrence of condition(s) with respect to the selected physical entity or entities (or the selected portion thereof). For instance, suppose the selected physical entity is those portions of the whiteboard that an instructor is currently writing on. If the instructor's body were to obscure his own writing from the perspective of one sensor, the system may automatically switch to another signal segment that captures the active portion of the whiteboard. The system may perform such switching (of live signal segments) or stitching (of recorded video segments) automatically.
  • the system then dispatches a representation of the signal segment(s) that encompasses the selected physical entity or entities (or portions thereof) to one or more recipients (act 803 ).
  • Such recipients may be human beings, components, robotics, or any other entity capable of using the shared signal segment portion(s).
  • the signal segment(s) represent a portion of a physical graph that includes representations of physical entities sensed within a physical space, along with signal segments that evidence the state of those physical entities.
  • An example of such a physical graph has been described above with respect to FIGS. 2 through 4 with respect to the computer-navigable graph 400 .
  • the system could also dispatch a portion of the physical graph that relates to the signal segment portion(s) that are shared, and/or perhaps may extract information from that corresponding portion of the physical graph to share along with (or as an alternative to) the sharing of the signal segment portion(s) themselves.
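  • By way of example only, the following minimal Python sketch (with hypothetical names such as VideoSegment, extract_views, and dispatch) illustrates how the portions of multiple signal segments showing a selected physical entity might be extracted, switching between sensors when the entity is obscured in one view, and then dispatched to recipients (acts 802 and 803 ):

```python
# Illustrative sketch (hypothetical names): extracting the portions of one or
# more signal segments in which a selected physical entity appears, then
# dispatching a representation to recipients.
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Frame:
    time: float
    visible_entities: List[str]     # entity ids recognized in this frame


@dataclass
class VideoSegment:
    sensor_id: str
    frames: List[Frame]


def extract_views(segments: List[VideoSegment], entity_id: str) -> List[Tuple[float, str]]:
    """Act 802: for each time, pick one sensor whose view shows the entity.

    Returns (time, sensor_id) pairs; switching sensors when the entity is
    obscured in one view approximates the stitched/switched result described above.
    """
    views: Dict[float, str] = {}
    for segment in segments:
        for frame in segment.frames:
            if entity_id in frame.visible_entities and frame.time not in views:
                views[frame.time] = segment.sensor_id
    return sorted(views.items())


def dispatch(representation, recipients: List[str]) -> None:
    """Act 803: send the extracted representation to each recipient."""
    for recipient in recipients:
        print(f"sharing {len(representation)} view(s) with {recipient}")


segments = [
    VideoSegment("cam-front", [Frame(0.0, ["whiteboard"]), Frame(1.0, [])]),
    VideoSegment("cam-side", [Frame(0.0, ["whiteboard"]), Frame(1.0, ["whiteboard"])]),
]
dispatch(extract_views(segments, "whiteboard"), recipients=["trainee@example.org"])
```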
  • FIG. 9 illustrates a flowchart of a method 900 for automatically generating a narration of what is happening in a signal segment.
  • automatic narration of a chess game, a football game, or the like may be performed.
  • Automatic narration of a work activity may also be performed.
  • the signal segment is accessed from the physical graph (act 901 ).
  • the system determines how physical entity or entities are acting in the signal segment (act 902 ). Again, such a semantic understanding may be gained from the physical graph portions that correspond to the signal segment.
  • the system might determine what actions are salient, taking into consideration any number of factors (or a balance thereof) including, for instance, whether an action is happening repeatedly, what has not changed, what is occurring constantly, what portion(s) of the signal segment have been shared, user instruction, and so forth. The system might also be trained, via machine learning, to recognize which actions are potentially relevant.
  • the system then automatically generates a narration of the actions in the signal segment using the determined actions of the one or more physical entities (act 903 ).
  • the generated narration might be one or more of an audio narration, a diagrammatic narration, or the like.
  • in an audio narration, what is happening in the video signal segment might be spoken.
  • in a diagrammatic narration, a diagram of what is happening in the video signal segment may be rendered, with irrelevant material removed and with relevant physical entities visually emphasized.
  • the relevant physical entities might be represented in simplified (perhaps cartoonish) form, with movement and actions potentially visually represented (e.g., by arrows).
  • the generated narration might include a summarized audio narration that includes a simplified audio of the relevant matters that are heard within the signal segment.
  • the generated narration might also be a diagrammatic narration.
  • the narration might also show or describe to an intended recipient what to do with respect to the intended recipient's environment, since the physical graph might also have semantic understanding of the surroundings of the intended recipient.
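  • By way of example only, the following minimal Python sketch (with hypothetical names such as ActionRecord, salience, and narrate, and a deliberately naive salience score) illustrates how a simple textual narration might be built from the determined actions of physical entities (act 903 ):

```python
# Illustrative sketch (hypothetical names): generating a simple textual
# narration from the semantic action records associated with a signal segment.
from dataclasses import dataclass
from typing import List


@dataclass
class ActionRecord:
    actor: str
    action: str
    repeated: bool = False
    shared_before: bool = False


def salience(record: ActionRecord) -> int:
    score = 1
    if record.repeated:
        score += 1          # repetition suggests relevance
    if record.shared_before:
        score += 2          # previously shared portions are likely interesting
    return score


def narrate(actions: List[ActionRecord], top_n: int = 3) -> str:
    """Act 903: build a narration from the most salient determined actions."""
    ranked = sorted(actions, key=salience, reverse=True)[:top_n]
    return " ".join(f"{a.actor} {a.action}." for a in ranked)


actions = [
    ActionRecord("The instructor", "writes a formula on the whiteboard", repeated=True),
    ActionRecord("A student", "raises a hand", shared_before=True),
    ActionRecord("The instructor", "erases a corner of the board"),
]
print(narrate(actions))
```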
  • FIG. 10 illustrates a flowchart of a method 1000 for automatically training an actor upon the occurrence of a condition with respect to that actor.
  • the method 1000 is initiated upon detecting that a condition is met with respect to the actor, such as that a human actor is engaging in or is about to engage in a physical activity (act 1001 ).
  • the condition might be that the actor is performing or is about to perform a physical activity, or that the actor has a certain physical status (e.g., presence within a room).
  • the system evaluates whether or not training is to be provided to the human actor for that physical activity (act 1002 ). For instance, the determination might be based on some training policy.
  • training might be mandatory at least once for all employees of an organization with respect to that activity, and might be mandated to occur before the activity is performed. Training might be required on an annual basis. Training might be required depending on who the actor is (e.g., a new hire, or a safety officer).
  • Training might be offered when the system determines that the human actor is engaging in an activity improperly (e.g., failing to bend her knees when lifting a heavy object).
  • the training may be tailored to how the human actor is performing the activity improperly.
  • the training may be offered repeatedly depending on a learning rate of the human actor.
  • Training is then dispatched to the actor (act 1003 ).
  • a robot or human may be dispatched to the actor to show the actor how to perform an activity.
  • a representation of a signal segment may then be dispatched to the actor (act 1003 ), where the representation provides training to the human actor.
  • the representation might be a narration (such as that automatically generated by the method 900 of FIG. 9 ).
  • the shared representation might be a multi-sensor signal segment—such as a stitched video signal in which the point of view strategically changes upon the occurrence of one or more conditions with respect to a selected physical entity or selected portion of the physical entity, as described above with respect to FIG. 9 .
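  • By way of example only, the following minimal Python sketch (with hypothetical names such as TrainingPolicy, Actor, and training_needed, and assumed policy fields) illustrates how a system might decide whether to dispatch just-in-time training and then send a representation of a signal segment (acts 1002 and 1003 ):

```python
# Illustrative sketch (hypothetical names): deciding whether to dispatch
# just-in-time training when an actor is about to begin an activity.
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrainingPolicy:
    mandatory_once: bool = True
    annual: bool = False
    required_roles: Optional[set] = None     # e.g., {"new hire", "safety officer"}


@dataclass
class Actor:
    name: str
    role: str
    times_trained: int = 0
    days_since_training: Optional[int] = None


def training_needed(actor: Actor, policy: TrainingPolicy) -> bool:
    """Act 1002: evaluate whether training should be provided."""
    if policy.required_roles and actor.role in policy.required_roles:
        return True
    if policy.mandatory_once and actor.times_trained == 0:
        return True
    if policy.annual and (actor.days_since_training or 0) > 365:
        return True
    return False


def dispatch_training(actor: Actor, narration: str) -> None:
    """Act 1003: send a representation of a signal segment (here, a narration)."""
    print(f"Sending training to {actor.name}: {narration}")


actor = Actor(name="Pat", role="new hire")
if training_needed(actor, TrainingPolicy(required_roles={"new hire"})):
    dispatch_training(actor, "Bend your knees when lifting the crate.")
```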

Abstract

Automatic generation of a narration of what is happening in a signal segment (live or recorded). The signal segment that is to be narrated is accessed from a physical graph. In the physical graph, the signal segment evidences state of physical entities, and thus has a semantic understanding of what is depicted in the signal segment. The system then automatically determines how the physical entities are acting within the signal segment based on that semantic understanding, and builds a narration of the activities based on the determined actions. The system may determine what is interesting for narration based on a wide variety of criteria. The system could use machine learning to determine what will be interesting to narrate.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of U.S. application Ser. No. 15/436,655, filed Feb. 17, 2017, titled “AUTOMATIC NARRATION OF SIGNAL SEGMENT,” which will issue on Jun. 9, 2020, as U.S. Pat. No. 10,679,669, which claims the benefit of United States Provisional Application 62/447,821 filed Jan. 18, 2017, titled “AUTOMATIC NARRATION OF SIGNAL SEGMENT”, the entirety of each of which are expressly incorporated herein by this reference.
  • BACKGROUND
  • Computing systems and associated networks have greatly revolutionized our world. At first, computing systems were only able to perform simple tasks. However, as processing power has increased and become increasingly available, the complexity of tasks performed by a computing system has greatly increased. Likewise, the hardware complexity and capability of computing systems has greatly increased, as exemplified with cloud computing that is supported by large data centers.
  • For a long period of time, computing systems just did essentially what they were told by their instructions or software. However, software and the employment of hardware is becoming so advanced that computing systems are now, more than ever before, capable of some level of decision making at higher levels. At present, in some respects, the level of decision making can approach, rival, or even exceed the capability of the human brain to make decisions. In other words, computing systems are now capable of employing some level of artificial intelligence.
  • One example of artificial intelligence is the recognition of external stimuli from the physical world. For instance, voice recognition technology has improved greatly, allowing for a high degree of accuracy in detecting the words that are being spoken, and even the identity of the person that is speaking. Likewise, computer vision allows computing systems to automatically identify objects within a particular picture or frame of video, or recognize human activity across a series of video frames. As an example, face recognition technology allows computing systems to recognize faces, and activity recognition technology allows computing systems to know whether two proximate people are working together.
  • Each of these technologies may employ deep learning (Deep Neural Network-based and reinforcement-based learning mechanisms) and machine learning algorithms to learn from experience what is making a sound, and objects or people that are within an image, thereby improving accuracy of recognition over time. In the area of recognizing objects within a more complex imaged scene with large numbers of visual distractions, advanced computer vision technology now exceeds the capability of a human being to quickly and accurately recognize objects of interest within that scene. Hardware, such as matrix transformation hardware in conventional graphical processing units (GPUs), may also contribute to the rapid speed in object recognition in the context of deep neural networks.
  • The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one exemplary technology area where some embodiments described herein may be practiced.
  • BRIEF SUMMARY
  • At least some embodiments described herein relate to automatic generation of a narration of what is happening in a signal segment (live or recorded). The signal segment that is to be narrated is accessed from a physical graph. In the physical graph, the signal segment evidences state of one or more physical entities, and thus has a semantic understanding of what is depicted in the signal segment. The system then automatically determines how one or more physical entities are acting within the signal segment based on that semantic understanding, and builds a narration of the activities based on the determined actions. The system may determine what is interesting for narration based on a wide variety of criteria, such as whether an action is happening repeatedly, what has not changed in the signal segment, what is occurring constantly in the signal, what portion(s) of the signal segment have been shared in the past (or currently), user instructions, and so forth. The system could use machine learning to determine what will be interesting to narrate.
  • This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • In order to describe the manner in which the above-recited and other advantages and features of the invention can be obtained, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered to be limiting of its scope, the invention will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:
  • FIG. 1 illustrates an example computer system in which the principles described herein may be employed;
  • FIG. 2 illustrates an environment in which the principles described herein may operate, which includes a physical space that includes multiple physical entities and multiple sensors, a recognition component that senses features of physical entities within the physical space, and a feature store that stores sensed features of such physical entities, such that computation and querying may be performed against those features;
  • FIG. 3 illustrates a flowchart of a method for tracking physical entities within a location and may be performed in the environment of FIG. 2;
  • FIG. 4 illustrates an entity tracking data structure that may be used to assist in performing the method of FIG. 3, and which may be used to later perform queries on the tracked physical entities; FIG. 5 illustrates a flowchart of a method for efficiently rendering signal segments of interest;
  • FIG. 6 illustrates a flowchart of a method for controlling creation of or access to information sensed by one or more sensors in a physical space;
  • FIG. 7 illustrates a recurring flow showing that in addition to creating a computer-navigable graph of sensed features in the physical space, there may also be pruning of the computer-navigable graph to thereby keep the computer-navigable graph of the real world at a manageable size;
  • FIG. 8 illustrates a flowchart of a method for sharing at least a portion of a signal segment;
  • FIG. 9 illustrates a flowchart of a method for automatically generating a narration of what is happening in a signal segment; and
  • FIG. 10 illustrates a flowchart of a method for automatically training an actor that is performing or is about to perform an activity.
  • DETAILED DESCRIPTION
  • At least some embodiments described herein relate to automatic generation of a narration of what is happening in a signal segment (live or recorded). The signal segment that is to be narrated is accessed from a physical graph. In the physical graph, the signal segment evidences state of one or more physical entities, and thus has a semantic understanding of what is depicted in the signal segment. The system then automatically determines how one or more physical entities are acting within the signal segment based on that semantic understanding, and builds a narration of the activities based on the determined actions. The system may determine what is interesting for narration based on a wide variety of criteria, such as whether an action is happening repeatedly, what has not changed in the signal segment, what is occurring constantly in the signal, what portion(s) of the signal segment have been shared in the past (or currently), user instructions, and so forth. The system could use machine learning to determine what will be interesting to narrate.
  • Because the principles described herein operate in the context of a computing system, a computing system will be described with respect to FIG. 1. Then, the principles of the foundation upon which ambient computing may be performed will be described with respect to FIGS. 2 through 4. The obtaining of signal segments from the computer-navigable graph will then be described with respect to FIG. 5. Thereafter, the application of security in the context of ambient computing will be described with respect to FIG. 6. Next, the managing of the size of the computer-navigable graph will be described with respect to FIG. 7. Finally, three implementations that use the semantic understanding provided by the computer-navigable graph (also called herein a “physical graph”) will be described with respect to FIGS. 8 through 10.
  • Computing systems are now increasingly taking a wide variety of forms. Computing systems may, for example, be handheld devices, appliances, laptop computers, desktop computers, mainframes, distributed computing systems, datacenters, or even devices that have not conventionally been considered a computing system, such as wearables (e.g., glasses, watches, bands, and so forth). In this description and in the claims, the term “computing system” is defined broadly as including any device or system (or combination thereof) that includes at least one physical and tangible processor, and a physical and tangible memory capable of having thereon computer-executable instructions that may be executed by a processor. The memory may take any form and may depend on the nature and form of the computing system. A computing system may be distributed over a network environment and may include multiple constituent computing systems.
  • As illustrated in FIG. 1, in its most basic configuration, a computing system 100 typically includes at least one hardware processing unit 102 and memory 104. The memory 104 may be physical system memory, which may be volatile, non-volatile, or some combination of the two. The term “memory” may also be used herein to refer to non-volatile mass storage such as physical storage media. If the computing system is distributed, the processing, memory and/or storage capability may be distributed as well.
  • The computing system 100 has thereon multiple structures often referred to as an “executable component”. For instance, the memory 104 of the computing system 100 is illustrated as including executable component 106. The term “executable component” is the name for a structure that is well understood to one of ordinary skill in the art in the field of computing as being a structure that can be software, hardware, or a combination thereof. For instance, when implemented in software, one of ordinary skill in the art would understand that the structure of an executable component may include software objects, routines, methods that may be executed on the computing system, whether such an executable component exists in the heap of a computing system, or whether the executable component exists on computer-readable storage media.
  • In such a case, one of ordinary skill in the art will recognize that the structure of the executable component exists on a computer-readable medium such that, when interpreted by one or more processors of a computing system (e.g., by a processor thread), the computing system is caused to perform a function. Such structure may be computer-readable directly by the processors (as is the case if the executable component were binary). Alternatively, the structure may be structured to be interpretable and/or compiled (whether in a single stage or in multiple stages) so as to generate such binary that is directly interpretable by the processors. Such an understanding of example structures of an executable component is well within the understanding of one of ordinary skill in the art of computing when using the term “executable component”.
  • The term “executable component” is also well understood by one of ordinary skill as including structures that are implemented exclusively or near-exclusively in hardware, such as within a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), or any other specialized circuit. Accordingly, the term “executable component” is a term for a structure that is well understood by those of ordinary skill in the art of computing, whether implemented in software, hardware, or a combination. In this description, the term “component” may also be used. As used in this description and in the claims, this term (regardless of whether the term is modified with one or more modifiers) is also intended to be synonymous with the term “executable component” or be specific types of such an “executable component”, and thus also have a structure that is well understood by those of ordinary skill in the art of computing.
  • In the description that follows, embodiments are described with reference to acts that are performed by one or more computing systems. If such acts are implemented in software, one or more processors (of the associated computing system that performs the act) direct the operation of the computing system in response to having executed computer-executable instructions that constitute an executable component. For example, such computer-executable instructions may be embodied on one or more computer-readable media that form a computer program product. An example of such an operation involves the manipulation of data.
  • The computer-executable instructions (and the manipulated data) may be stored in the memory 104 of the computing system 100. Computing system 100 may also contain communication channels 108 that allow the computing system 100 to communicate with other computing systems over, for example, network 110.
  • While not all computing systems require a user interface, in some embodiments, the computing system 100 includes a user interface 112 for use in interfacing with a user. The user interface 112 may include output mechanisms 112A as well as input mechanisms 112B. The principles described herein are not limited to the precise output mechanisms 112A or input mechanisms 112B as such will depend on the nature of the device. However, output mechanisms 112A might include, for instance, speakers, displays, tactile output, holograms, virtual reality, and so forth. Examples of input mechanisms 112B might include, for instance, microphones, touchscreens, holograms, virtual reality, cameras, keyboards, mouse of other pointer input, sensors of any type, and so forth.
  • Embodiments described herein may comprise or utilize a special purpose or general-purpose computing system including computer hardware, such as, for example, one or more processors and system memory, as discussed in greater detail below. Embodiments described herein also include physical and other computer-readable media for carrying or storing computer-executable instructions and/or data structures. Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computing system. Computer-readable media that store computer-executable instructions are physical storage media. Computer-readable media that carry computer-executable instructions are transmission media. Thus, by way of example, and not limitation, embodiments can comprise at least two distinctly different kinds of computer-readable media: storage media and transmission media.
  • Computer-readable storage media includes RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other physical and tangible storage medium which can be used to store desired program code means in the form of computer-executable instructions or data structures and which can be accessed by a general purpose or special purpose computing system.
  • A “network” is defined as one or more data links that enable the transport of electronic data between computing systems and/or modules and/or other electronic devices. When information is transferred or provided over a network or another communications connection (either hardwired, wireless, or a combination of hardwired or wireless) to a computing system, the computing system properly views the connection as a transmission medium. Transmissions media can include a network and/or data links which can be used to carry desired program code means in the form of computer-executable instructions or data structures and which can be accessed by a general purpose or special purpose computing system. Combinations of the above should also be included within the scope of computer-readable media.
  • Further, upon reaching various computing system components, program code means in the form of computer-executable instructions or data structures can be transferred automatically from transmission media to storage media (or vice versa). For example, computer-executable instructions or data structures received over a network or data link can be buffered in RAM within a network interface module (e.g., a “NIC”), and then eventually transferred to computing system RAM and/or to less volatile storage media at a computing system. Thus, it should be understood that readable media can be included in computing system components that also (or even primarily) utilize transmission media.
  • Computer-executable instructions comprise, for example, instructions and data which, when executed at a processor, cause a general purpose computing system, special purpose computing system, or special purpose processing device to perform a certain function or group of functions. Alternatively, or in addition, the computer-executable instructions may configure the computing system to perform a certain function or group of functions. The computer executable instructions may be, for example, binaries or even instructions that undergo some translation (such as compilation) before direct execution by the processors, such as intermediate format instructions such as assembly language, or even source code.
  • Those skilled in the art will appreciate that the invention may be practiced in network computing environments with many types of computing system configurations, including, personal computers, desktop computers, laptop computers, message processors, hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, mobile telephones, PDAs, pagers, routers, switches, datacenters, wearables (such as glasses or watches) and the like. The invention may also be practiced in distributed system environments where local and remote computing systems, which are linked (either by hardwired data links, wireless data links, or by a combination of hardwired and wireless data links) through a network, both perform tasks. In a distributed system environment, program modules may be located in both local and remote memory storage devices.
  • Those skilled in the art will also appreciate that the invention may be practiced in a cloud computing environment. Cloud computing environments may be distributed, although this is not required. When distributed, cloud computing environments may be distributed internationally within an organization and/or have components possessed across multiple organizations. In this description and the following claims, “cloud computing” is defined as a model for enabling on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services). The definition of “cloud computing” is not limited to any of the other numerous advantages that can be obtained from such a model when properly deployed.
  • For instance, cloud computing is currently employed in the marketplace so as to offer ubiquitous and convenient on-demand access to the shared pool of configurable computing resources. Furthermore, the shared pool of configurable computing resources can be rapidly provisioned via virtualization and released with low management effort or service provider interaction, and then scaled accordingly.
  • A cloud computing model can be composed of various characteristics such as on-demand self-service, broad network access, resource pooling, rapid elasticity, measured service, and so forth. A cloud computing model may also come in the form of various service models such as, for example, Software as a Service (“SaaS”), Platform as a Service (“PaaS”), and Infrastructure as a Service (“IaaS”). The cloud computing model may also be deployed using different deployment models such as private cloud, community cloud, public cloud, hybrid cloud, and so forth. In this description and in the claims, a “cloud computing environment” is an environment in which cloud computing is employed.
  • FIG. 2 illustrates an environment 200 in which the principles described herein may operate. The environment 200 includes a physical space 201 that includes multiple physical entities 210, which may be any extant object, person, or thing that emits or reflects physical signals (such as electromagnetic radiation or acoustics) that have a pattern that may be used to potentially identify one or more physical features (also called herein states) of the respective object, person, or thing. An example of such potentially identifying electromagnetic radiation is visible light that has a light pattern (e.g., a still image or video) from which characteristics of visible entities may be ascertained. Such a light pattern may be in any temporal, spatial, or even higher-dimensional space. An example of such acoustics may be the voice of a human being, the sound of an object in normal operation or undergoing an activity or event, or a reflected acoustic echo.
  • The environment 200 also includes sensors 220 that receive physical signals from the physical entities 210. The sensors need not, of course, pick up every physical signal that the physical entity emits or reflects. For instance, a visible light camera (still or video) is capable of receiving electromagnetic radiation in the form of visible light and converting such signals into processable form, but cannot pick up all electromagnetic radiation of any frequency since cameras all have a finite dynamic range. Acoustic sensors likewise have limited dynamic range designed for certain frequency ranges. In any case, the sensors 220 provide (as represented by arrow 229) resulting sensor signals to a recognition component 230.
  • The recognition component 230 at least estimates (e.g., estimates or recognizes) one or more features of the physical entities 210 within the location based on patterns detected in the received sensor signals. The recognition component 230 may also generate a confidence level associated with the “at least an estimation” of a feature of the physical entity. If that confidence level is less than 100%, then the “at least an estimation” is just an estimation. If that confidence level is 100%, then the “at least an estimation” is really more than an estimation—it is a recognition. In the remainder of this description and in the claims, a feature that is “at least estimated” will also be referred to as a “sensed” feature to promote clarity. This is consistent with the ordinary usage of the term “sense” since a feature that is “sensed” is not always present with complete certainty. The recognition component 230 may employ deep learning (Deep Neural Network-based and reinforcement-based learning mechanisms) and machine learning algorithms to learn from experience what objects or people that are within an image, thereby improving accuracy of recognition over time.
  • The recognition component 230 provides (as represented by arrow 239) the sensed features into a sensed feature store 240, which can store the sensed features (and associated confidence levels) for each physical entity within the location 201, whether the physical entity is within the physical space for a short time, a long time, or permanently. The computation component 250 may then perform a variety of queries and/or computations on the sensed feature data provided in sensed feature store 240. The queries and/or computations may be enabled by interactions (represented by arrow 249) between the computation component 250 and the sensed feature store 240.
  • In some embodiments, when the recognition component 230 senses a sensed feature of a physical entity within the location 201 using sensor signal(s) provided by a sensor, the sensor signals are also provided to a store, such as the sensed feature store. For instance, in FIG. 2 , the sensed feature store 240 is illustrated as including sensed features 241 as well as the corresponding sensor signals 242 that represent the evidence of the sensed features.
  • For at least one (and preferably many) of the sensed features for at least one of the sensed plurality of entities, at least one signal segment is computer-associated with the sensed feature such that computer-navigation to the sensed feature also allows for computer-navigation to the signal segment. The association of the sensed signal with the associated signal segment may be performed continuously, thus resulting in an expanding graph, and an expanding collection of signal segments. That said, as described further below, garbage collection processes may be used to clean up sensed features and/or signal segments that are outdated or no longer of interest.
  • The signal segment may include multiple pieces of metadata such as, for instance, an identification of the sensor or sensors that generated the signal segment. The signal segment need not include all of the signals that were generated by that sensor, and for brevity, may perhaps include only those portions of the signal that were used to sense the sensed feature of the particular physical entity. In that case, the metadata may include a description of the portion of the original signal segment that was stored.
  • The sensed signal may be any type of signal that is generated by a sensor. Examples include video, image, and audio signals. However, the variety of signals is not limited to those that can be sensed by a human being. For instance, the signal segment might represent a transformed version of the signal generated by the sensor to allow for better human observation or focus. Such transformations might include filtering, such as filtering based on frequency, or quantization. Such transformations might also include amplification, frequency shifting, speed adjustment, magnification, amplitude adjustment, and so forth.
  • In order to allow for reduction in storage requirements as well as proper focus on the signal of interest, perhaps only a portion of the signal segment is stored. For instance, if the signal segment is a video signal, perhaps only a portion of the frames of the video is stored. Furthermore, for any given frame, perhaps only the relevant portion of the frame is stored. Likewise, if the sensor signal was an image, perhaps only the relevant portion of the image is stored. The recognition service that uses the signal segment to sense a feature is aware of which portion of the signal segment was used to sense the feature. Accordingly, a recognition service can specifically carve out the relevant portion of the signal for any given sensed feature.
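  • By way of example only, the following minimal Python sketch (with hypothetical names such as StoredSegment and carve_relevant_portion) illustrates how only the relevant portion of a sensor signal might be stored, together with metadata identifying the sensor and the portion that was kept:

```python
# Illustrative sketch (hypothetical names): storing only the portion of a
# sensor signal that was used to sense a feature, together with metadata
# describing which sensor produced it and which portion was kept.
from dataclasses import dataclass
from typing import List


@dataclass
class StoredSegment:
    sensor_id: str
    start_frame: int
    end_frame: int
    crop: tuple            # (x, y, width, height) region of each kept frame
    frames: List[bytes]


def carve_relevant_portion(sensor_id: str,
                           all_frames: List[bytes],
                           start_frame: int,
                           end_frame: int,
                           crop: tuple) -> StoredSegment:
    """Keep only the frames (and, notionally, the region) used for recognition."""
    kept = all_frames[start_frame:end_frame + 1]   # cropping itself is elided here
    return StoredSegment(sensor_id, start_frame, end_frame, crop, kept)


segment = carve_relevant_portion(
    sensor_id="camera-1",
    all_frames=[b"frame0", b"frame1", b"frame2", b"frame3"],
    start_frame=1,
    end_frame=2,
    crop=(100, 40, 320, 240),
)
print(segment.start_frame, segment.end_frame, len(segment.frames))
```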
  • The computation component 250 may also have a security component 251 that may determine access to data within the sensed feature store 240. For instance, the security component 251 may control which users may access the sensed feature data 241 and/or the sensor signals 242. Furthermore, the security component 251 may even control which of the sensed feature data computations are performed over, and/or which users are authorized to perform what type of computations or queries. Thus, security is effectively achieved. More regarding this security will be described below with respect to FIG. 6.
  • Since the sensed feature data represents the sensed features of the physical entities within the physical space 201 over time, complex computing may be performed on the physical entities within the physical space 201. As will be described below, for a user, it is as though the very environment itself is filled with helpful computing power that is getting ready for any computing query or computation regarding that physical space. This will be referred to hereinafter also as “ambient computing”.
  • Furthermore, whenever a sensed feature is of interest, the evidence supporting the recognition component's sensing of that feature may be reconstructed. For instance, the computation component 250 might provide video evidence of when a particular physical entity first entered a particular location. If multiple sensors generated sensor signals that were used by the recognition component to sense that feature, then the sensor signals for any individual sensor or combination of sensors may be reconstructed and evaluated. Thus, for instance, the video evidence of the physical entity first entering a particular location may be reviewed from different angles.
  • The physical space 201 is illustrated in FIG. 2 and is intended just to be an abstract representation of any physical space that has sensors in it. There are infinite examples of such physical spaces, but examples include a room, a house, a neighborhood, a factory, a stadium, a building, a floor, an office, a car, an airplane, a spacecraft, a Petri dish, a pipe or tube, the atmosphere, underground spaces, caves, land, combinations and/or portions thereof. The physical space 201 may be the entirety of the observable universe or any portion thereof so long as there are sensors capable of receiving signals emitted from, affected by (e.g., diffraction, frequency shifting, echoes, etc.), and/or reflected from the physical entities within the location.
  • The physical entities 210 within the physical space 201 are illustrated as including four physical entities 211, 212, 213 and 214 by way of example only. The ellipses 215 represent that there may be any number and variety of physical entities having features that are being sensed based on data from the sensors 220. The ellipses 215 also represent that physical entities may exit and enter the location 201. Thus, the number and identity of physical entities within the location 201 may change over time.
  • The position of the physical entities may also vary over time. Though the position of the physical entities is shown in the upper portion of the physical space 201 in FIG. 2, this is simply for purpose of clear labelling. The principles described herein are not dependent on any particular physical entity occupying any particular physical position within the physical space 201.
  • Lastly, for convention only and to distinguish physical entities 210 from the sensors 220, the physical entities 210 are illustrated as triangles and the sensors 220 are illustrated as circles. The physical entities 210 and the sensors 220 may, of course, have any physical shape or size. Physical entities typically are not triangular in shape, and sensors are typically not circular in shape. Furthermore, sensors 220 may observe physical entities within a physical space 201 without regard for whether or not those sensors 220 are physically located within that physical space 201.
  • The sensors 220 within the physical space 201 are illustrated as including two sensors 221 and 222 by way of example only. The ellipses 223 represent that there may be any number and variety of sensors that are capable of receiving signals emitted, affected (e.g., via diffraction, frequency shifting, echoes, etc.) and/or reflected by the physical entities within the physical space. The number and capability of operable sensors may change over time as sensors within the physical space are added, removed, upgraded, broken, replaced, and so forth.
  • FIG. 3 illustrates a flowchart of a method 300 for tracking physical entities within a physical space. Since the method 300 may be performed to track the physical entities 210 within the physical space 201 of FIG. 2, the method 300 of FIG. 3 will now be described with frequent reference to the environment 200 of FIG. 2. Also, FIG. 4 illustrates an entity tracking data structure 400 that may be used to assist in performing the method 300, and which may be used to later perform queries on the tracked physical entities, and perhaps also to access and review the sensor signals associated with the tracked physical entities. Furthermore, the entity tracking data structure 400 may be stored in the sensed feature store 240 of FIG. 2 (where it is represented as sensed feature data 241). Accordingly, the method 300 of FIG. 3 will also be described with frequent reference to the entity tracking data structure 400 of FIG. 4.
  • In order to assist with tracking, a space-time data structure for the physical space is set up (act 301 ). This may be a distributed data structure or a non-distributed data structure. FIG. 4 illustrates an example of an entity tracking data structure 400 that includes a space-time data structure 401. This entity tracking data structure 400 may be included within the sensed feature store 240 of FIG. 2 as sensed feature data 241. While the principles described herein are described with respect to tracking physical entities, and their sensed features and activities, within a single location, the principles described herein may operate to track physical entities (and their sensed features and activities) within more than one location. In that case, perhaps the space-time data structure 401 is not the root node in the tree represented by the entity tracking data structure 400 (as symbolized by the ellipses 402A and 402B). Rather, there may be multiple space-time data structures that may be interconnected via a common root node.
  • Then, returning to FIG. 3, the content of box 310A may be performed for each of multiple physical entities (e.g., physical entities 210) that are at least temporarily within a physical space (e.g., physical space 201). Furthermore, the content of box 310B is illustrated as being nested within box 310A, and represents that its content may be performed at each of multiple times for a given physical entity. By performing the method 300, a complex entity tracking data structure 400 may be created and grown, to thereby record the sensed features of physical entities that are one or more times within the location. Furthermore, the entity tracking data structure 400 may potentially also be used to access the sensed signals that resulted in certain sensed features (or feature changes) being recognized.
  • For a particular physical entity in the location at a particular time, a physical entity is sensed by one or more sensors (act 311 ). In other words, one or more physical signals emitted from, affected by (e.g., via diffraction, frequency shifting, echoes, etc.), and/or reflected from the physical entity are received by one or more of the sensors. Referring to FIG. 2 , suppose that physical entity 211 has one or more features that are sensed by both sensors 221 and 222 at a particular time.
  • One aspect of security may enter at this point. The recognition component 230 may have a security component 231 that, according to particular settings, may refuse to record sensed features associated with particular physical entities, sensed features of a particular type, and/or sensed features that resulted from sensor signals generated at particular times, or combinations thereof. For instance, perhaps the recognition component 230 will not record sensed features of any people that are within the location. As a more fine-grained example, perhaps the recognition component 230 will not record sensed features of a set of people, where those sensed features relate to an identity or gender of the person, and where those sensed features resulted from sensor signals that were generated at particular time frames. More regarding this security will again be described below with respect to FIG. 6.
  • If permitted, an at least approximation of that particular time at which the physical entity was sensed is represented within an entity data structure that corresponds to the physical entity and this is computing-associated with the space-time data structure (act 312). For instance, referring to FIG. 4, the entity data structure 410A may correspond to the physical entity 211 and is computing-associated (as represented by line 430A) with the space-time data structure 401. In this description and in the claims, one node of a data structure is “computing-associated” with another node of a data structure if a computing system is, by whatever means, able to detect an association between the two nodes. For instance, the use of pointers is one mechanism for computing-association. A node of a data structure may also be computing-associated by being included within the other node of the data structure, and by any other mechanism recognized by a computing system as being an association.
  • The time data 411 represents an at least approximation of the time that the physical entity was sensed (at least at this time iteration of the content of box 310B) within the entity data structure 410A. The time may be a real time (e.g., expressed with respect to an atomic clock), or may be an artificial time. For instance, the artificial time may be a time that is offset from real-time and/or expressed in a different manner than real time (e.g., number of seconds or minutes since the last turn of the millennium). The artificial time may also be a logical time, such as a time that is expressed by a monotonically increasing number that increments at each sensing.
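  • By way of example only, the following minimal Python sketch illustrates one form of artificial time, namely a logical clock implemented as a monotonically increasing counter that increments at each sensing:

```python
# Illustrative sketch: a logical clock as one form of "artificial time",
# a monotonically increasing counter incremented at each sensing.
import itertools

_logical_clock = itertools.count(start=1)


def next_sensing_time() -> int:
    """Return the logical time to record for the next sensed feature."""
    return next(_logical_clock)


print(next_sensing_time())   # 1
print(next_sensing_time())   # 2
```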
  • Also, based on the sensing of the particular physical entity at the particular time (at act 311), the environment senses at least one physical feature (and perhaps multiple) of the particular physical entity in which the particular physical entity exists at the particular time (act 313). For instance, referring to FIG. 2, the recognition component 230 may sense at least one physical feature of the physical entity 211 based on the signals received from the sensors 221 and 222 (e.g., as represented by arrow 229).
  • The sensed at least one physical feature of the particular physical entity is then represented in the entity data structure (act 314) in a manner computing-associated with the at least approximation of the particular time. For instance, in FIG. 2, the sensed feature data is provided (as represented by arrow 239) to the sensed feature store 240. In some embodiments, this sensed feature data may be provided along with the at least approximation of the particular time so as to modify the entity tracking data structure 400 in substantially one act. In other words, act 312 and act 314 may be performed at substantially the same time to reduce write operations into the sensed feature store 240.
  • Furthermore, if permitted, the sensor signal(s) that the recognition component relied upon to sense the sensed feature are recorded in a manner that is computer-associated with the sensed feature (act 315). For instance, the sensed feature that is in the sensed feature data 241 (e.g., in the space-time data structure 401) may be computing-associated with such sensor signal(s) stored in the sensed signal data 242.
  • Referring to FIG. 4, the first entity data structure now has sensed feature data 421 that is computing-associated with time 411. In this example, the sensed feature data 421 includes two sensed physical features 421A and 421B of the physical entity. However, the ellipses 421C represent that there may be any number of sensed features of the physical entity that are stored as part of the sensed feature data 421 within the entity data structure 410A. For instance, there may be a single sensed feature, or innumerable sensed features, or any number in-between for any given physical entity as detected at any particular time.
  • In some cases, the sensed feature may be associated with other features. For instance, if the physical entity is a person, the feature might be a name of the person.
  • That specifically identified person might have known characteristics based on features not represented within the entity data structure. For instance, the person might have a certain rank or position within an organization, have certain training, be a certain height, and so forth. The entity data structure may be extended by, when a particular feature is sensed (e.g., a name), pointing to additional features of that physical entity (e.g., rank, position, training, height) so as to even further extend the richness of querying and/or other computation on the data structure.
  • The sensed feature data may also have confidence levels associated with each sensed feature that represent an estimated probability that the physical entity really has the sensed feature at the particular time 411. In this example, confidence level 421 a is associated with sensed feature 421A and represents a confidence that the physical entity 211 really has the sensed feature 421A. Likewise, confidence level 421 b is associated with sensed feature 421B and represents a confidence that the physical entity 211 really has the sensed feature 421B. The ellipses 421 c again represent that there may be confidence levels expressed for any number of physical features. Furthermore, there may be some physical features for which there is no confidence level expressed (e.g., in the case where there is certainty or in the case where it is not important or desirable to measure confidence of a sensed physical feature).
  • The sensed feature data may also have computing-association (e.g., a pointer) to the sensor signal(s) that were used by the recognition component to sense the sense feature of that confidence level. For instance, in FIG. 4, sensor signal(s) 421Aa is computing-associated with sensed feature 421A and represents the sensor signal(s) that were used to sense the sensed feature 421A at the time 411. Likewise, sensor signal(s) 421Bb is computing-associated with sensed feature 421B and represents the sensor signal(s) that were used to sense the sensed feature 421B at the time 411. The ellipses 421Cc again represents that there may be computing-associations of any number of physical features.
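  • By way of example only, the following minimal Python sketch (with hypothetical names such as SensedFeatureRecord, EntityDataStructure, and SpaceTimeDataStructure) illustrates the general shape of such an entity tracking data structure, with per-time sensed features, optional confidence levels, and references to evidencing signal segments:

```python
# Illustrative sketch (hypothetical names): the shape of an entity tracking
# data structure, with per-entity records keyed by time, each holding sensed
# features, confidence levels, and references to evidencing signal segments.
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SensedFeatureRecord:
    name: str                               # e.g., "is a person", "is John Doe"
    confidence: Optional[float] = None      # may be omitted when certain/unneeded
    signal_segment_refs: List[str] = field(default_factory=list)


@dataclass
class EntityDataStructure:
    entity_id: str
    # time (real, artificial, or logical) -> features sensed at that time
    features_by_time: Dict[float, List[SensedFeatureRecord]] = field(default_factory=dict)


@dataclass
class SpaceTimeDataStructure:
    space_id: str
    entities: Dict[str, EntityDataStructure] = field(default_factory=dict)


graph = SpaceTimeDataStructure("room-201")
entity = graph.entities.setdefault("entity-211", EntityDataStructure("entity-211"))
entity.features_by_time[411.0] = [
    SensedFeatureRecord("is a person", 1.0, ["segment-421Aa"]),
    SensedFeatureRecord("is John Doe", 0.2, ["segment-421Bb"]),
]
```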
  • The security component 231 of the recognition component 230 may also exercise security in deciding whether or not to record sensor signal(s) that were used to sense particular features at particular times. Thus, the security component 231 may exercise security in 1) determining whether to record that particular features were sensed, 2) determining whether to record features associated with particular physical entities, 3) determining whether to record features sensed at particular times, 4) determining whether to record the sensor signal(s) as evidence of a sensed feature and, if so, which signals to record, and so forth.
  • As an example, suppose that the location being tracked is a room. Now suppose that an image sensor (e.g., a camera) senses something within the room. An example sensed feature is that the “thing” is a human being. Another example sensed feature is that the “thing” is a particular named person. There might be a confidence level of 100 percent that the “thing” is a person, but only a 20 percent confidence level that the person is a specific identified person. In this case, the sensed feature set includes one feature that is a more specific type of another feature. Furthermore, the image data from the camera may be pointed to by the record of the sensed feature of the particular physical entity at the particular time.
  • Another example feature is that the physical entity simply exists within the location, or at a particular position within the location. Another example is that this is the first appearance of the physical entity since a particular time (e.g., in recent times, or even ever). Other example features are that the item is inanimate (e.g., with 99 percent certainty), a tool (e.g., with 80 percent certainty), and a hammer (e.g., with 60 percent certainty). Another example feature is that the physical entity is no longer present (e.g., is absent) from the location, has a particular pose, is oriented in a certain way, or has a positional relationship with another physical entity within the location (e.g., "on the table" or "sitting in chair #5").
  • In any case, the number and types of features that can be sensed from the number and types of physical entities within any location is innumerable. Also, as previously mentioned, as represented by box 310B, the acts within box 310B may potentially be performed multiple times for any given physical entity. For instance, physical entity 211 may again be detected by one or both of sensors 221 and 222. Referring to FIG. 4, this detection results in the time of the next detection (or its approximation) being represented within the entity data structure 410. For instance, time 412 is also represented within the entity data structure. Furthermore, sensed features 422 (e.g., including perhaps sensed features 422A and 422B—with ellipses 422C again representing flexibility) are computing-associated with the second time 412. Furthermore, those sensed features may also have associated confidence levels (e.g., 422a, 422b, ellipses 422c). Likewise, those sensed features may also have associated sensor signals (e.g., 422Aa, 422Bb, ellipses 422Cc).
  • The sensed features sensed at the second time may be the same as or different than the sensed features sensed at the first time. The confidence levels may change over time. As an example, suppose a human being is detected at time #1 at one side of a large room via an image with 90 percent confidence, and that the human being is specifically sensed as being John Doe with 30 percent confidence. Now, at time #2 that is 0.1 seconds later, John Doe is sensed 50 feet away at another part of the room with 100 percent confidence, and there remains a human being at the same location where John Doe was speculated to be at time #1. Since human beings do not travel 50 feet in a tenth of a second (at least in an office setting), it can now be concluded that the human being detected at time #1 is not John Doe at all. So the confidence for time #1 that the human being is John Doe is reduced to zero.
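The kind of plausibility reasoning in the example above can be expressed as a simple post-processing rule. The following sketch assumes a hypothetical flat list of (time, position, confidence) identity detections and an assumed speed ceiling; it merely illustrates zeroing out an earlier identity hypothesis when the implied travel speed would be physically implausible.

```python
from dataclasses import dataclass
import math

MAX_WALKING_SPEED_FT_PER_S = 15.0   # assumed ceiling for an office setting


@dataclass
class IdentityDetection:
    t: float          # seconds
    position: tuple   # (x, y) in feet
    confidence: float # confidence that this detection is, e.g., John Doe


def revise_identity_confidences(detections: list) -> None:
    """Zero out earlier low-confidence identity hypotheses contradicted by later certain ones."""
    certain = [d for d in detections if d.confidence >= 0.999]
    for d in detections:
        for c in certain:
            if d is c or d.t == c.t:
                continue
            speed = math.dist(d.position, c.position) / abs(d.t - c.t)
            if speed > MAX_WALKING_SPEED_FT_PER_S:
                d.confidence = 0.0   # cannot be the same person: too far, too fast


detections = [
    IdentityDetection(t=0.0, position=(0.0, 0.0), confidence=0.3),    # maybe John Doe
    IdentityDetection(t=0.1, position=(50.0, 0.0), confidence=1.0),   # certainly John Doe
]
revise_identity_confidences(detections)
assert detections[0].confidence == 0.0
```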
  • Returning to FIG. 4, the ellipses 413 and 423 represent that there is no limit to the number of times that a physical entity may be detected within the location. As subsequent detections are made, more may be learned about the physical entity, and thus sensed features may be added (or removed) as appropriate, with corresponding adjustments to confidence levels for each sensed feature.
  • Now moving outside of box 310B, but remaining within box 310A, for any given physical entity, feature changes in the particular entity may be sensed (act 322) based on comparison (act 321) of the sensed feature(s) of the particular physical entity at different times. This sensing of changes may be performed by the recognition component 230 or the computation component 250. If desired, those sensed changes may also be recorded (act 323). For instance, the sensed changes may be recorded in the entity data structure 410A in a manner that is, or perhaps is not, computing-associated with a particular time. Sensor signals evidencing the feature change may be reconstructed using the sensor signals that evidenced the sensed feature at each time.
  • For instance, based on a sensed feature at a first time being a presence of the physical entity within the location, and based on a second feature at a second time being an absence of the physical entity within the location, it can be concluded that the physical entity has exited the physical space. Conversely, based on a sensed feature at a first time being an absence of the physical entity from the location, and a second feature at a second time being a presence of the physical entity within the location, it can be concluded that the physical entity has entered the location. In some cases, absence from the physical space is perhaps not looked for until the physical entity has first been detected as being present in the physical space.
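A minimal sketch of the comparison in acts 321 and 322, under the assumption (hypothetical here) that presence is stored as a boolean sensed feature per detection time; it reports "entered" or "exited" changes between consecutive detections.

```python
from typing import Dict, List, Tuple


def sense_presence_changes(presence_by_time: Dict[float, bool]) -> List[Tuple[float, str]]:
    """Compare presence at consecutive times and return (time, change) pairs."""
    changes = []
    times = sorted(presence_by_time)
    for earlier, later in zip(times, times[1:]):
        before, after = presence_by_time[earlier], presence_by_time[later]
        if not before and after:
            changes.append((later, "entered"))
        elif before and not after:
            changes.append((later, "exited"))
    return changes


# Entity absent, then present, then absent again.
print(sense_presence_changes({1.0: False, 2.0: True, 3.0: False}))
# [(2.0, 'entered'), (3.0, 'exited')]
```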
  • Now referring to the box 310A, this tracking of feature(s) of physical entities may be performed for multiple entities over time. For instance, the content of box 310A may be performed for each of physical entities 211, 212, 213, or 214 within the physical space 201, or for other physical entities that enter or exit the physical space 201. Referring to FIG. 4, the space-time data structure 401 is also computing-associated (as represented by lines 430B, 430C, and 430D) with a second entity data structure 410B (perhaps associated with the second physical entity 212 of FIG. 2), a third entity data structure 410C (perhaps associated with the third physical entity 213 of FIG. 2), and a fourth entity data structure 410D (perhaps associated with the fourth physical entity 214 of FIG. 2).
  • The space-time data structure 401 may also include one or more triggers that define conditions and actions. When the conditions are met, corresponding actions are to occur. The triggers may be stored at any location in the space-time data structure. For instance, if the conditions and/or actions are with respect to a particular entity data structure, the trigger may be stored in the corresponding entity data structure. If the conditions and/or actions are with respect to a particular feature of a particular entity data structure, the trigger may be stored in the corresponding feature data structure.
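One way to picture the triggers described above is as condition/action pairs stored alongside the node they concern. The sketch below is a hypothetical illustration: a trigger holds a predicate over newly recorded sensed features and a callback to run when the predicate is satisfied.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Trigger:
    condition: Callable[[Dict[str, object]], bool]   # evaluated against newly recorded features
    action: Callable[[Dict[str, object]], None]      # run when the condition is met


def record_features(triggers: List[Trigger], features: Dict[str, object]) -> None:
    """Record features (storage omitted) and fire any triggers whose conditions are met."""
    for trigger in triggers:
        if trigger.condition(features):
            trigger.action(features)


# Example: alert when a hammer is sensed as absent from the room.
triggers = [
    Trigger(
        condition=lambda f: f.get("type") == "hammer" and f.get("present") is False,
        action=lambda f: print("Alert: the hammer has left the room"),
    )
]
record_features(triggers, {"type": "hammer", "present": False})
```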
  • The ellipses 410E represent that the number of entity data structures may change. For instance, if tracking data is kept forever with respect to physical entities that are ever within the physical space, then additional entity data structures may be added each time a new physical entity is detected within the location, and any given entity data structure may be augmented each time a physical entity is detected within the physical space. Recall, however, that garbage collection may be performed (e.g., by clean-up component 260) to keep the entity tracking data structure 400 from growing too large to be properly edited, stored and/or navigated.
  • Outside of the box 310A, physical relationships between different physical entities may be sensed (act 332) based on comparison of the associated entity data structures (act 331). Those physical relationships may likewise be recorded in the entity tracking data structure 400 (act 333), perhaps within the associated entity data structures that have the sensed physical relationships, and/or perhaps associated with the time that the physical entities are sensed as having the relationship. For instance, by analysis of the entity data structures for different physical entities through time, it might be determined that, at a particular time, a physical entity may be hidden behind another physical entity, or that a physical entity may be obscuring the sensing of another physical entity, or that two physical entities have been joined, or that a physical entity has been detached to create multiple physical entities. Sensor signals evidencing the physical entity relationship may be reconstructed using the sensor signals that evidenced the sensed feature at the appropriate time and for each physical entity.
  • The sensed feature store 240 may now be used as a powerful store upon which to compute complex functions and queries over representations of physical entities over time within a physical space. Such computation and querying may be performed by the computation component 250. This enables innumerable helpful embodiments, and in fact introduces an entirely new form of computing referred to herein as "ambient computing". Within the physical space that has sensors, it is as though the very air itself can be used to compute and sense state about the physical world. It is as though a crystal ball has now been created for that physical space, from which it is possible to query and/or compute many things about that location and its history.
  • As an example, a user may now query whether an object is right now in a physical space, or where an object was at a particular time within the physical space. The user might also query which person having particular features (e.g., rank or position within a company) is near that object right now, and communicate with that person to bring the object to the user. The user might query as to relationships between physical entities. For instance, the user might query who has possession of an object. The user might query as to the state of an object, whether it is hidden, and what other object is obscuring view of the object. The user might query when a physical entity first appeared within the physical space, when it exited, and so forth. The user might also query when the lights were turned off, or when the system became certain of one or more features of a physical entity. The user might also search on feature(s) of an object. The user might also query on activities that have occurred within the location. A user might compute the mean time that a physical entity of a particular type is within the location, anticipate where a physical entity will be at some future time, and so forth. Accordingly, rich computing and querying may be performed on a physical space that has sensors.
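As an illustration of the kind of querying described above, the following sketch runs two simple queries over a hypothetical flattened view of the store: where an entity was at a particular time, and which entities were at a given position at that time. The data layout and the function names (where_was, who_is_near) are assumptions for illustration, not the store's actual interface.

```python
from typing import Dict, Optional

# Hypothetical flattened view of the store: entity -> detection time -> sensed features.
store: Dict[str, Dict[float, Dict[str, object]]] = {
    "hammer-1": {9.0: {"position": "bench A", "present": True},
                 17.0: {"position": None, "present": False}},
    "alice":    {9.0: {"position": "bench A", "present": True, "role": "safety officer"}},
}


def where_was(entity: str, at: float) -> Optional[object]:
    """Return the recorded position of an entity at the detection time closest to 'at'."""
    detections = store.get(entity, {})
    if not detections:
        return None
    nearest = min(detections, key=lambda t: abs(t - at))
    return detections[nearest].get("position")


def who_is_near(position: object, at: float) -> list:
    """Return entities whose nearest detection to 'at' places them at the given position."""
    return [e for e in store if where_was(e, at) == position]


print(where_was("hammer-1", at=9.5))    # 'bench A'
print(who_is_near("bench A", at=9.5))   # ['hammer-1', 'alice']
```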
  • As previously mentioned, the computer-navigable graph may have signal segments associated with sensed features. FIG. 5 illustrates a flowchart of a method 500 for efficiently rendering signal segments of interest. First, the computing system navigates the navigable graph of sensed features to reach a particular sensed feature (act 501). For instance, this navigation may be performed automatically or in response to user input. The navigation may be the result of a calculation, or may simply involve identifying the sensed feature of interest. As another example, the navigation may be the result of a user query. In some embodiments, a calculation or query may result in multiple sensed features being navigated to. As an example, suppose that the computing system navigates to sensed feature 222A in FIG. 2.
  • The computing system then navigates to the sensed signal computer-associated with the particular sensed feature (act 502) using the computer-association between the particular sensed feature and the associated sensor signal. For instance, in FIG. 2, with the sensed feature being sensed feature 222A, the computer-association is used to navigate to the signal segment 222Aa.
  • Finally, the signal segment may then be rendered (act 503) on an appropriate output device. For instance, if the computing system is the computing system 100 of FIG. 1, the appropriate output device might be one or more of output mechanisms 112A. For instance, audio signals may be rendered using speakers, and visual data may be rendered using a display. After navigating to the sensed signal(s), multiple things could happen. The user might play a particular signal segment, or perhaps choose from multiple signal segments that contributed to the feature. A view could be synthesized from the multiple signal segments.
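The three acts of method 500 can be summarized in a few lines. The sketch below assumes hypothetical stores mapping sensed-feature identifiers to their computer-associated signal-segment identifiers, and segment identifiers to renderable payloads; it is only meant to show the navigate-then-render flow, not an actual rendering pipeline.

```python
from typing import Dict, List

# Hypothetical associations: sensed feature id -> evidencing signal segment ids.
feature_to_segments: Dict[str, List[str]] = {
    "feature-222A": ["segment-222Aa"],
}

# Hypothetical segment store: segment id -> (media type, payload location).
segment_store: Dict[str, Dict[str, str]] = {
    "segment-222Aa": {"media": "video", "uri": "store://signals/222Aa"},
}


def render_segments_for_feature(feature_id: str) -> None:
    """Acts 501-503: given the feature navigated to, follow the association and render each segment."""
    for segment_id in feature_to_segments.get(feature_id, []):          # act 502
        segment = segment_store[segment_id]
        if segment["media"] == "video":
            print(f"Rendering video {segment['uri']} on the display")   # act 503
        elif segment["media"] == "audio":
            print(f"Playing audio {segment['uri']} on the speakers")


render_segments_for_feature("feature-222A")   # act 501: the feature of interest was navigated to
```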
  • With computing being performed on the physical world, a new type of ambient computation is enabled. It is as though computers are available in the very ambient environment, embodied within the air itself, and able to perform computations on physical entities that were at any point in contact with that air. In the workplace, productivity may be greatly improved using this ambient computing. For instance, a user may quickly find a misplaced tool, or be able to communicate with a peer close to the tool so that the user can ask that peer to grab the tool and bring it to the user. Furthermore, in addition to ambient computing, human beings may review the sensor signal(s) that were used to sense features of interest for particular physical entities of interest, at particular times of interest. Indeed, the number of scenarios for improving physical productivity through responsible use of ambient computing is limitless.
  • Now that the principles of ambient computing have been described with respect to FIGS. 2 through 5, security mechanisms that may be performed in the context of such ambient computing will be described with respect to FIG. 6. FIG. 6 illustrates a flowchart of a method 600 for controlling creation of or access to information sensed by one or more sensors in a physical space. The method includes creating (act 601) a computer-navigable graph of features of sensed physical entities sensed in a physical space over time. The principles described herein are not limited to the precise structure of such a computer-navigable graph. An example structure and its creation have been described with respect to FIGS. 2 through 4.
  • The method 600 also includes restricting creation of or access to nodes of the computer-navigable graph based on one or more criteria (act 602). Thus, security is imposed upon the computer-navigable graph. The arrows 603 and 604 represent that the process of creating the graph and restricting creation of or access to its nodes may be a continual process. The graph may continuously have nodes added to it (and perhaps removed from it). Furthermore, restrictions of creation may be considered whenever there is a possibility of creation of a node. Restrictions of access may be decided when a node of the graph is created, or at any point thereafter. Examples of criteria on which such restrictions might be based include, for instance, a prospective identity of a sensed physical entity, a sensed feature of a sensed physical entity, and so forth.
  • In determining whether access to a node of a computer-navigable graph is authorized, there may be access criteria for each node. Such access criteria may be explicit or implicit. That is, if there are no explicit access criteria for the node that is to be accessed, then perhaps a default set of access criteria may apply. The access criteria for any given node may be organized in any manner. For instance, in one embodiment, the access criteria for a node may be stored with the node in the computer-navigable graph.
  • The access restrictions might also include restrictions based on the type of access requested. For instance, computational access means that the node is not directly accessed, but is used in a computation. Direct access to read the content of a node may be restricted, whilst computational access that does not report the exact contents of the node may be allowed.
  • Access restrictions may also be based on the type of node accessed. For instance, there may be a restriction in access to a particular entity data structure node of the computer-navigable graph. For instance, if that particular entity data structure node represents detections of a particular person in the physical space, access might be denied. There may also be restrictions in access to particular signal segment nodes of the computer-navigable graph. As an example, perhaps one may be able to determine that a person was in a location at a given time, but not be able to review video recordings of that person at that location. Access restrictions may also be based on who the requestor of access is.
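The access restrictions discussed in the last few paragraphs can be thought of as a per-node policy check that considers the requestor, the type of node, and the type of access. The sketch below is a hypothetical illustration with made-up policy fields; it falls back to a restrictive default criterion when a node carries no explicit access criteria.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class AccessCriteria:
    allowed_requestors: Set[str] = field(default_factory=set)
    allowed_access_types: Set[str] = field(default_factory=lambda: {"computational"})


DEFAULT_CRITERIA = AccessCriteria()   # implicit default: computational access only


@dataclass
class GraphNode:
    node_id: str
    node_type: str                                # e.g. "entity", "feature", "signal_segment"
    criteria: Optional[AccessCriteria] = None     # None -> fall back to default criteria


def is_access_allowed(node: GraphNode, requestor: str, access_type: str) -> bool:
    """Return True if the requestor may perform the given access type on the node."""
    criteria = node.criteria or DEFAULT_CRITERIA
    if requestor in criteria.allowed_requestors:
        return True                                    # explicitly trusted requestors get full access
    return access_type in criteria.allowed_access_types  # others get, e.g., computational access only


# A video segment node: direct reads limited to the security officer, computation allowed broadly.
segment_node = GraphNode(
    "segment-222Aa", "signal_segment",
    AccessCriteria(allowed_requestors={"security-officer"}, allowed_access_types={"computational"}),
)
print(is_access_allowed(segment_node, "security-officer", "direct"))   # True
print(is_access_allowed(segment_node, "analyst", "direct"))            # False
print(is_access_allowed(segment_node, "analyst", "computational"))     # True
```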
  • In determining whether to restrict creation of a particular sensed feature node of the computer-navigable graph, there may be a variety of criteria considered. For instance, there may be a restriction in creation of a particular signal segment node of a computer-navigable graph.
  • FIG. 7 illustrates a recurring flow 700 showing that in addition to creating a computer-navigable graph of sensed features in the physical space (act 701), there may also be pruning of the computer-navigable graph (act 702). These acts may even occur simultaneously and continuously (as represented by the arrows 703 and 704) to thereby keep the computer-navigable graph of sensed features at a manageable size. There has been significant description herein about how the computer-navigable graph may be created (represented as act 701).
  • Now, this description will focus on how the computer-navigable graph may be pruned to remove one or more nodes of the computer-navigable graph (act 702). Any node of the computer-navigable graph may be subject to removal. For instance, sensed features of a physical entity data structure may be removed for a specific time or group of times. A sensed feature of a physical entity data structure may also be removed for all times. More than one sensed feature of a physical entity data structure may be removed for any given time, or for any group of times. Furthermore, a physical entity data structure may be entirely removed in some cases.
  • The removal of a node may occur, for instance, when the physical graph represents something that is impossible given the laws of physics. For instance, a given object cannot be at two places at the same time, nor can that object travel significant distances in a short amount of time in an environment in which such travel is infeasible or impossible. Accordingly, if a physical entity is tracked with absolute certainty at one location, any physical entity data structure that represents, with lesser confidence, that the same physical entity is at an inconsistent location may be deleted.
  • The removal of a node may also occur when more confidence is obtained regarding a sensed feature of a physical entity. For instance, if a sensed feature of a physical entity within a location is determined with 100 percent certainty, then the certainty levels of that sensed feature of that physical entity may be updated to read 100 percent for all prior times also. Furthermore, for sensed features that have been learned not to be applicable to a physical entity (i.e., the confidence level has been reduced to zero or to a negligible level), the sensed feature may be removed for that physical entity.
  • Furthermore, some information in the computer-navigable graph may simply be too stale to be useful. For instance, if a physical entity has not been observed in the physical space for a substantial period of time, so as to make the prior recognition of the physical entity no longer relevant, then the entire physical entity data structure may be removed. Furthermore, detections of a physical entity that have become stale may be removed even though the physical entity data structure remains to reflect more recent detections. Thus, cleansing (or pruning) of the computer-navigable graph may be performed via intrinsic analysis and/or via extrinsic information. This pruning intrinsically improves the quality of the information represented in the computer-navigable graph, by removing information of lesser quality and freeing up space for more relevant information to be stored.
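A hypothetical sketch of a pruning pass of the kind described above (act 702), removing sensed features whose confidence has dropped to a negligible level and removing detections that have gone stale. The data layout and the thresholds are illustrative assumptions, not values taken from the description.

```python
from typing import Dict

NEGLIGIBLE_CONFIDENCE = 0.01
STALE_AFTER_SECONDS = 90 * 24 * 3600   # assume detections older than ~90 days are stale


def prune(entity_detections: Dict[float, Dict[str, float]], now: float) -> None:
    """Remove negligible-confidence features and stale detections in place."""
    for t in list(entity_detections):
        features = entity_detections[t]
        # Drop features whose confidence has been reduced to (near) zero.
        for name in [n for n, conf in features.items() if conf <= NEGLIGIBLE_CONFIDENCE]:
            del features[name]
        # Drop whole detections that are too old or now empty.
        if now - t > STALE_AFTER_SECONDS or not features:
            del entity_detections[t]


detections = {
    1_000.0: {"is_john_doe": 0.0, "is_human": 0.9},   # identity hypothesis already zeroed out
    9_000_000.0: {"is_human": 1.0},
}
prune(detections, now=9_000_100.0)
print(detections)   # {9000000.0: {'is_human': 1.0}}
```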
  • Accordingly, the principles described herein allow for a computer-navigable graph of the physical world. The graph may be searchable and queryable, thereby allowing searching, querying, and other computations to be performed on the real world. Security may further be imposed in such an environment. Finally, the graph may be kept to a manageable size through cleansing and pruning. Thus, a new paradigm in computing has been achieved.
  • The above-described computer-navigable graph of physical space enables a wide variety of applications and technical achievements. In particular, three such achievements that will now be described are based on a physical graph that has signal segments that evidence state of physical entities, such that the physical graph has a semantic understanding of what is happening within any given signal segment. In a first implementation, portions of signal segments may be shared, with semantic understanding informing which portions of the signal segment are extracted for sharing. In a second implementation, signal segments may automatically be narrated using semantic understanding of what is happening within a signal segment. In a third implementation, actors may be trained in a just-in-time fashion by providing representations of signal segments when the actor is to begin or has begun an activity.
  • FIG. 8 illustrates a flowchart of a method 800 for sharing at least a portion of a signal segment. The signal segment might be, for instance, multiple signal segments that have captured the same physical entity. For instance, if the signal segment is a video signal segment, multiple video segments may have captured the same physical entity or entities from different perspectives and distances. If the signal segment is an audio signal segment, multiple audio segments may have captured the selected physical entity or entities with different acoustic channels intervening between corresponding acoustic sensors and the selected physical entity or entities (or portions thereof). The signal segment(s) being shared may be a live signal segment that is capturing signals live from one or more physical entities within a location. Alternatively, the signal segment(s) being shared may be a recorded signal segment.
  • In accordance with the method 800, the system detects selection of one or more physical entities or portions thereof that is/are rendered within one or more signal segments (act 801). Thus, sharing may be initiated based on the semantic content of a signal segment. For instance, the selected physical entity or entities (or portion(s) thereof) may be the target of work or a source of work. As an example, the user might select a target of work such as a physical whiteboard. Another example target of work might be a piece of equipment that is being repaired. Examples of sources of work might be, for instance, a person writing on a physical whiteboard, a dancer, a magician, a construction worker, and so forth.
  • The individual that selected the physical entity or entities (or portions thereof) for sharing may be a human being. In that case, the user might select the physical entity or entities (or portions thereof) in any manner intuitive to a human user. Examples of such input include gestures. For instance, the user might circle an area that encompasses the physical entity or entities (or portions thereof) within a portion of a video or image signal segment.
  • Alternatively, the selection may be made by a system. For instance, the system might select that the portion of the signal segments that includes a particular physical entity or entities (or portions thereof) be shared upon detection of a particular condition, and/or in accordance with policy. For instance, as described below with respect to FIG. 10, the system might detect that a human actor is about to engage in a particular activity that requires training. The system might then select signal segments that include a physical entity or entities similar to a target of the activity, or that include an individual as that individual previously performed the activity, to share with the human actor. A narration of the activity may even be automatically generated and provided (as described with respect to FIG. 9).
  • The system then extracts portion(s) of the signal segment(s) in which the selected physical entity or selected portion of the physical entity is rendered (act 802). For instance, the signal segment might be multiple video signal segments. The system might create a signal segment in which the point of view changes from one signal segment (generated by one sensor) to another signal segment (generated by another sensor) upon the occurrence of condition(s) that occur with respect to the selected physical entity or entities (or the selected portion thereof). For instance, suppose the selected physical entity is those portions of the whiteboard that an instructor is currently writing on. If the instructor's body were to obscure his own writing from the perspective of one sensor, another signal segment that captures the active portion of the whiteboard may be switched to automatically. The system may perform such switching (of live signal segments) or stitching (of recorded video segments) automatically.
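The switching/stitching behavior described above can be sketched as choosing, frame by frame, the signal segment whose view of the selected physical entity is least obscured. The camera names and occlusion scores below are hypothetical; a real system would derive them from the physical graph's semantic understanding of each frame.

```python
from typing import Dict, List


def stitch_best_views(frames: List[Dict[str, float]]) -> List[str]:
    """For each frame, pick the camera whose view of the selected entity is least obscured.

    Each frame maps camera id -> fraction of the selected entity that is obscured (0.0 to 1.0).
    """
    stitched = []
    for occlusion_by_camera in frames:
        best_camera = min(occlusion_by_camera, key=occlusion_by_camera.get)
        stitched.append(best_camera)
    return stitched


# The instructor gradually blocks camera-1's view of the active whiteboard region.
frames = [
    {"camera-1": 0.0, "camera-2": 0.3},
    {"camera-1": 0.2, "camera-2": 0.3},
    {"camera-1": 0.8, "camera-2": 0.3},   # camera-1 now mostly obscured: switch
]
print(stitch_best_views(frames))   # ['camera-1', 'camera-1', 'camera-2']
```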
  • The system then dispatches a representation of the signal segment(s) that encompasses the selected physical entity or entities (or portions thereof) to one or more recipients (act 803). Such recipients may be human beings, components, robotics, or any other entity capable of using the shared signal segment portion(s).
  • In one embodiment, the signal segments represent a portion of a physical graph that includes representations of physical entities sensed within a physical space, along with signal segments that evidence state of the physical entities. An example of such a physical graph has been described above with respect to FIGS. 2 through 4 with respect to the computer-navigable graph 400. The system could also dispatch a portion of the physical graph that relates to the signal segment portion(s) that are shared, and/or perhaps may extract information from that corresponding portion of the physical graph to share along with (or as an alternative to) the sharing of the signal segment portion(s) themselves.
  • As previously mentioned, the representation of the shared portion could be an automatically generated narration. The semantic understanding of what physical activity and entities are being depicted in a signal segment also allows for automatic generation of a narration of the signal segment (whether live narration, or narration of a recorded video segment). FIG. 9 illustrates a flowchart of a method 900 for automatically generating a narration of what is happening in a signal segment. As an example, automatic narration of a chess game, a football game, or the like, may be performed. Automatic narration of a work activity may also be performed.
  • To generate an automatic narration of a signal segment, the signal segment is accessed from the physical graph (act 901). Using the semantic understanding of what is depicted in the signal segment, the system then determines how physical entity or entities are acting in the signal segment (act 902). Again, such a semantic understanding may be gained from the physical graph portions that correspond to the signal segment.
  • Not everything rendered in a signal segment will be of relevance for narration. The system might determine what actions are salient taking into consideration any number of factors (or a balance thereof) including, for instance, whether an action is happening repeatedly, what has not changed, what is occurring constantly, what portion(s) of the signal segment have been shared, user instruction, and so forth. There could also be some training of the system and machine learning to know what actions are potentially relevant.
  • The system then automatically generates a narration of the actions in the signal segment using the determined actions of the one or more physical entities (act 903). As an example, if the signal segment were a video signal segment, the generated narration might be one or more of an audio narration, a diagrammatic narration, or the like. In the audio narration, what is happening in the video signal segment might be spoken. In a diagrammatic narration, a diagram of what is happening in the video signal segment may be rendered with irrelevant material removed, and with relevant physical entities being visually emphasized. The relevant physical entities might be represented in simplified (perhaps cartoonish) form, with movement and actions potentially visually represented (e.g., by arrows).
  • If the signal segment were an audio signal segment, the generated narration might include a summarized audio narration that includes a simplified audio of the relevant matters that are heard within the signal segment. The generated narration might also be a diagrammatic narration. The narration might also show or describe to an intended recipient what to do with respect to the intended recipient's environment, since the physical graph might also have semantic understanding of the surroundings of the intended recipient.
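To make the narration flow of method 900 concrete, here is a hypothetical sketch that filters a list of semantically labeled actions by one simple salience rule (ignore an action that merely repeats the previous one) and turns the remainder into spoken-style text. The Action records and the salience heuristic are assumptions for illustration; the description above leaves both the representation and the salience factors open.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Action:
    t: float     # seconds into the segment
    actor: str
    verb: str
    target: str


def salient_actions(actions: List[Action]) -> List[Action]:
    """Keep an action only when it differs from the immediately preceding one (act 902)."""
    kept, previous = [], None
    for action in actions:
        if previous is None or (action.actor, action.verb, action.target) != previous:
            kept.append(action)
        previous = (action.actor, action.verb, action.target)
    return kept


def narrate(actions: List[Action]) -> str:
    """Generate a simple textual narration of the salient actions (act 903)."""
    lines = [f"At {a.t:.0f}s, {a.actor} {a.verb} {a.target}." for a in salient_actions(actions)]
    return " ".join(lines)


segment_actions = [
    Action(2, "the instructor", "writes on", "the whiteboard"),
    Action(4, "the instructor", "writes on", "the whiteboard"),   # repeated, not narrated again
    Action(9, "the instructor", "points at", "the diagram"),
]
print(narrate(segment_actions))
# At 2s, the instructor writes on the whiteboard. At 9s, the instructor points at the diagram.
```

The textual output could then be fed to a text-to-speech component for an audio narration, or mapped to simplified graphics for a diagrammatic narration, consistent with the options described above.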
  • FIG. 10 illustrates a flowchart of a method 1000 for automatically training an actor upon the occurrence of a condition with respect to that actor. The method 1000 is initiated upon detecting that the condition is met with respect to a human actor who is engaging in or is about to engage in a physical activity (act 1001). For instance, the condition might be that the actor is performing or is about to perform a physical activity, or that the actor has a certain physical status (e.g., presence within a room). In response, the system evaluates whether or not training is to be provided to the human actor for that physical activity (act 1002). For instance, the determination might be based on some training policy. As an example, training might be mandatory at least once for all employees of an organization with respect to that activity, and might be mandated to occur before the activity is performed. Training might be required on an annual basis. Training might be required depending on who the actor is (e.g., a new hire, or a safety officer).
  • Training might be offered when the system determines that the human actor is engaging in an activity improperly (e.g., failing to bend her knees when lifting a heavy object). The training may be tailored to how the human actor is performing the activity improperly. The training may be offered repeatedly depending on a learning rate of the human actor.
  • Training is then dispatched to the actor (act 1003). For instance, a robot or human may be dispatched to the actor to show the actor how to perform an activity. Alternatively or in addition, a representation of a signal segment may then be dispatched to the actor (act 1003), where the representation provides training to the human actor. For instance, the representation might be a narration (such as that automatically generated by the method 900 of FIG. 9). The shared representation might be a multi-sensor signal segment—such as a stitched video signal in which the point of view strategically changes upon the occurrence of one or more conditions with respect to a selected physical entity or selected portion of the physical entity, as described above with respect to FIG. 8.
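A hypothetical sketch of the decision in acts 1001 through 1003: when an actor is detected starting an activity, a training policy decides whether to dispatch a training representation (for example, a narration generated as in FIG. 9). The policy fields, the history record, and the dispatch stub are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class TrainingPolicy:
    mandatory_once: Set[str]   # activities that require training before the first attempt
    annual: Set[str]           # activities that require refresher training each year


def needs_training(policy: TrainingPolicy, activity: str,
                   history: Dict[str, float], now: float) -> bool:
    """Act 1002: decide whether this actor should receive training for this activity."""
    last_trained = history.get(activity)
    if activity in policy.mandatory_once and last_trained is None:
        return True
    if activity in policy.annual and (last_trained is None or now - last_trained > 365 * 24 * 3600):
        return True
    return False


def on_activity_detected(actor: str, activity: str, policy: TrainingPolicy,
                         history: Dict[str, float], now: float) -> None:
    """Act 1001 has fired; dispatch a training representation if needed (act 1003)."""
    if needs_training(policy, activity, history, now):
        print(f"Dispatching training narration for '{activity}' to {actor}")
        history[activity] = now
    else:
        print(f"No training needed for {actor} before '{activity}'")


policy = TrainingPolicy(mandatory_once={"operate forklift"}, annual={"ladder safety"})
on_activity_detected("new hire", "operate forklift", policy, history={}, now=1_700_000_000.0)
```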
  • Accordingly, the principles described herein use an ambient computing environment, and the semantic understanding of what is happening in the real world, in order to provide significant technical advancements. The present invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described embodiments are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.

Claims (20)

What is claimed:
1. A computing system comprising:
one or more processors;
one or more computer-readable media having thereon computer-executable instructions that are structured such that, when executed by the one or more processors, cause the computing system to perform a method for automatically generating a narration of what has happened in a physical space, the method comprising:
accessing a sensed feature store, the sensed feature store comprising a plurality of signal segments, each signal segment comprising signals detected by sensors from one or more physical entities located within the physical space, each signal segment associated with a time and a location when the signals of the physical entities were detected, the sensed feature store also storing a plurality of sensed features associated with the one or more physical entities;
accessing a signal segment from the sensed feature store associated with a particular one or more physical entities and a particular one or more sensed features of the particular physical entities;
determining a semantic understanding of what is depicted in the accessed signal segment based on the particular one or more physical entities, the particular one or more sensed features of the particular physical entities, and the associated times and locations in which the signal segment evidences state of the particular one or more of the physical entities, determining the semantic understanding including determining actions that happen in the signal segment;
determining how the particular one or more physical entities within the signal segment are acting in the signal segment using the determined semantic understanding of the signal segment; and
generating a narration of the actions in the signal segment using the determined semantic understanding, the particular one or more sensed features, and determined actions of the particular one or more physical entities.
2. The computing system in accordance with claim 1, the signal segment comprising a video signal segment.
3. The computing system in accordance with claim 2, the generated narration of the one or more physical entities comprising an audio narration.
4. The computing system in accordance with claim 2, the generated narration of the one or more physical entities also comprising a diagrammatic narration.
5. The computing system in accordance with claim 1, the signal segment comprising an audio signal segment.
6. The computing system in accordance with claim 5, the generated narration comprising a video narration.
7. The computing system in accordance with claim 5, the generated narration comprising a diagrammatic narration.
8. The computing system in accordance with claim 1, the generated narration showing an intended recipient what to do with respect to a detected environment of the intended recipient.
9. The computing system in accordance with claim 1, the determining of how the one or more physical entities within the signal segment are acting is performed through analysis of saliency.
10. The computing system in accordance with claim 9, the analysis of saliency taking into consideration whether an action is happening repeatedly.
11. The computing system in accordance with claim 9, the analysis of saliency taking into consideration what has not changed.
12. The computing system in accordance with claim 9, the analysis of saliency taking into consideration what is occurring constantly.
13. The computing system in accordance with claim 9, the analysis of saliency using machine learning.
14. The computing system in accordance with claim 9, the analysis of saliency including an interpretation of a user instruction.
15. The computing system in accordance with claim 9, the analysis of saliency including an analysis of what portion or portions of the signal segment have been shared.
16. The computing system in accordance with claim 1, the signal segment comprising a live signal segment.
17. The computing system in accordance with claim 1, the signal segment comprising a recorded signal segment.
18. The computing system in accordance with claim 1, the method being initiated by a detection that a human being is to be trained by narration of work represented in the signal segment.
19. A method for automatically generating a narration of what is happening in a signal segment, the method comprising:
accessing a sensed feature store, the sensed feature store comprising a plurality of signal segments, each signal segment comprising signals detected by sensors from one or more physical entities located within the physical space, each signal segment associated with a time and a location when the signals of the physical entities were detected, the sensed feature store also storing a plurality of sensed features associated with the one or more physical entities;
accessing a signal segment from the sensed feature store associated with a particular one or more physical entities and a particular one or more sensed features of the particular physical entities;
determining a semantic understanding of what is depicted in the accessed signal segment based on the particular one or more physical entities, the particular one or more sensed features of the particular physical entities, and the associated times and locations in which the signal segment evidences state of the particular one or more of the physical entities, determining the semantic understanding including determining actions that happen in the signal segment;
determining how the particular one or more physical entities within the signal segment are acting in the signal segment using the determined semantic understanding of the signal segment; and
generating a narration of the actions in the signal segment using the determined semantic understanding, the particular one or more sensed features, and determined actions of the particular one or more physical entities.
20. A computer program product comprising one or more computer-readable storage media having thereon computer-executable instructions that are structured such that, when executed by the one or more processors, cause the computing system to perform a method for automatically generating a narration of what is happening in a signal segment, the method comprising:
accessing a sensed feature store, the sensed feature store comprising a plurality of signal segments, each signal segment comprising signals detected by sensors from one or more physical entities located within the physical space, each signal segment associated with a time and a location when the signals of the physical entities were detected, the sensed feature store also storing a plurality of sensed features associated with the one or more physical entities;
accessing a signal segment from the sensed feature store associated with a particular one or more physical entities and a particular one or more sensed features of the particular physical entities;
determining a semantic understanding of what is depicted in the accessed signal segment based on the particular one or more physical entities, the particular one or more sensed features of the particular physical entities, and the associated times and locations in which the signal segment evidences state of the particular one or more of the physical entities, determining the semantic understanding including determining actions that happen in the signal segment;
determining how the particular one or more physical entities within the signal segment are acting in the signal segment using the determined semantic understanding of the signal segment; and
generating a narration of the actions in the signal segment using the determined semantic understanding, the particular one or more sensed features, and determined actions of the particular one or more physical entities.
US16/895,725 2017-01-18 2020-06-08 Automatic narration of signal segment Abandoned US20200302970A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/895,725 US20200302970A1 (en) 2017-01-18 2020-06-08 Automatic narration of signal segment

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201762447821P 2017-01-18 2017-01-18
US15/436,655 US10679669B2 (en) 2017-01-18 2017-02-17 Automatic narration of signal segment
US16/895,725 US20200302970A1 (en) 2017-01-18 2020-06-08 Automatic narration of signal segment

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US15/436,655 Continuation US10679669B2 (en) 2017-01-18 2017-02-17 Automatic narration of signal segment

Publications (1)

Publication Number Publication Date
US20200302970A1 true US20200302970A1 (en) 2020-09-24

Family

ID=62841394

Family Applications (2)

Application Number Title Priority Date Filing Date
US15/436,655 Active 2037-03-31 US10679669B2 (en) 2017-01-18 2017-02-17 Automatic narration of signal segment
US16/895,725 Abandoned US20200302970A1 (en) 2017-01-18 2020-06-08 Automatic narration of signal segment

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US15/436,655 Active 2037-03-31 US10679669B2 (en) 2017-01-18 2017-02-17 Automatic narration of signal segment

Country Status (4)

Country Link
US (2) US10679669B2 (en)
EP (1) EP3571634A1 (en)
CN (1) CN110192202A (en)
WO (1) WO2018136315A1 (en)

US8825377B2 (en) 2012-10-19 2014-09-02 Microsoft Corporation Mobile navigation to a moving destination
US20140136186A1 (en) 2012-11-15 2014-05-15 Consorzio Nazionale Interuniversitario Per Le Telecomunicazioni Method and system for generating an alternative audible, visual and/or textual data based upon an original audible, visual and/or textual data
US20140233919A1 (en) 2013-02-18 2014-08-21 Treasured Stores, LLC Memory Preservation and Life Story Generation System and Method
US9900171B2 (en) 2013-02-25 2018-02-20 Qualcomm Incorporated Methods to discover, configure, and leverage relationships in internet of things (IoT) networks
US9911361B2 (en) * 2013-03-10 2018-03-06 OrCam Technologies, Ltd. Apparatus and method for analyzing images
US9317813B2 (en) 2013-03-15 2016-04-19 Apple Inc. Mobile device with predictive routing engine
US20160098941A1 (en) * 2013-05-21 2016-04-07 Double Blue Sports Analytics, Inc. Methods and apparatus for goaltending applications including collecting performance metrics, video and sensor analysis
US20140375425A1 (en) 2013-06-24 2014-12-25 Infosys Limited Methods for dynamically sending alerts to users and devices thereof
US10447554B2 (en) 2013-06-26 2019-10-15 Qualcomm Incorporated User presence based control of remote communication with Internet of Things (IoT) devices
US9871865B2 (en) 2013-07-11 2018-01-16 Neura, Inc. Physical environment profiling through internet of things integration platform
US9159371B2 (en) 2013-08-14 2015-10-13 Digital Ally, Inc. Forensic video recording with presence detection
EP3047662B1 (en) 2013-09-20 2019-11-06 Convida Wireless, LLC Method of joint registration and de-registration for proximity services and internet of things services
US8812960B1 (en) 2013-10-07 2014-08-19 Palantir Technologies Inc. Cohort-based presentation of user interaction data
US20150100979A1 (en) 2013-10-07 2015-04-09 Smrtv, Inc. System and method for creating contextual messages for videos
US10589150B2 (en) 2013-11-08 2020-03-17 Performance Lab Technologies Limited Automated prescription of activity based on physical activity data
US9438647B2 (en) 2013-11-14 2016-09-06 At&T Intellectual Property I, L.P. Method and apparatus for distributing content
US9709413B2 (en) 2013-12-12 2017-07-18 Cellco Partnership Directions based on predicted future travel conditions
US9135347B2 (en) 2013-12-18 2015-09-15 Assess2Perform, LLC Exercise tracking and analysis systems and related methods of use
US10491936B2 (en) 2013-12-18 2019-11-26 Pelco, Inc. Sharing video in a cloud video service
US20150219458A1 (en) 2014-01-31 2015-08-06 Aruba Networks Inc. Navigating to a moving target
US20150278263A1 (en) 2014-03-25 2015-10-01 Brian Bowles Activity environment and data system for user activity processing
US20150279226A1 (en) 2014-03-27 2015-10-01 MyCognition Limited Adaptive cognitive skills assessment and training
US9451335B2 (en) * 2014-04-29 2016-09-20 At&T Intellectual Property I, Lp Method and apparatus for augmenting media content
DE202014103729U1 (en) 2014-08-08 2014-09-09 Leap Motion, Inc. Augmented reality with motion detection
US10345767B2 (en) 2014-08-19 2019-07-09 Samsung Electronics Co., Ltd. Apparatus and method for gamification of sensor data interpretation in smart home
US20160073482A1 (en) 2014-09-05 2016-03-10 Qualcomm Incorporated Implementing a target lighting scene in an internet of things environment using a mobile light output device
US10180974B2 (en) * 2014-09-16 2019-01-15 International Business Machines Corporation System and method for generating content corresponding to an event
JP6065889B2 (en) 2014-09-18 2017-01-25 株式会社デンソー Driving assistance device
US10356649B2 (en) 2014-09-26 2019-07-16 Intel Corporation Multisensory change detection for internet of things domain
WO2016087311A2 (en) * 2014-12-01 2016-06-09 Schott Ag Electrical storage system comprising a sheet-type discrete element, discrete sheet-type element, method for the production thereof and use thereof
US10816944B2 (en) 2015-01-06 2020-10-27 Afero, Inc. System and method for using data collected from internet-of-things (IoT) sensors to disable IoT-enabled home devices
US20170032823A1 (en) * 2015-01-15 2017-02-02 Magisto Ltd. System and method for automatic video editing with narration
US9576460B2 (en) * 2015-01-21 2017-02-21 Toyota Motor Engineering & Manufacturing North America, Inc. Wearable smart device for hazard detection and warning based on image and audio data
CN105989683A (en) * 2015-02-12 2016-10-05 贝斯科技国际有限公司 Enhanced residence security system
US9824574B2 (en) * 2015-09-21 2017-11-21 Tyco Fire & Security Gmbh Contextual fire detection and alarm verification method and system
US9984100B2 (en) * 2015-09-29 2018-05-29 International Business Machines Corporation Modification of images and associated text
TWI553494B (en) * 2015-11-04 2016-10-11 創意引晴股份有限公司 Multi-modal fusion based Intelligent fault-tolerant video content recognition system and recognition method
WO2017112813A1 (en) * 2015-12-22 2017-06-29 Sri International Multi-lingual virtual personal assistant
US10032081B2 (en) * 2016-02-09 2018-07-24 Oath Inc. Content-based video representation
US9792821B1 (en) * 2016-03-25 2017-10-17 Toyota Jidosha Kabushiki Kaisha Understanding road scene situation and semantic representation of road scene situation for reliable sharing
US11082754B2 (en) * 2016-08-18 2021-08-03 Sony Corporation Method and system to generate one or more multi-dimensional videos
US20180096632A1 (en) * 2016-09-30 2018-04-05 Omar U. Florez Technology to provide visual context to the visually impaired
US20180143230A1 (en) * 2016-11-22 2018-05-24 Google Inc. System and method for parallel power monitoring
US10720182B2 (en) * 2017-03-02 2020-07-21 Ricoh Company, Ltd. Decomposition of a video stream into salient fragments
US10546197B2 (en) * 2017-09-26 2020-01-28 Ambient AI, Inc. Systems and methods for intelligent and interpretive analysis of video image data using machine learning

Also Published As

Publication number Publication date
CN110192202A (en) 2019-08-30
WO2018136315A1 (en) 2018-07-26
US20180204596A1 (en) 2018-07-19
EP3571634A1 (en) 2019-11-27
US10679669B2 (en) 2020-06-09

Similar Documents

Publication Title
US20180202819A1 (en) Automatic routing to event endpoints
US11410672B2 (en) Organization of signal segments supporting sensed features
US20180204108A1 (en) Automated activity-time training
US11070504B2 (en) Communication routing based on physical status
US10606814B2 (en) Computer-aided tracking of physical entities
US20200302970A1 (en) Automatic narration of signal segment
US11094212B2 (en) Sharing signal segments of physical graph
US10635981B2 (en) Automated movement orchestration
US20180204096A1 (en) Taking action upon physical condition
US20180203885A1 (en) Controlling creation/access of physically sensed features
US20180203886A1 (en) Cleansing of computer-navigable physical feature graph
US20180203881A1 (en) Taking action based on physical graph

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MITAL, VIJAY;REEL/FRAME:052868/0672

Effective date: 20170131

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION