WO2018009460A2 - System for collecting and extracting information from virtual environment models - Google Patents


Info

Publication number
WO2018009460A2
Authority
WO
WIPO (PCT)
Prior art keywords
user, location, advertisement, data, reconstruction
Application number
PCT/US2017/040458
Other languages
French (fr)
Other versions
WO2018009460A3 (en)
Inventor
Tatu V. J. HARVIAINEN
Original Assignee
Pcms Holdings, Inc.
Application filed by Pcms Holdings, Inc. filed Critical Pcms Holdings, Inc.
Publication of WO2018009460A2 publication Critical patent/WO2018009460A2/en
Publication of WO2018009460A3 publication Critical patent/WO2018009460A3/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0251Targeted advertisements
    • G06Q30/0261Targeted advertisements based on user location

Definitions

  • This disclosure relates to systems and methods for augmented or virtual reality environments. More specifically, this disclosure relates to systems and methods for information extraction based on sensor data from users of augmented or virtual reality environments.
  • Augmented reality (AR) and virtual reality (VR) head mounted displays intended for wide-scale consumer use will enable collection of very detailed information about the physical environments where users utilize these devices.
  • Some such devices have integrated RGB-D sensors capable of real-time 3D reconstruction of the physical environment.
  • Similar add-on RGB-D sensor and camera solutions have been demonstrated for VR HMDs; these capture 3D data of the user's environment, and by doing so enable physical navigation and adjustment of the virtual experience to the physical environment.
  • Devices such as Google Tango are the first generation of mobile devices embedded with RGB-D sensors capable of reconstructing detailed 3D models of the environments in which they are used.
  • Image data captured by regular 2D cameras can also be used to reconstruct 3D models of user environments.
  • Microsoft published results of an implementation allowing mobile devices to perform 3D reconstruction locally from 2D image data acquired by the device's camera (see Ondruska et al., "MobileFusion: Real-time Volumetric Surface Reconstruction and Dense Tracking on Mobile Phones", IEEE Transactions on Visualization and Computer Graphics, no. 99, pp. 1-1, doi: 10.1109/TVCG.2015.2459902).
  • In map services developed by companies working on location services, such as Google, Microsoft, Nokia, etc., photo-based 3D reconstruction has been used for generating building models to enable 3D views of the map data.
  • 3D reconstructions produced by 2D/3D image sensors embedded in AR/VR HMDs have generally only been used locally, during the run-time operation of applications, to adjust virtual content to the physical environment or to enable interaction with the virtual content.
  • Social media services are currently performing user profiling based on user created content, e.g., text, photographs, and video clips provided by the user, as well as location and internet use statistics, such as GPS and web navigation histories collected from the user. While it is already possible to collect information quite efficiently with the previously mentioned data sources, there are opportunities for collecting additional information that can be used to provide greater services to users.
  • Systems and methods of the present disclosure provide efficient collection and extraction of information about users, environments associated with users, and contexts of use. Such extracted information may enable various AR & VR experiences and services, such as identification of situations where a user may need assistance via augmentation, matching of virtual experiences with physical environments, virtual product placement and advertisement, and the like.
  • A method of targeted advertising comprising: receiving gathered data, from a head mounted display (HMD) of a first user, indicating the presence of a first object at a first-user-controlled first location; responsive to detecting that the first user is at a second location: selecting a targeted advertisement for a second product based on the HMD-gathered data; and displaying, via the HMD, the selected targeted advertisement for the second product to the first user while the first user is at the second location.
  • A method of extracting information from the setup of an immersive experience comprising: receiving information derived from sensor data of a head-mounted display camera of a user device of a first user at a first location; generating a first 3D reconstruction of the first location based at least in part on the received information derived from sensor data of the head-mounted display camera of the first user at the first location; comparing the first 3D reconstruction of the first location to a second 3D reconstruction of the first location, wherein the second 3D reconstruction was previously captured; classifying the first 3D reconstruction of the first location based at least in part upon the reconstruction geometry; identifying at least one object in the first 3D reconstruction of the first location based at least in part upon the received information derived from the sensor data of the head-mounted display camera of the first user; estimating at least one user characteristic at least in part from the received information; and providing a tailored immersive experience based at least in part on the first and second 3D reconstructions of the first location and based at least in part on the at least one estimated user characteristic.
  • FIG. 1 illustrates an overview of one embodiment of an architecture for collecting, extracting, and utilizing data from 3D environments.
  • FIG. 2 illustrates one embodiment of the process of 3D reconstruction and storage of 3D reconstructions.
  • FIG. 3 illustrates one embodiment of data processing performed by the service in order to extract user and environment information.
  • FIG. 4 illustrates one embodiment of a process performed by the service to extract information from data received from a user.
  • FIG. 5 illustrates one embodiment of targeted advertising based on 3D reconstructions and user information extraction.
  • FIG. 6 illustrates one embodiment of a process performed by the service to extract information from data received from a user and present a targeted advertisement.
  • FIG. 7A illustrates one embodiment of a process performed by the service to extract information from data received from a user and present a targeted advertisement.
  • FIG. 7B illustrates another embodiment of a process performed by the service to extract information from data received from a user and present a targeted advertisement.
  • FIG. 8 illustrates an exemplary wireless transmit/receive unit (WTRU) that may be employed as a module of a head mounted display in some embodiments.
  • FIG. 9 illustrates an exemplary network entity that may be employed in some embodiments.
  • Various hardware elements of one or more of the described embodiments are referred to as "modules" that carry out (i.e., perform, execute, and the like) the various functions described herein in connection with the respective modules.
  • A module includes hardware (e.g., one or more processors, one or more microprocessors, one or more microcontrollers, one or more microchips, one or more application-specific integrated circuits (ASICs), one or more field programmable gate arrays (FPGAs), one or more memory devices) deemed suitable by those of skill in the relevant art for a given implementation.
  • Each described module may also include instructions executable for carrying out the one or more functions described as being carried out by the respective module. It is noted that those instructions could take the form of or include hardware (i.e., hardwired) instructions, firmware instructions, software instructions, and/or the like, and may be stored in any suitable non-transitory computer-readable medium or media, such as those commonly referred to as RAM, ROM, etc.
  • 3D reconstructions produced by AR/VR HMDs may be collected during participation in AR/VR experiences and stored for later reference.
  • For a social media service such as Facebook, per-user collection of environment models may be easy to arrange, and may enable collection of very detailed information about users and their associated environments.
  • A user provides RGB-D data to the on-line information collection and extraction service (the "service"), and the service creates, collects, and classifies stored 3D reconstructions and performs further processing of the data in order to extract user-specific and contextual information.
  • the service may provide the extracted information to or for third party services that can use the information to tailor their service per user per context.
  • the device sensors 150 of a user's device may communicate sensor data 160 to a client 152 (e.g., an application on the user's HMD, etc.).
  • the client 152 may communicate the sensor data and additional context data 162 from the client to the service 154.
  • the service 154 may perform 164 various functions, such as: creating a virtual environment model based on 2D/3D image data; classifying the environment model based on received sensor and context data; segmenting and detecting objects from the virtual environment model; extracting per user and per environment information from the collected data; and/or the like.
  • a third-party service 156 may communicate a data request 166 to the service 154, responsive to which the service 154 may provide requested per user data 168, and/or the like.
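  • As a rough sketch of this FIG. 1 flow, the following Python outline mirrors the client-to-service-to-third-party path. All names (SensorFrame, Client, Service, etc.) are hypothetical, and the three processing stages are stubs standing in for the techniques detailed later in this disclosure.

```python
from dataclasses import dataclass, field

@dataclass
class SensorFrame:
    """One batch of device data (hypothetical shape)."""
    rgbd: bytes                                   # 2D/3D image data
    context: dict = field(default_factory=dict)   # GPS, cell ID, apps, ...

class Service:
    """Service 154: reconstruct, classify, segment, extract (step 164)."""
    def __init__(self):
        self.per_user: dict = {}

    def ingest(self, frame: SensorFrame):
        model = self.reconstruct(frame.rgbd)         # environment model
        label = self.classify(model, frame.context)  # e.g. "kitchen"
        objects = self.segment(model)                # detected objects
        user = frame.context.get("user_id", "unknown")
        self.per_user.setdefault(user, {}).setdefault(label, []).extend(objects)

    def query(self, user_id: str) -> dict:
        """Data request 166 from a third party -> per-user data 168."""
        return self.per_user.get(user_id, {})

    # Stub stages; concrete approaches are described below.
    def reconstruct(self, rgbd): return rgbd
    def classify(self, model, ctx): return ctx.get("location", "unknown")
    def segment(self, model): return []

class Client:
    """Client 152: forwards sensor data 160 plus context 162 to the service."""
    def __init__(self, service: Service, user_id: str):
        self.service, self.user_id = service, user_id

    def on_sensor_data(self, frame: SensorFrame):
        frame.context["user_id"] = self.user_id
        self.service.ingest(frame)
```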
  • a process performed by the service to extract information from data received from a user may be generally described by the following steps.
  • Classify the 3D reconstruction based on reconstruction geometry and other received sensor and context data (420).
  • The service collects data from users, processes the data, and then provides the data for use by third parties.
  • Each of the steps involved in the processes is set forth in more detail below. The description of the processes is broken down into three areas: data collection, data processing, and use of collected data.
  • FIG. 2 illustrates one embodiment of the process of 3D reconstruction and storage of 3D reconstructions.
  • a client 205 executed on a user's device may initialize data collection by requesting data from device sensors.
  • The client 205 in this case may be any application that uses the sensor data in its own operation while also streaming it to a server, a dedicated application for data collection, or functionality embedded in the operating system or sensor drivers that autonomously streams data to the service without user initiation.
  • a client application streams sensor data and other context data 235 collected from the user's device to the service 208, such as to a service's receiver module 210.
  • Device sensors that provide data for 3D reconstruction include various 2D and/or 3D image sensors such as cameras, camera arrays, RGB-D sensors, and/or the like.
  • any or all other potentially available sensor data may be sent to the service and may assist in context recognition and data classification. Examples of other sensors may include, but are not limited to, GPS, accelerometer, digital compass, gyroscope, and the like.
  • additional data collected from the device which may help in context recognition or in the classification may be communicated to the service, such as cell ID, current browser history, active applications, and the like.
  • image data 240 may be passed to a 3D reconstruction module 215, which may perform 3D reconstruction 245.
  • the receiver module 210 may communicate any location and/or context data 250 (which may include GPS, cell ID, browser history, etc.) to a data manager module 220 of the service. In some embodiments, this information may be passed 255 to a context recognition module 225 of the service, which may analyze 260 the data to detect the user's current activity and/or location, which may then be communicated back 265 to the data manager 220.
  • the data manager 220 may use the detected activity and/or location to retrieve 270, from a database 230 or the like, earlier 3D reconstructions potentially matching the environment the user is currently in, or that were captured from close proximity thereto.
  • the service compares 285 the new (e.g., current) 3D reconstruction 280 with the pre-existing ones 275, in order to detect if the current environment was previously captured, and to align a new 3D reconstruction with an old reconstruction such that any differences can be detected, and missing or changed areas updated within the database.
  • the data manager 220 may request 290 that a new location be created in the database. For either new locations or existing ones, based on the new 3D reconstruction, the database 230 may be updated 295.
  • A first step may be to find the best matching alignment between the old and new reconstructions. Once the optimal matching alignment is found, the deviations between reconstructions are examined by the service to decide whether the old and new reconstructions represent the same environment or are from different locations.
  • The alignment of the 3D reconstructions may be performed with an iterative closest point (ICP) algorithm, or any other more advanced point cloud / 3D reconstruction registration method, as described in publications such as Shiratori et al., "Efficient Large-Scale Point Cloud Registration Using Loop Closures", 2015 International Conference on 3D Vision (3DV), IEEE, 2015, pp. 232-240, and Eckart et al., "MLMD: Maximum Likelihood Mixture Decoupling for Fast and Accurate Point Cloud Registration", 2015 International Conference on 3D Vision (3DV), IEEE, 2015, pp. 241-249.
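  • As a concrete illustration, the sketch below aligns a new reconstruction to a stored one with Open3D's point-to-plane ICP; the voxel size and distance thresholds are illustrative assumptions, and any of the registration methods cited above could be substituted.

```python
import numpy as np
import open3d as o3d

def align_reconstructions(new_pcd: o3d.geometry.PointCloud,
                          old_pcd: o3d.geometry.PointCloud,
                          voxel: float = 0.05):
    """Align a new 3D reconstruction to a stored one using ICP."""
    # Downsample for speed; estimate normals (required by point-to-plane ICP).
    src = new_pcd.voxel_down_sample(voxel)
    dst = old_pcd.voxel_down_sample(voxel)
    for pcd in (src, dst):
        pcd.estimate_normals(
            o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 2, max_nn=30))

    result = o3d.pipelines.registration.registration_icp(
        src, dst,
        max_correspondence_distance=voxel * 4,
        init=np.eye(4),
        estimation_method=o3d.pipelines.registration
            .TransformationEstimationPointToPlane())
    # fitness (overlap ratio) and inlier_rmse feed the same-environment
    # decision discussed below.
    return result.transformation, result.fitness, result.inlier_rmse
```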
  • the reconstructions can be compared.
  • One approach for comparison is direct comparison of the cumulative deviation of data points, using a threshold value to determine whether the environment is regarded as new or old.
  • Another, more advanced approach may first segment both 3D reconstructions to separate static elements such as floors, ceilings, and walls from dynamic elements such as pieces of furniture and random objects. Then, an amount of deviation between the reconstructions can be weighted so that static elements are required to match more accurately than dynamic elements in order for the algorithm to consider 3D reconstructions to be from the same environment.
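  • A minimal sketch of this weighted comparison, assuming the segmentation has already split the aligned reconstructions into static and dynamic points and computed per-point deviations (all function names and thresholds are illustrative):

```python
import numpy as np

def compare_reconstructions(dev_static: np.ndarray, dev_dynamic: np.ndarray,
                            static_tol: float = 0.03,
                            dynamic_tol: float = 0.30,
                            min_static_match: float = 0.9):
    """Return (same_environment, changed_dynamic_mask).

    dev_static / dev_dynamic: per-point distances (in metres) between the
    aligned reconstructions for static elements (floors, ceilings, walls)
    and dynamic elements (furniture, loose objects), respectively.
    """
    # Static structure must match tightly for the rooms to count as the same.
    same_environment = np.mean(dev_static < static_tol) >= min_static_match
    # Large dynamic deviations flag added/removed objects; these regions are
    # recorded as feature-vector elements rather than treated as a new room.
    changed_dynamic_mask = dev_dynamic > dynamic_tol
    return same_environment, changed_dynamic_mask
```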
  • a comparison result may be used to avoid redundant copies of the same environment being stored in the database. Furthermore, comparison may help to align several 3D reconstructions with partial overlap together, thus resulting in a larger area being continuously covered. Comparison of 3D reconstructions can also help in further context recognition.
  • The service may compare 3D reconstructions associated with different users at the same location. For example, based on location information the service may detect that, for a particular location where the new 3D reconstruction is captured, a different user has already provided a matching 3D reconstruction. These 3D reconstructions can then be compared, and, if determined to represent the same location, the space can be cross-referenced between the users.
  • For example, the service can infer that, for a user A who provided a new 3D reconstruction, the environment could be the living room of a friend or a relative. If additional context information is available, such as social connections from Facebook or a similar service, the relationship between the two users may be verified as correct, and very accurate labeling and cross-referencing established between the users and the environments.
  • Comparing 3D reconstructions and identification of previously existing 3D reconstructions also permits detection of deviations in the environment between different capture sessions. If the service estimates that the deviations between sessions are a result of the removal or addition of one or more dynamic objects, these areas can be marked and used in later processing steps to detect if some specific objects have been removed or added, which then can be used as an additional feature vector element in the user characteristic estimation.
  • the service may perform a process for extracting user-specific information from the raw data consisting of 3D reconstructions, sensor data, and other context information linked with the user and environment.
  • FIG. 3 illustrates one embodiment of the process of data processing performed by the service in order to extract user and environment information.
  • a first step is to classify the 3D reconstruction, such as with a classification module 302 of the service 300. This may operate on a 3D reconstruction retrieved 320 from an environment data and context info module 314 of a database 313 of the service 300.
  • One goal of the classification is to predict which type of space the 3D reconstruction represents and what the relationship is between the user and the space.
  • The type of space may be predicted using any suitable known approach for 3D reconstruction classification, such as the approach presented by Swadzba and Wachsmuth in "A detailed analysis of a new 3D spatial feature vector for indoor scene classification", Robotics and Autonomous Systems, 2014, 62.5: 646-662, among others.
  • Classification may determine the space type 322, e.g., kitchen, living room, bedroom, backyard, and the like. Based on the predicted space type, context data (e.g., location, activity, time of day, ambient sound characteristics, Bluetooth identifiers in the vicinity, and the like), and the frequency of user visits in the same space, the association between the space and the user may be predicted (e.g., a room of the user's home, a room at the home of a user's friend, a room at the user's workplace, etc.).
  • Exemplary context recognition techniques that may be used with mobile devices in embodiments described herein include those described in Hoseini-Tabatabaei et al., "A survey on smartphone-based systems for opportunistic user context recognition", ACM Computing Surveys.
  • Context recognition may operate by first transforming raw sensor data into features. Features captured from the raw sensor data are collected into feature vectors to be input into a predictive model constructed with a machine learning approach. The predictive model outputs the predicted classification for the context, or, as in the present disclosure, context features can be composited directly together with the space type prediction and used for predicting the classification of the association between the user and the space, e.g., the user's living room, the user's office, the living room of another person, a public cafe, and/or the like.
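  • For illustration, a toy version of this pipeline using scikit-learn; the feature layout, labels, and numbers are all invented for the example, and the actual choice of features and model is an implementation detail.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def context_features(lat, lon, hour_of_day, ambient_db, n_bt_devices,
                     space_type_id):
    """Raw sensor data -> fixed-length feature vector (hypothetical layout).

    The predicted space type is composited directly with the context
    features, as described above, rather than classified separately.
    """
    return np.array([lat, lon, hour_of_day, ambient_db,
                     n_bt_devices, space_type_id], dtype=float)

# Toy training set standing in for separately collected labeled samples.
X = np.stack([
    context_features(60.17, 24.94, 21, 35, 2, 1),   # evening, quiet, few BT
    context_features(60.17, 24.94, 22, 33, 2, 1),
    context_features(60.19, 24.96, 10, 55, 14, 3),  # daytime, noisy, many BT
    context_features(60.19, 24.96, 14, 58, 12, 3),
])
y = ["own_living_room", "own_living_room", "own_office", "own_office"]

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
proba = model.predict_proba(
    context_features(60.17, 24.94, 20, 34, 3, 1).reshape(1, -1))
# The winning class probability doubles as the confidence estimate exposed
# to downstream applications alongside the classification itself.
print(dict(zip(model.classes_, proba[0])))
```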
  • the classification may be improved by accumulating similar unsuccessful classification events and examining them as a time series, or by prompting the user for assistance in the classification when an automatic classification fails. If the user is asked directly to assist the classification, handling of unknown associations may be readily accomplished.
  • the time series inspection may provide further insight to the unknown associations.
  • the means and variances of different time series features can be extracted and used for another predictive model, which may also be constructed using a machine learning approach. For example, frequency, duration, and variation between occurrences, as well as detected social and/or activity contexts in the occurrences may provide additional insights into the association between the user and the space.
  • the system may attempt to identify the primary owner of the space by cross referencing the spaces linked with other users.
  • Social connection information from services such as LinkedIn can be used to improve and/or limit the cross-referencing search.
  • the classification of the association between the user and the space need not be perfect, as in addition to the classification, confidence (e.g., estimation of how accurate the classification is) may be made available for recipients of the service outputs (e.g., other applications or third parties which use the information provided by the service). This may permit the application logic to handle uncertainty and unknown situations in a way that is best for a particular application.
  • a subsequent step is the decision whether to further analyze and store the 3D reconstruction or discard it as irrelevant.
  • This may be managed, for example, by a 3D reconstruction manager module 304, which may decide, based on the class 322 and retrieved context information related to the 3D reconstruction 324 (e.g., location, activity, etc.), whether to keep the reconstruction. For example, spaces that do not include personal elements, such as public spaces, unrecognized outdoor areas, or the like, may be discarded 326. If the space is regarded as meaningful, it is passed 328 to the next stage of the analysis, which segments 330 the 3D reconstruction (e.g., with a segmentation module 306) in order to detect individual objects in the space.
  • Segmented elements 332 of the 3D reconstruction may be used for classification of objects in the space 334 by an object classification module 308, and later in the process for object recognition by an object and space recognition module 310.
  • Object recognition may compare segmented elements 336 with an object and space reference database 318 having specific reference product 3D models and existing 3D assets.
  • An objective may be to identify whether the object in the user's space is of a specific model or brand, and to maintain references to existing 3D assets representing the object.
  • Object recognitions may be stored 340 in the database 313.
  • the gathered information may be stored and used for predicting 344 user characteristics, such as by a user classification module 312.
  • User characteristics may include, but are not limited to, classification of the user to various demographics, psychographics, and/or behavioral segments, or the like.
  • Any or all existing user-specific data 342 is processed 344 for use as an input to the predictive model (such as in feature vector form).
  • the resulting classifications 346 may be stored in the database.
  • a predictive model may be created as a separate preprocessing step using a machine learning approach.
  • the predictive model may be generated using known machine learning approaches or combinations of various known approaches.
  • The predictive model may be trained using separately collected known samples representing the various classification classes; when the prediction accuracy is estimated to be sufficient, model adjustment and training may cease and the model may be transferred for use in predicting the classes of samples from unknown classes.
  • In unsupervised learning methods, the collected samples forming the feature vectors may be fed to the learning algorithm without any class descriptions, as the task of the learning algorithm is to cluster the data such that similar occurrences become clustered together. The clustering proposed by the algorithm may thus divide the data into logical groups which may be studied and identified.
  • supervised learning approaches may be preferable.
  • A training data set may be constructed by collecting data and labeling the resulting feature vectors to represent specific socio-demographic classes (e.g., gender, age, big-five personality traits, average income range, marital status, number of kids, employment status, and/or the like).
  • Construction of the training data may be simplified by combining socio-demographic characteristics provided by existing services, such as social media services, with data collected from the user.
  • An example of one feature vector for one space associated with one user may unfold into a structured list with items such as, but not limited to: type of space, owner of space, frequency of visits, variance of frequency of visits, average time spent in the space, variation of times spent in the space, objects in the space (e.g., type, specific model, dimensions, etc.), dimensions of the space, average context of user presence in the space (e.g., activity, social context, etc.), variation of objects between visits, and the like.
  • feature vectors may be compiled for any or all spaces with which the user is associated.
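  • A sketch of how such a per-space feature vector might be represented and flattened for a learning algorithm; the field names are illustrative, not prescribed by this disclosure.

```python
from dataclasses import dataclass

@dataclass
class SpaceFeatures:
    """One feature vector for one space associated with one user,
    mirroring the structured list above (names are illustrative)."""
    space_type: str             # e.g. "living_room"
    space_owner: str            # e.g. "self", "friend", "employer"
    visit_frequency: float      # visits per week
    visit_frequency_var: float
    avg_time_in_space: float    # minutes
    time_in_space_var: float
    floor_area_m2: float
    object_type_counts: dict    # {"sofa": 1, "espresso_maker": 1, ...}
    objects_changed_between_visits: int

def to_numeric_vector(f: SpaceFeatures, object_vocab: list[str]) -> list[float]:
    """Flatten to the fixed-length numeric form a learning algorithm expects."""
    v = [f.visit_frequency, f.visit_frequency_var, f.avg_time_in_space,
         f.time_in_space_var, f.floor_area_m2,
         float(f.objects_changed_between_visits)]
    # One slot per known object type; counts default to zero.
    v += [float(f.object_type_counts.get(obj, 0)) for obj in object_vocab]
    return v
```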
  • the machine learning algorithm may tune the predictive model using the labeled training data as an input.
  • As the machine learning method, anything can be used, from logistic regression to multilayer neural networks, from support vector machines (SVM) to ensemble methods such as Gradient Boosted Decision Trees (GBDT), which Pennacchiotti and Popescu used for predicting socio-demographic characteristics of users based on their Twitter messages (see Pennacchiotti and Popescu, "A Machine Learning Approach to Twitter User Classification", ICWSM, 2011, 11.1: 281-288), to complex deep learning approaches combining several methods in several steps.
  • part of the training data reserved for validation may be used to test the performance of the resulting predictive model.
  • The training and testing may be performed iteratively, comparing prediction accuracy between different methods and/or parameters used in the training.
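  • One way to realize this iterative comparison with scikit-learn, holding out labeled data via cross-validation; the candidate set is an example, and any of the methods named above could be included.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Candidate methods to compare (cf. Pennacchiotti and Popescu's use of GBDT).
CANDIDATES = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(probability=True),
    "gbdt": GradientBoostingClassifier(),
}

def pick_model(X, y, cv=5):
    """Iteratively compare prediction accuracy across methods and keep the
    best performer, per the training procedure described above."""
    scores = {name: cross_val_score(est, X, y, cv=cv).mean()
              for name, est in CANDIDATES.items()}
    best = max(scores, key=scores.get)
    return CANDIDATES[best].fit(X, y), scores
```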
  • different predictive models may be chosen to predict different aspects of the user characteristics, where specific models improve the prediction performance for specific aspects.
  • For new users whose socio-demographic characteristics are unknown, the system may predict the same socio-demographic classifications that were provided as desired outputs during training, such as gender, age, big-five personality traits, average income range, marital status, number of kids, employment status, and/or the like.
  • the predictive model may estimate the confidence of the estimation. Estimation confidence, together with the estimated predictive model accuracy, may be provided for users of the system, so that an application using the data provided by the system may determine how to handle classifications with high uncertainty.
  • Any or all the user characteristics predicted and collected during the process may be stored 346 to the database (such as to a user characteristics module 316) for later reference.
  • Extracted data from the environment may be organized in a relational manner connecting user characteristics with the users, users with the environments, environments with the environment types, objects with the environments, objects with the 3D assets, and/or the like, so that information may be effectively queried using techniques analogous to those used for querying relational databases.
  • the service may provide a query interface for the clients which can be local applications, applications running on external devices, or any other third-party service.
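  • A minimal sketch of such a relational organization and query interface, here using SQLite; the schema and the sample query are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users(id TEXT PRIMARY KEY);
CREATE TABLE environments(id TEXT PRIMARY KEY, env_type TEXT);
CREATE TABLE user_environments(user_id TEXT, env_id TEXT,
                               relation TEXT);             -- e.g. 'own_home'
CREATE TABLE objects(id TEXT PRIMARY KEY, env_id TEXT,
                     label TEXT, asset_ref TEXT);          -- link to 3D asset
CREATE TABLE user_characteristics(user_id TEXT, name TEXT, value TEXT,
                                  confidence REAL);
""")

# Example query a third-party client might issue: users who have a given
# object type in a living room of their own home.
rows = conn.execute("""
SELECT DISTINCT ue.user_id
FROM user_environments ue
JOIN environments e ON e.id = ue.env_id
JOIN objects o      ON o.env_id = e.id
WHERE e.env_type = 'living_room'
  AND ue.relation = 'own_home'
  AND o.label = ?;
""", ("espresso_maker",)).fetchall()
```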
  • In some embodiments, the 3D reconstruction of the environment is performed on the client device instead of on the service side. Once 3D reconstruction on the client side is determined to be complete (or sufficiently complete), it may be sent to the service along with data from relevant sensors and context information.
  • Another embodiment of the systems and methods may work with individual objects. As full 3D reconstruction of environments and detection of objects may be difficult, considering for example the limitations caused by sub-optimal network bandwidth, the requirements of the systems and methods may be reduced by working at an individual object level. For example, object reconstruction can be performed on a client device and optimized reconstructions of the objects uploaded to the service side.
  • Example Use Case: User Assistance in Unfamiliar Environments.
  • In this use case, the user has been providing environment data to the system for some time.
  • Based on the environment classification and context information, the system has a comprehensive collection of environment descriptions of environments that are familiar to the user.
  • An application dedicated for user assistance in unfamiliar environments may continuously query a current environment type from the system in order to detect when the user is in an unfamiliar environment, or in some embodiments the user may initiate an assistance application explicitly.
  • The assistance application may query the service for the type of the current environment, objects found in the environment, and object characteristics such as brands and language of manufactured products, and/or the like. Based on the environment and object description(s) received from the service, the assistance application may use augmented reality to replace or overlay unfamiliar objects with familiar virtual objects, replace labels on objects so that all text and icons are localized for the user, search the internet for user guides and instructions and augment them onto objects based on object identification, or, based on context information, predict what actions the user is performing and highlight the objects in the environment relevant to that action.
  • A virtual experience may require certain objects to be present; for example, an augmented reality game may require some common household object to be available, such as a soda can, which is used for anchoring the virtual content or operates as a physical proxy object enabling user interaction and haptic feedback.
  • a virtual experience may benefit from the presence of certain shapes in addition to specific objects.
  • For example, the application can query 3D reconstructions of objects found in some environment and compare them with a desired haptic feedback shape, such as a tabletop or similar surface on which a virtual control surface is placed, or a shape that could be used for anchoring virtual content, such as a chair for a virtual character to sit on.
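  • As an example of such a shape query, the sketch below uses Open3D's RANSAC plane segmentation to look for a horizontal, table-height surface suitable for hosting a virtual control surface; the axis convention and thresholds are assumptions.

```python
import numpy as np
import open3d as o3d

def find_tabletop(pcd: o3d.geometry.PointCloud,
                  min_height: float = 0.5, max_height: float = 1.2):
    """Look for a horizontal plane at table height in a reconstruction
    (thresholds are illustrative)."""
    plane, inliers = pcd.segment_plane(distance_threshold=0.02,
                                       ransac_n=3, num_iterations=1000)
    a, b, c, _ = plane
    # Plane normal must point (almost) straight up; assumes y is the up axis.
    horizontal = abs(b) / np.linalg.norm([a, b, c]) > 0.95
    height = np.mean(np.asarray(pcd.points)[inliers], axis=0)[1]
    if horizontal and min_height <= height <= max_height:
        return pcd.select_by_index(inliers)  # candidate tabletop surface
    return None
```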
  • A user is profiled based on environments, and objects found in those environments, with which he or she has personal connections.
  • a first step in the 3D reconstruction analysis is the detection of the space type and determining associations between the user and the space.
  • Space type classification can be performed by analyzing the geometry of the 3D reconstruction, for example with an approach similar to that described in Swadzba and Wachsmuth. Reliability of the space type classification may be further improved by combining context information with the 3D reconstruction as a classification input. From context data, it is also possible to predict the association between the space and the user (e.g., the room is located in the location identified as the user's home, or is part of the location identified as the user's workplace).
  • Some or all of the information extracted from the user and the 3D reconstruction may be combined, after the object recognition phase, into a feature vector, which can then be used as an input for a user characteristics classification prediction.
  • the application may use the information for selecting content most likely matching the user's interests, such as placing new virtual objects in the augmented environment with the possibility for the user to buy them, or completely replacing objects with similar virtual objects to provide a visualization of how the user's environment would look with alternative furniture or decoration objects.
  • FIG. 5 depicts one embodiment of the herein disclosed systems and methods.
  • In FIG. 5, a user equipped with device sensors 505, an AR/VR social media client 510, etc., provides data to a social media service 515.
  • advertisers 520 may request targeted advertisements 545 to be sent either to a user or users that match a desired profile specified by the advertiser 560 (e.g., users who live in a certain area and who have product x in their living room) or a user connected with a first user fulfilling the profile specified by the advertiser 555 (e.g., friends of a user who owns product x). These targeted advertisements may be presented responsive to matching 550 carried out by the social media service 515.
  • the user's HMD device may use sensors to capture data related to a current location of the user.
  • Sensor data may be communicated through the social media client, which may further incorporate additional context data (e.g., location, the location's relation to the user, activity, etc.), to the service backend (e.g., the social media service).
  • the service may receive the sensor and context data, and use such data to create 3D reconstructions of the user's location/environment.
  • the service may classify the 3D reconstruction based at least in part on the received sensor and context data.
  • the 3D reconstruction may be segmented by the service to detect specific objects in the 3D reconstruction.
  • the service may also extract user characteristics based on the detected objects, collected information, other information from the social media service, and/or the like.
  • Such information may be stored in a database for use in responding to third party requests (e.g., from advertisers, etc.).
  • the service may receive a targeted advertising request from a third-party advertiser.
  • the service may match user profiles, objects, and spaces with the targeted advertisement request.
  • the advertiser may communicate information identifying an advertisement-relevance-indicator object to the service.
  • Advertisement-relevance-indicator objects may represent specific items detectable in a user's environment/location which may be "triggers" for an advertisement.
  • a wok may serve as an advertisement-relevance-indicator object with respect to an advertisement for soy sauce
  • an espresso maker may serve as an advertisement-relevance-indicator object with respect to an advertisement for gourmet coffee beans.
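  • A minimal sketch of this trigger matching; the table entries echo the examples above, and all names are hypothetical.

```python
# Hypothetical trigger table mapping advertisement-relevance-indicator
# objects to advertisements, per the wok/soy-sauce and espresso-maker/
# coffee-bean examples above.
AD_TRIGGERS = {
    "wok": "soy_sauce_ad",
    "espresso_maker": "gourmet_coffee_beans_ad",
    "key_lime_pie": "pie_brand_ad",
}

def select_ads(detected_objects: set[str]) -> list[str]:
    """Match objects detected in the user's environment against the
    advertiser-supplied indicator objects."""
    return [ad for obj, ad in AD_TRIGGERS.items() if obj in detected_objects]

# e.g. select_ads({"wok", "sofa"}) -> ["soy_sauce_ad"]
```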
  • Advertisement requests may be structured such that the social media aspect of the service is of greater importance. For example, based on user profiles and data extracted from users' environments, the service may present advertisements to particular primary users who directly match a profile set forth in the targeted advertisement request (e.g., a primary user who has product x in their living room and lives in a specified area), and/or to secondary users who are connected with the primary users (e.g., friends of a user who owns product x and live in the specified area).
  • Such connected-user advertisements may be beneficial by permitting advertisers to reach users who may not themselves be identifiable as useful advertising targets, but who, because of their user connections, may still be useful targets of an advertisement.
  • a secondary user who does not own a video game console may not match a targeted advertisement request for a video game on that console.
  • However, the secondary user may still be a valuable target of the advertisement (e.g., even though the secondary user cannot use the video game at their own home, they could use it at the primary user's home).
  • FIG. 6 depicts one embodiment of the herein disclosed methods. A user may operate AR/VR equipment, which may generate a 3D reconstruction of the current environment and may receive context data from the equipment and/or a linked social media service 605.
  • classification 615 and segmentation 620 may be utilized to detect 625 specific objects in the user's environment.
  • the 3D reconstruction segmentation may detect a wok, an espresso maker, and a key lime pie. These objects may also be classified and recognized 630.
  • the context information received by the system may determine the environment type, here a kitchen, and the relation with the user, here that it is the user's kitchen.
  • the segmented objects and the environment type and relation with the user may be stored in a data storage by the system 635.
  • User profiling may occur in response to a request by a third-party advertiser 640.
  • the stored data may be provided to and/or made accessible by the third-party advertiser.
  • the system could receive information identifying an advertisement-relevance-indicator object from the third party.
  • Such an advertisement-relevance-indicator object may relate to information regarding an intended audience of a first advertisement.
  • the advertisement-relevance-indicator object may be useful for identifying users who should receive a specific advertisement.
  • the advertisement-relevance-indicator object may be a key lime pie (or any pie).
  • an advertisement for a particular brand or type of pie may be presented in the user's AR/VR experience 645.
  • the advertisement may be cached or otherwise retained until such time as the user is in a grocery shopping environment.
  • the advertisement may comprise an overlay of the user's view of a freezer section of a grocery store, highlighting the advertised brand or type of pie with a coupon or other discount.
  • a wok may be an advertisement-relevance-object related to an advertisement for soy sauce.
  • an espresso maker may be an advertisement-relevance-object related to an advertisement for gourmet coffee beans.
  • The method may include receiving information or gathered data, such as from an HMD of a first user 705 at a first-user-controlled first location.
  • The data may indicate the presence of a first object at the first-user-controlled first location 710, or this may be otherwise determined.
  • Responsive to detecting that the first user is at a second location, a targeted advertisement for a second product may be selected 720, such as based on the HMD-gathered data.
  • The selected targeted advertisement for the second product may then be presented or otherwise displayed 725, such as via the HMD, to the first user while the first user is at the second location.
  • the advertisement may be presented to the user at a social media site. In other alternative embodiments, the advertisement may be presented to the user within the user's current environment.
  • FIG. 7B illustrates another embodiment of the herein disclosed methods.
  • the method is related to presenting targeted advertising.
  • the method may comprise receiving information derived from sensor data of a head-mounted display camera of a first user at a first location 750. Once received, the system may determine an identity of at least one object at the first location based at least in part on the received information derived from sensor data of the head-mounted display camera 755. The system may then receive, such as from a third-party advertiser, information regarding an intended audience for a first advertisement comprising information identifying at least one advertisement-relevance-indicator object 760.
  • Responsive to a determination that the identity of the at least one object matches the information regarding the at least one advertisement-relevance-indicator object, the system may cause the first advertisement associated with that information to be presented to the first user 770.
  • the advertisement comprises a 3D model of a product.
  • the model is presented to the first user at the first location in the HMD.
  • the 3D model is presented as a virtual overlay of the determined at least one object.
  • the model is presented to the first user at a second location in the HMD.
  • the advertisement is presented to the first user at a social media site.
  • the method may require identification of at least two objects and association with at least two advertisement-relevance-indicator objects.
  • a presented advertisement may be for a third object distinct from the at least two identified objects.
  • a method of presenting targeted advertising comprising: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least one object at the first location based at least in part on the received information derived from sensor data of the head-mounted display camera; receiving information regarding an intended audience for a first advertisement comprising information regarding at least one advertisement-relevance-indicator object; and responsive to a determination that the identity of the at least one object at the first location matches the information regarding at least one advertisement-relevance-indicator object, causing the first advertisement associated with the information regarding at least one advertisement-relevance-indicator object to be presented to the first user.
  • the method may include wherein the advertisement comprises a 3D model of a product.
  • the method may include wherein the model is presented to the first user at the first location in the HMD.
  • the method may include wherein the 3D model is presented as a virtual overlay of the determined at least one object.
  • the method may include wherein the model is presented to the first user at a second location in the HMD.
  • the method may include wherein the advertisement is presented to the first user at a social media site.
  • A method of presenting targeted advertising comprising: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least two objects at the first location based at least in part on the received information derived from sensor data of the head-mounted display camera; receiving information regarding an intended audience for a first advertisement comprising information regarding at least two advertisement-relevance-indicator objects; and responsive to a determination that the identities of the at least two objects at the first location match the information regarding the at least two advertisement-relevance-indicator objects, causing the first advertisement associated with the information regarding the at least two advertisement-relevance-indicator objects to be presented to the first user.
  • the method may include wherein the first advertisement is for a third object of a different kind than the at least two objects.
  • A method of presenting targeted advertising comprising: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least a first object at the first location based at least in part on the received information derived from sensor data of the head-mounted display camera; identifying a relationship between the first user and the first location; responsive to identifying the relationship as a private relationship, modifying a profile associated with the first user based on the identity of the at least one first object; receiving information regarding an intended audience for a first advertisement comprising audience profile information and information regarding at least one advertisement-relevance-indicator object; receiving information derived from sensor data of the head-mounted display camera of the first user at a second location; determining an identity of at least a second object at the second location based at least in part on the received information derived from sensor data of the head-mounted display camera; and responsive to a determination that the audience profile information matches the profile associated with the first user and that the identity of at least the second object matches the information regarding the at least one advertisement-relevance-indicator object, causing the first advertisement to be presented to the first user.
  • A method of extracting information from the setup of an immersive experience comprising: receiving information derived from sensor data of a head-mounted display camera of a user device of a first user at a first location; generating a first 3D reconstruction of the first location based at least in part on the received information derived from sensor data of the head-mounted display camera of the first user at the first location; comparing the first 3D reconstruction of the first location to a second 3D reconstruction of the first location, wherein the second 3D reconstruction was previously captured; classifying the first 3D reconstruction of the first location based at least in part upon the reconstruction geometry; identifying at least one object in the first 3D reconstruction of the first location based at least in part upon the received information derived from the sensor data of the head-mounted display camera of the first user; estimating at least one user characteristic at least in part from the received information; and providing a tailored immersive experience based at least in part on the first and second 3D reconstructions of the first location and based at least in part on the at least one estimated user characteristic.
  • the method may include wherein the received information further comprises non-camera sensor data.
  • the method may include wherein a non-camera sensor of the user device comprises at least one of a GPS, an accelerometer, a digital compass, or a gyroscope.
  • the method may further comprise receiving, in addition to the sensor data, non-sensor data collected from the user device.
  • the method may include wherein non-sensor data comprises at least one of current browser history or active applications.
  • classifying the first 3D reconstruction is further based at least in part on other received sensor and context data.
  • the method may include wherein comparing further comprises a direct comparison of the cumulative deviation of data points in the first and second 3D reconstructions, and using a threshold value to determine if the environment in each reconstruction is the same.
  • the method may include wherein comparing further comprises: segmenting the first and second 3D reconstructions; separating static elements from dynamic elements in each of the first and second 3D reconstructions; and weighting a deviation between the first and second 3D reconstructions such that static elements need to match more accurately than dynamic elements in order for the 3D reconstructions to be from the same environment.
  • segmented elements of each 3D reconstruction are used for classification of objects in the 3D reconstruction space and for object recognition.
  • object recognition comprises comparing the segmented elements with a database consisting of specific reference product 3D models and existing 3D assets.
  • classification determines a space type of the first 3D reconstruction.
  • the method may include wherein the space type comprises one of: kitchen, living room, bedroom, backyard, office, break room, cafe, dining room.
  • the method may further comprise predicting an association between the first user and the space of the first 3D reconstruction based at least in part on the space type, at least some context data, and a frequency of user visits in the same space.
  • the method may include wherein the at least some context data comprises at least one of: location, activity, time of day, ambient sound characteristics, Bluetooth identifiers in the vicinity.
  • classification further comprises accumulating a plurality of unsuccessful classification events and examining them as a time series.
  • the method may include wherein tailoring the immersive experience comprises at least one of: virtual product placement and advertisement; and augmentation of the environment to assist the user.
  • the method may include wherein the sensor data comprises at least 3D image data from a camera of a head mounted display.
  • the method may include wherein the extracted at least one user characteristic is used to tailor an immersive experience.
  • the method may include wherein the extracted at least one user characteristic is used for precision targeting of overlay product placements.
  • the method may include wherein the extracted at least one user characteristic is used for highly targeted advertisements for goods or services.
  • A system comprising a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including: receiving information derived from sensor data of a head-mounted display camera of a user device of a first user at a first location; generating a first 3D reconstruction of the first location based at least in part on the received information derived from sensor data of the head-mounted display camera of the first user at the first location; comparing the first 3D reconstruction of the first location to a second 3D reconstruction of the first location, wherein the second 3D reconstruction was previously captured; classifying the first 3D reconstruction of the first location based at least in part upon the reconstruction geometry; identifying at least one object in the first 3D reconstruction of the first location based at least in part upon the received information derived from the sensor data of the head-mounted display camera of the first user; estimating at least one user characteristic at least in part from the received information; and providing a tailored immersive experience based at least in part on the first and second 3D reconstructions of the first location and based at least in part on the at least one estimated user characteristic.
  • A system comprising a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least a first object at the first location based at least in part on the received information derived from sensor data of the head-mounted display camera; identifying a relationship between the first user and the first location; responsive to identifying the relationship as a private relationship, modifying a profile associated with the first user based on the identity of the at least one first object; receiving information regarding an intended audience for a first advertisement comprising audience profile information and information regarding at least one advertisement-relevance-indicator object; receiving information derived from sensor data of the head-mounted display camera of the first user at a second location; determining an identity of at least a second object at the second location based at least in part on the received information derived from sensor data of the head-mounted display camera; and responsive to a determination that the audience profile information matches the profile associated with the first user and that the identity of at least the second object matches the information regarding the at least one advertisement-relevance-indicator object, causing the first advertisement to be presented to the first user.
  • a system comprising a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least one object at the first location based at least in part on the received information derived from sensor data of the head- mounted display camera; receiving information regarding an intended audience for a first advertisement comprising information regarding at least one advertisement-relevance-indicator object; and responsive to a determination that the identity of the at least one object at the first location matches the information regarding at least one advertisement-relevance-indicator object, causing the first advertisement associated with the information regarding at least one advertisement-relevance-indicator object to be presented to the first user.
  • a system comprising a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least two objects at the first location based at least in part on the received information derived from sensor data of a head- mounted display camera; receiving information regarding an intended audience for a first advertisement comprising information regarding at least two advertisement-relevance-indicator objects; and responsive to a determination that the identity of the at least two objects at the first location match the information regarding at least two advertisement-relevance-indicator objects, causing the first advertisement associated with the information regarding at least two advertisement-relevance-indicator object to be presented to the first user.
  • a system comprising a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including: receiving sensor and context data from a user; performing a 3D reconstruction based on the received sensor data; comparing the 3D reconstruction with at least one previously stored 3D reconstruction; classifying the 3D reconstruction based at least on a reconstruction geometry and at least one other received sensor or context data element; segmenting the 3D reconstruction; detecting objects from the segmented 3D reconstruction; classifying and recognizing the detected objects from the 3D reconstruction; and extracting at least one user characteristic.
  • A method of presenting targeted advertising comprising: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least two objects at the first location based upon the received information derived from sensor data of the head-mounted display camera; receiving information regarding an intended audience for a first advertisement comprising information regarding at least two advertisement-relevance-indicator objects; and responsive to a determination that the identities of the at least two objects at the first location, determined based upon the received information derived from sensor data of the head-mounted display camera, match the information regarding the at least two advertisement-relevance-indicator objects: causing the first advertisement associated with the information regarding the at least two advertisement-relevance-indicator objects to be presented to the first user.
  • The method may include wherein the first advertisement is for a third object of a different kind than the at least two objects.
  • A method of presenting targeted advertising comprising: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least one object at the first location based upon the received information derived from sensor data of the head-mounted display camera; receiving information regarding an intended audience for a first advertisement comprising information regarding at least one advertisement-relevance-indicator object; and responsive to a determination that the identity of the at least one object at the first location, determined based upon the received information derived from sensor data of the head-mounted display camera, matches the information regarding the at least one advertisement-relevance-indicator object: causing the first advertisement associated with the information regarding the at least one advertisement-relevance-indicator object to be presented to the first user.
  • the method may include wherein the advertisement comprises a 3D model of a product and the model is presented to the first user at a first location in the HMD.
  • the method may include wherein the 3D model is presented as a virtual overlay of the determined at least one object.
  • the method may include wherein the advertisement is presented to the first user at a social media site.
  • FIG. 8 is a system diagram of an exemplary WTRU 102, which may be employed as a module of a head mounted display in embodiments described herein. As shown in FIG. 8, the WTRU 102 may include a processor 118, a communication interface 119 including a transceiver 120, a transmit/receive element 122, a speaker/microphone 124, a keypad 126, a display/touchpad 128, a non-removable memory 130, a removable memory 132, a power source 134, a global positioning system (GPS) chipset 136, and sensors 138. It will be appreciated that the WTRU 102 may include any sub-combination of the foregoing elements while remaining consistent with an embodiment.
  • the processor 118 may be a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGAs) circuits, any other type of integrated circuit (IC), a state machine, and the like.
  • the processor 118 may perform signal coding, data processing, power control, input/output processing, and/or any other functionality that enables the WTRU 102 to operate in a wireless environment.
  • the processor 118 may be coupled to the transceiver 120, which may be coupled to the transmit/receive element 122. While FIG. 8 depicts the processor 118 and the transceiver 120 as separate components, it will be appreciated that the processor 118 and the transceiver 120 may be integrated together in an electronic package or chip.
  • the transmit/receive element 122 may be configured to transmit signals to, or receive signals from, a base station over the air interface 116.
  • the transmit/receive element 122 may be an antenna configured to transmit and/or receive RF signals.
  • the transmit/receive element 122 may be an emitter/detector configured to transmit and/or receive IR, UV, or visible light signals, as examples.
  • the transmit/receive element 122 may be configured to transmit and receive both RF and light signals. It will be appreciated that the transmit/receive element 122 may be configured to transmit and/or receive any combination of wireless signals.
  • the WTRU 102 may include any number of transmit/receive elements 122. More specifically, the WTRU 102 may employ MIMO technology. Thus, in one embodiment, the WTRU 102 may include two or more transmit/receive elements 122 (e.g., multiple antennas) for transmitting and receiving wireless signals over the air interface 116.
  • the transceiver 120 may be configured to modulate the signals that are to be transmitted by the transmit/receive element 122 and to demodulate the signals that are received by the transmit/receive element 122.
  • the WTRU 102 may have multi-mode capabilities.
  • the transceiver 120 may include multiple transceivers for enabling the WTRU 102 to communicate via multiple RATs, such as UTRA and IEEE 802.11, as examples.
  • the processor 118 of the WTRU 102 may be coupled to, and may receive user input data from, the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128 (e.g., a liquid crystal display (LCD) display unit or organic light-emitting diode (OLED) display unit).
  • the processor 118 may also output user data to the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128.
  • the processor 118 may access information from, and store data in, any type of suitable memory, such as the non-removable memory 130 and/or the removable memory 132.
  • the non-removable memory 130 may include random-access memory (RAM), read-only memory (ROM), a hard disk, or any other type of memory storage device.
  • the removable memory 132 may include a subscriber identity module (SIM) card, a memory stick, a secure digital (SD) memory card, and the like.
  • the processor 118 may access information from, and store data in, memory that is not physically located on the WTRU 102, such as on a server or a home computer (not shown).
  • the processor 118 may receive power from the power source 134, and may be configured to distribute and/or control the power to the other components in the WTRU 102.
  • the power source 134 may be any suitable device for powering the WTRU 102.
  • the power source 134 may include one or more dry cell batteries (e.g., nickel-cadmium (NiCd), nickel-zinc (NiZn), nickel metal hydride (NiMH), lithium-ion (Li-ion), and the like), solar cells, fuel cells, and the like.
  • the processor 118 may also be coupled to the GPS chipset 136, which may be configured to provide location information (e.g., longitude and latitude) regarding the current location of the WTRU 102.
  • the WTRU 102 may receive location information over the air interface 116 from a base station and/or determine its location based on the timing of the signals being received from two or more nearby base stations. It will be appreciated that the WTRU 102 may acquire location information by way of any suitable location-determination method while remaining consistent with an embodiment.
  • the processor 118 may further be coupled to other peripherals 138, which may include one or more software and/or hardware modules that provide additional features, functionality and/or wired or wireless connectivity.
  • the peripherals 138 may include sensors such as an accelerometer, an e-compass, a satellite transceiver, a digital camera (for photographs or video), a universal serial bus (USB) port, a vibration device, a television transceiver, a hands free headset, a Bluetooth® module, a frequency modulated (FM) radio unit, a digital music player, a media player, a video game player module, an Internet browser, and the like.
  • FIG. 9 depicts an exemplary network entity 190 that may be used in embodiments of the present disclosure.
  • network entity 190 includes a communication interface 192, a processor 194, and non-transitory data storage 196, all of which are communicatively linked by a bus, network, or other communication path 198.
  • Communication interface 192 may include one or more wired communication interfaces and/or one or more wireless-communication interfaces. With respect to wired communication, communication interface 192 may include one or more interfaces such as Ethernet interfaces, as an example. With respect to wireless communication, communication interface 192 may include components such as one or more antennae, one or more transceivers/chipsets designed and configured for one or more types of wireless (e.g., LTE) communication, and/or any other components deemed suitable by those of skill in the relevant art. And further with respect to wireless communication, communication interface 192 may be equipped at a scale and with a configuration appropriate for acting on the network side (as opposed to the client side) of wireless communications (e.g., LTE communications, Wi-Fi communications, and the like). Thus, communication interface 192 may include the appropriate equipment and circuitry (perhaps including multiple transceivers) for serving multiple mobile stations, UEs, or other access terminals in a coverage area.
  • Processor 194 may include one or more processors of any type deemed suitable by those of skill in the relevant art, some examples including a general-purpose microprocessor and a dedicated DSP.
  • Data storage 196 may take the form of any non-transitory computer-readable medium or combination of such media, some examples including flash memory, read-only memory (ROM), and random-access memory (RAM) to name but a few, as any one or more types of non- transitory data storage deemed suitable by those of skill in the relevant art could be used.
  • data storage 196 contains program instructions 197 executable by processor 194 for carrying out various combinations of the various network-entity functions described herein.
  • Examples of computer-readable storage media include, but are not limited to, a read-only memory (ROM), a random-access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks and digital versatile disks (DVDs).
  • a processor in association with software may be used to implement a radio frequency transceiver for use in a WTRU, UE, terminal, base station, RNC, or any host computer.

Landscapes

  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

Systems and methods are described for efficient information extraction based on sensor data collected from environments where a user is using AR/VR hardware. 3D reconstructions are created and stored by a service which receives sensor data streams. The service collects the data, associates other contextual data available from the user and environment with the 3D reconstruction, and extracts detailed information about the user and the environment based on the 3D reconstructions and other collected data. Such extracted information may enable various AR and VR experiences and services, such as identification of situations where a user may need assistance via augmentation, matching of virtual experiences with physical environments, virtual product placement and advertisement, and the like.

Description

SYSTEM FOR COLLECTING AND EXTRACTING INFORMATION FROM VIRTUAL
ENVIRONMENT MODELS
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] The present application is a non-provisional filing of, and claims benefit under 35 U.S.C. §119(e) from, U.S. Provisional Patent Application Serial No. 62/360,116, filed July 8, 2016, entitled "SYSTEM FOR COLLECTING AND EXTRACTING INFORMATION FROM VIRTUAL ENVIRONMENT MODELS", which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
[0002] This disclosure relates to systems and methods for augmented or virtual reality environments. More specifically, this disclosure relates to systems and methods for information extraction based on sensor data from users of augmented or virtual reality environments.
BACKGROUND
[0003] Emerging augmented reality (AR) and virtual reality (VR) head mounted displays (HMDs) intended for wide scale consumer use will enable collection of very detailed information about the physical environments where users are utilizing these devices. For example, in the case of Microsoft HoloLens and several similar AR HMDs, devices have integrated RGB-D sensors capable of real-time 3D reconstruction of the physical environment. Similar add-on RGB-D sensor and camera solutions have been demonstrated for VR HMDs which capture 3D data of the user's environment, and by doing so enable physical navigation and adjustment of virtual experience to the physical environment. Also, devices such as Google Tango are the first generation of mobile devices embedded with RGB-D sensors capable of enabling reconstruction of detailed 3D models of the environments where they are being used. In addition to RGB-D sensors, image data captured by regular 2D cameras can also be used to reconstruct 3D models of user environments. Recently, Microsoft published results of an implementation allowing mobile devices to perform 3D reconstructions locally from 2D image data acquired by a mobile device's camera (see Ondruska et al., "MobileFusion: Real-time Volumetric Surface Reconstruction and Dense Tracking On Mobile Phones", in Visualization and Computer Graphics, IEEE Transactions on, no.99, pp.1-1, doi: 10.1109/TVCG.2015.2459902).
[0004] At present, long-term and wide scale collection of environment data has typically been performed by map services developed by companies working on location services, such as Google, Microsoft, Nokia, etc. In these location services, for example, photo based 3D reconstruction has been used for generating building models to enable 3D views to the map data. Unlike with map services, 3D reconstructions produced by 2D/3D image sensors embedded with AR/VR HMDs have generally only been used locally during the run-time operation of applications to enable adjustment of virtual content with the physical environment or to enable interaction with the virtual content.
[0005] Detailed context and use environment understanding, as well as user characteristics such as demographics, psychographics, behavioral variables, product purchase history, and other preferences, are extremely valuable information when constructing any service for the users, and may enable creation of completely new kinds of services.
[0006] Social media services are currently performing user profiling based on user created content, e.g., text, photographs, and video clips provided by the user, as well as location and internet use statistics, such as GPS and web navigation histories collected from the user. While it is already possible to collect information quite efficiently with the previously mentioned data sources, there are opportunities for collecting additional information that can be used to provide greater services to users.
SUMMARY
[0007] Systems and methods of the present disclosure provide efficient collection and extraction of information about users, environments associated with users, and contexts of use. Such extracted information may enable various AR & VR experiences and services, such as identification of situations where a user may need assistance via augmentation, matching of virtual experiences with physical environments, virtual product placement and advertisement, and the like.
[0008] In one embodiment, there is a method of targeted advertising, comprising: receiving gathered data, from a head mounted display (HMD) of a first user, indicating the presence of a first object at a first-user-controlled first location; responsive to detecting that the first user is at a second location: selecting a targeted advertisement for a second product based on the HMD-gathered data; and displaying, via the HMD, the selected targeted advertisement for the second product to the first user while the first user is at the second location.
[0009] In one embodiment, there is a method of extracting information from the setup of an immersive experience, comprising: receiving information derived from sensor data of a head-mounted display camera of a user device of a first user at a first location; generating a first 3D reconstruction of the first location based at least in part on the received information derived from sensor data of the head-mounted display camera of the first user at the first location; comparing the first 3D reconstruction of the first location to a second 3D reconstruction of the first location, wherein the second 3D reconstruction was previously captured; classifying the first 3D reconstruction of the first location based at least in part upon the reconstruction geometry; identifying at least one object in the first 3D reconstruction of the first location based at least in part upon the received information derived from the sensor data of the head-mounted display camera of the first user; estimating at least one user characteristic at least in part from the received information; and providing a tailored immersive experience based at least in part on the first and second 3D reconstructions of the first location, the at least one identified object, and the at least one estimated user characteristic.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] A more detailed understanding may be had from the following description, presented by way of example in conjunction with the accompanying drawings, wherein:
[0011] FIG. 1 illustrates an overview of one embodiment of an architecture for collecting, extracting, and utilizing data from 3D environments.
[0012] FIG. 2 illustrates one embodiment of the process of 3D reconstruction and storage of 3D reconstructions.
[0013] FIG. 3 illustrates one embodiment of data processing performed by the service in order to extract user and environment information.
[0014] FIG. 4 illustrates one embodiment of a process performed by the service to extract information from data received from a user.
[0015] FIG. 5 illustrates one embodiment of targeted advertising based on 3D reconstructions and user information extraction.
[0016] FIG. 6 illustrates one embodiment of a process performed by the service to extract information from data received from a user and present a targeted advertisement.
[0017] FIG. 7A illustrates one embodiment of a process performed by the service to extract information from data received from a user and present a targeted advertisement.
[0018] FIG. 7B illustrates another embodiment of a process performed by the service to extract information from data received from a user and present a targeted advertisement.
[0019] FIG. 8 illustrates an exemplary wireless transmit/receive unit (WTRU) that may be employed as a module of a head mounted display in some embodiments.
[0020] FIG. 9 illustrates an exemplary network entity that may be employed in some embodiments.
DETAILED DESCRIPTION
[0021] A detailed description of illustrative embodiments will now be provided with reference to the various Figures. Although this description provides detailed examples of possible implementations, it should be noted that the provided details are intended to be by way of example and in no way limit the scope of the application.
[0022] Note that various hardware elements of one or more of the described embodiments are referred to as "modules" that carry out (i.e., perform, execute, and the like) various functions that are described herein in connection with the respective modules. As used herein, a module includes hardware (e.g., one or more processors, one or more microprocessors, one or more microcontrollers, one or more microchips, one or more application-specific integrated circuits (ASICs), one or more field programmable gate arrays (FPGAs), one or more memory devices) deemed suitable by those of skill in the relevant art for a given implementation. Each described module may also include instructions executable for carrying out the one or more functions described as being carried out by the respective module, and it is noted that those instructions could take the form of or include hardware (i.e., hardwired) instructions, firmware instructions, software instructions, and/or the like, and may be stored in any suitable non-transitory computer-readable medium or media, such as those commonly referred to as RAM, ROM, etc.
[0023] 3D reconstructions produced by AR/VR HMDs may be collected during participation in AR/VR experiences and stored for later reference. Especially in cases where the use of an AR/VR HMD may be connected with use of a social media service such as Facebook, per-user collection of environment models may be easy to arrange, and may enable collection of very detailed information about users and their associated environments.
[0024] In view of access to the rich information that 3D reconstructions could provide, a system for creating and collecting 3D reconstructions is needed. There already exist tools for capturing RGB-D data and turning it into 3D reconstructions, such as KinectFusion and KinFu implemented as part of the Point Cloud Library. Also, 2D image data can be used for 3D reconstruction, as demonstrated by various image based photogrammetry software such as VisualSfM and Agisoft Photoscan. However, systems going beyond mere 3D reconstruction towards gaining deeper semantic understandings of the environments are not generally available, despite well-established research in many individual areas required in the sub-tasks used by such products or services. These sub-tasks may include 3D reconstruction segmentation, 3D object search and matching, classification and matching of 3D reconstructions, and combination of object data with other user specific and contextual data.
[0025] An overview of one embodiment is illustrated in FIG. 1. In one embodiment, a user provides RGB-D data to the on-line information collection and extraction service (referred to herein as the "service"), and the service creates, collects, and classifies stored 3D reconstructions and performs further processing of the data in order to extract user-specific and contextual information. The service may provide the extracted information to or for third-party services that can use the information to tailor their service per user per context.
[0026] More particularly, as shown in FIG. 1, the device sensors 150 of a user's device (such as an HMD having a camera, etc.) may communicate sensor data 160 to a client 152 (e.g., an application on the user's HMD, etc.). The client 152 may communicate the sensor data and additional context data 162 from the client to the service 154. The service 154 may perform 164 various functions, such as: creating a virtual environment model based on 2D/3D image data; classifying the environment model based on received sensor and context data; segmenting and detecting objects from the virtual environment model; extracting per user and per environment information from the collected data; and/or the like. At some time, a third-party service 156 may communicate a data request 166 to the service 154, responsive to which the service 154 may provide requested per user data 168, and/or the like.
[0027] In one embodiment, as illustrated in FIG. 4, a process performed by the service to extract information from data received from a user may be generally described by the following steps.
- Receive sensor and context data from the user (405).
- Perform 3D reconstruction based on the received 2D/3D data (410).
- Compare the 3D reconstruction with previously stored 3D reconstructions (415).
- Classify the 3D reconstruction based on reconstruction geometry and other received sensor and context data (420).
- Segment the 3D reconstruction (425).
- Detect objects from the segmented 3D reconstruction (430).
- Classify and recognize detected objects (435).
- Combine all detected and received information (440).
- Extract user and context characteristics from the collected data (445).
[0028] Once all the information is extracted and connected with the user profile, it can be provided for third-party services.
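By way of non-limiting illustration only, the sequence of FIG. 4 might be orchestrated as in the following Python sketch; every function here is a hypothetical stand-in for the corresponding service module described above, not the disclosed implementation itself.

```python
# Hypothetical orchestration of the FIG. 4 pipeline; all functions are
# illustrative placeholders keyed to the step numbers above.

def reconstruct_3d(sensor_data):                 # step 410
    return {"points": sensor_data.get("rgbd_frames", [])}

def find_matching(db, reconstruction):           # step 415
    return db.get("prior_reconstruction")        # None if location is new

def classify_space(reconstruction, context):     # step 420
    return "kitchen" if "stove" in context.get("hints", []) else "unknown"

def segment(reconstruction):                     # step 425
    return reconstruction["points"]              # placeholder segmentation

def detect_and_recognize(segments):              # steps 430-435
    return [{"label": s} for s in segments]

def extract_characteristics(space_type, objects, context):  # steps 440-445
    return {"space_type": space_type,
            "num_objects": len(objects),
            "activity": context.get("activity")}

def process_capture(sensor_data, context, db):   # step 405 onwards
    reconstruction = reconstruct_3d(sensor_data)
    _prior = find_matching(db, reconstruction)
    space_type = classify_space(reconstruction, context)
    objects = detect_and_recognize(segment(reconstruction))
    return extract_characteristics(space_type, objects, context)

print(process_capture({"rgbd_frames": ["frame0"]},
                      {"hints": ["stove"], "activity": "cooking"}, {}))
```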
[0029] In order to collect and leverage information that can be extracted from 3D reconstructions of physical environments, the service collects data from users, processes the data, and then provides the data for use by third parties. Each of the steps involved in these processes is set forth in more detail below. The description of processes is broken down into three areas: data collection, data processing, and use of collected data.
[0030] Data Collection. FIG. 2 illustrates one embodiment of the process of 3D reconstruction and storage of 3D reconstructions. A client 205 executed on a user's device may initialize data collection by requesting data from device sensors. The client 205 in this case may be any application which uses the sensor data in its operation while also streaming it to a server, or can be a dedicated application for data collection, or functionality embedded with the operating system or server drivers which autonomously stream data to the service without user initiation. A client application streams sensor data and other context data 235 collected from the user's device to the service 208, such as to a service's receiver module 210. Device sensors that provide data for 3D reconstruction include various 2D and/or 3D image sensors such as cameras, camera arrays, RGB-D sensors, and/or the like. In addition to the image sensor data, any or all other potentially available sensor data may be sent to the service and may assist in context recognition and data classification. Examples of other sensors may include, but are not limited to, GPS, accelerometer, digital compass, gyroscope, and the like. In addition to sensor data, additional data collected from the device which may help in context recognition or in the classification may be communicated to the service, such as cell ID, current browser history, active applications, and the like. In some cases, image data 240 may be passed to a 3D reconstruction module 215, which may perform 3D reconstruction 245.
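Purely as an illustrative assumption, the client-to-service upload described above might take the form of a JSON payload such as the following sketch; the endpoint URL and every field name are invented for illustration and are not part of the disclosure.

```python
import json
import time
import urllib.request

# Hypothetical one-sample upload of sensor and context data (FIG. 2, 235).
payload = {
    "user_id": "user-123",
    "timestamp": time.time(),
    "rgbd_frame": "<base64-encoded RGB-D frame>",   # 2D/3D image sensor data
    "gps": {"lat": 60.17, "lon": 24.94},
    "cell_id": "310-260-1234",
    "browser_history_sample": ["https://example.com"],
    "active_applications": ["ar_client"],
}
req = urllib.request.Request(
    "https://service.example.com/collect",           # invented endpoint
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # left commented: the endpoint does not exist
```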
[0031] The receiver module 210 may communicate any location and/or context data 250 (which may include GPS, cell ID, browser history, etc.) to a data manager module 220 of the service. In some embodiments, this information may be passed 255 to a context recognition module 225 of the service, which may analyze 260 the data to detect the user's current activity and/or location, which may then be communicated back 265 to the data manager 220. The data manager 220 may use the detected activity and/or location to retrieve 270, from a database 230 or the like, earlier 3D reconstructions potentially matching the environment the user is currently in, or that were captured from close proximity thereto. Once at least one potentially matching previously-stored 3D reconstruction has been retrieved 275, the service compares 285 the new (e.g., current) 3D reconstruction 280 with the pre-existing ones 275, in order to detect if the current environment was previously captured, and to align a new 3D reconstruction with an old reconstruction such that any differences can be detected, and missing or changed areas updated within the database. In some embodiments, for example if there is no existing 3D reconstruction, the data manager 220 may request 290 that a new location be created in the database. For either new locations or existing ones, based on the new 3D reconstruction, the database 230 may be updated 295.
[0032] To detect if the new 3D reconstruction matches an old reconstruction and to detect changed areas, a first step may be to find a best matching alignment between the old and new reconstructions. Once the optimal matching alignment is found, the deviations between reconstructions are examined by the service to decide if the old and new reconstructions represent the same environment, or if they are from different locations. The alignment of the 3D reconstructions may be performed with an iterative closest point (ICP) algorithm, or any other more advanced point cloud / 3D reconstruction registration method as described in publications such as Shiratori et al., "Efficient Large-Scale Point Cloud Registration Using Loop Closures", 2015 International Conference on 3D Vision (3DV), IEEE, 2015, p. 232-240, and Eckart et al., "MLMD: Maximum Likelihood Mixture Decoupling for Fast and Accurate Point Cloud Registration", 2015 International Conference on 3D Vision (3DV), IEEE, 2015, p. 241-249.
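As one concrete but non-authoritative realization of the alignment step, the Open3D library offers an off-the-shelf point-to-point ICP registration; the sketch below assumes both reconstructions are available as N x 3 NumPy point arrays, and the correspondence threshold is an arbitrary choice.

```python
import numpy as np
import open3d as o3d

def align_reconstructions(new_pts: np.ndarray, old_pts: np.ndarray,
                          threshold: float = 0.05):
    """Align a new reconstruction to a stored one with point-to-point ICP.

    Returns the 4x4 rigid transformation and the inlier RMSE, which the
    service could feed into the same/different-environment decision."""
    source = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(new_pts))
    target = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(old_pts))
    result = o3d.pipelines.registration.registration_icp(
        source, target, threshold, np.eye(4),
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation, result.inlier_rmse
```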
[0033] When the optimal matching alignment for 3D reconstructions is found, the reconstructions can be compared. One approach for comparison is direct comparison of cumulative deviation of data points and the use of some threshold value to determine if the environment is regarded as new or old. Another, more advanced approach may first segment both 3D reconstructions to separate static elements such as floors, ceilings, and walls from dynamic elements such as pieces of furniture and random objects. Then, an amount of deviation between the reconstructions can be weighted so that static elements are required to match more accurately than dynamic elements in order for the algorithm to consider 3D reconstructions to be from the same environment.
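A minimal sketch of the weighted comparison follows; the weights and decision threshold are illustrative assumptions rather than values taken from the disclosure.

```python
import numpy as np

def same_environment(static_dev: np.ndarray, dynamic_dev: np.ndarray,
                     w_static: float = 2.0, w_dynamic: float = 0.5,
                     threshold: float = 0.10) -> bool:
    """static_dev / dynamic_dev: per-point distances (meters) between the
    aligned reconstructions, split by static vs. dynamic segment class.
    Static structure (floors, ceilings, walls) is weighted to require a
    tighter match than movable furniture and random objects."""
    score = ((w_static * static_dev.mean() + w_dynamic * dynamic_dev.mean())
             / (w_static + w_dynamic))
    return score < threshold

# e.g. tight static match, loose dynamic match -> same environment
print(same_environment(np.array([0.02, 0.03]), np.array([0.20, 0.35])))
```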
[0034] A comparison result may be used to avoid redundant copies of the same environment being stored in the database. Furthermore, comparison may help to align several 3D reconstructions with partial overlap together, thus resulting in a larger area being continuously covered. Comparison of 3D reconstructions can also help in further context recognition. In some cases, the service may compare 3D reconstructions associated with different users with the same location. For example, based on location information the service may detect that, for a particular location where the new 3D reconstruction is captured, a different user has already provided a matching 3D reconstruction. These 3D reconstructions can then be compared, and, if determined to represent the same location, the space can be cross-referenced between the users. For example, if the environment has already been labeled as the living room of a user B, then the service can expect that for a user A who provided new 3D reconstruction, this environment could be the living room of a friend or a relative. If there is additional context information available, such as social connections from Facebook or a similar service, the relationship between the two users may be verified as correct, and very accurate labeling and cross-referencing established between the users and the environments.
[0035] Comparing 3D reconstructions and identification of previously existing 3D reconstructions also permits detection of deviations in the environment between different capture sessions. If the service estimates that the deviations between sessions are a result of the removal or addition of one or more dynamic objects, these areas can be marked and used in later processing steps to detect if some specific objects have been removed or added, which then can be used as an additional feature vector element in the user characteristic estimation.
[0036] Data Processing. Once there are 3D reconstructions stored in the database for a user, the service may perform a process for extracting user-specific information from the raw data consisting of 3D reconstructions, sensor data, and other context information linked with the user and environment.
[0037] FIG. 3 illustrates one embodiment of the process of data processing performed by the service in order to extract user and environment information.
[0038] A first step is to classify the 3D reconstruction, such as with a classification module 302 of the service 300. This may operate on a 3D reconstruction retrieved 320 from an environment data and context info module 314 of a database 313 of the service 300. One goal of the classification is to predict which type of space the 3D reconstruction represents and what the relationship is between the user and the space. The type of space may be predicted using any suitable known approach for 3D reconstruction classification, such as the approach presented by Swadzba and Wachsmuth in "A detailed analysis of a new 3D spatial feature vector for indoor scene classification.", Robotics and Autonomous Systems, 2014, 62.5: 646-662, among others. Classification may determine the space type 322, e.g., kitchen, living room, bedroom, backyard, and the like. Based on the predicted space type, context data (e.g., location, activity, time of day, ambient sound characteristics, Bluetooth identifiers in the vicinity, and the like), and the frequency of user visits in the same space, the association between the space and the user may be predicted (e.g., a room of the user's home, a room at the home of a user's friend, a room at the user's workplace, etc.).
[0039] Exemplary context recognition techniques that may be used with mobile devices in embodiments described herein include those described in Hoseini-Tabatabaei et al., "A survey on smartphone-based systems for opportunistic user context recognition.", ACM Computing Surveys
(CSUR), 2013, 45.3 : 27. The approaches described therein may also be used here, since the same sensors and other data sources can be expected to be available from the situation when the environment has been captured. As described in Hoseini-Tabatabaei et al., context recognition may operate based on first transforming raw sensor data into features. Features captured from the raw sensor data are collected to feature vectors to be input into a predictive model constructed with a machine learning approach. The predictive model outputs the predicted classification for the context, or as in the present disclosure, context features can be composited directly together with the space type prediction and used for predicting the classification of the association between the user and the space, e.g., user's living room, user's office, living room of another person, public cafe, and/or the like.
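As a toy, non-authoritative illustration of feeding composited features to such a predictive model, the sketch below uses scikit-learn logistic regression; the feature encoding, class labels, and training rows are all invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Assumed feature layout: [P(living room), P(office), hour-of-day / 24,
# visits per week], compositing the space-type prediction with context data.
X_train = np.array([
    [0.9, 0.1, 20 / 24, 7.0],   # frequent evening presence
    [0.1, 0.9, 10 / 24, 5.0],   # weekday office-hours presence
    [0.8, 0.2, 21 / 24, 0.5],   # rare evening visits
])
y_train = ["own_living_room", "own_office", "other_persons_living_room"]

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(model.predict([[0.85, 0.15, 19 / 24, 6.0]]))  # likely own living room
```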
[0040] In many cases, mere context information which is available just from the environment capture session may not provide enough information to enable sufficiently accurate prediction of the association between the user and the space. In these cases, the classification may be improved by accumulating similar unsuccessful classification events and examining them as a time series, or by prompting the user for assistance in the classification when an automatic classification fails. If the user is asked directly to assist the classification, handling of unknown associations may be readily accomplished. However, in a true opportunistic approach, where direct user involvement is preferably avoided, the time series inspection may provide further insight to the unknown associations. When unknown events are collected, the means and variances of different time series features can be extracted and used for another predictive model, which may also be constructed using a machine learning approach. For example, frequency, duration, and variation between occurrences, as well as detected social and/or activity contexts in the occurrences may provide additional insights into the association between the user and the space.
[0041] For detected spaces that fall into categories of a space where a single person is a primary owner (e.g., a living room or a bedroom), but that are not the user's own, the system may attempt to identify the primary owner of the space by cross-referencing the spaces linked with other users.
Information from social networks potentially available from other services, such as Facebook,
LinkedIn, and/or the like, can be used to improve and/or limit the cross-referencing search.
[0042] The classification of the association between the user and the space need not be perfect, as in addition to the classification, confidence (e.g., estimation of how accurate the classification is) may be made available for recipients of the service outputs (e.g., other applications or third parties which use the information provided by the service). This may permit the application logic to handle uncertainty and unknown situations in a way that is best for a particular application.
[0043] Once the space type and the association with the user have been predicted, a subsequent step is the decision whether to further analyze and store the 3D reconstruction or discard it as irrelevant. This may be managed, for example, by a 3D reconstruction manager module 304, which may decide based on the class 322 and retrieved context information related to the 3D reconstruction 324 (e.g., location, activity, etc.). For example, spaces that do not include personal elements, such as public spaces, unrecognized outdoor areas, or the like, may be discarded 326. If the space is regarded as meaningful, it is passed 328 to a next stage of the analysis which segments 330 (e.g., with a segmentation module 306) the 3D reconstruction in order to detect individual objects in the space.
[0044] Segmented elements 332 of the 3D reconstruction may be used for classification of objects in the space 334 by an object classification module 308, and later in the process for object recognition by an object and space recognition module 310. Object recognition may compare segmented elements 336 with an object and space reference database 318 having specific reference product 3D models and existing 3D assets. An objective may be to identify whether the object in the user's space is of a specific model or brand, and to maintain references to existing 3D assets representing the object. Object recognitions may be stored 340 in the database 313.
[0045] Once information about the space and any or all objects in the space has been extracted, as in the previously described steps, the gathered information may be stored and used for predicting 344 user characteristics, such as by a user classification module 312. User characteristics may include, but are not limited to, classification of the user into various demographics, psychographics, and/or behavioral segments, or the like. For classification, any or all existing user-specific data 342 is processed 344 for use as an input for a predictive model (such as in feature vector form). The resulting classifications 346 may be stored in the database.
[0046] A predictive model may be created as a separate preprocessing step using a machine learning approach. The predictive model may be generated using known machine learning approaches or combinations of various known approaches. Using supervised learning methods, the predictive model may be trained using separately collected known samples representing various classification classes, and when the prediction accuracy is estimated to be sufficiently accurate, model adjustment and training may cease and the model transferred for use predicting the classes of samples from unknown classes. Using unsupervised learning methods, the collected samples forming the feature vectors may be fed to the learning algorithm without any class descriptions, as the task of the learning algorithm is to cluster the data such that similar occurrences become clustered together. The clustering proposed by the algorithm may thus divide the data into logical groups which may be studied and identified.
[0047] In cases such as extracting socio-demographic characteristics of users from collected data, supervised learning approaches may be preferable. With such approaches, first a training data set may be constructed by collecting data and labeling resulting feature vectors to represent specific socio-demographic classes (e.g., gender, age, top-five personality traits, average income range, marital status, number of kids, employment status, and/or the like). Construction of the training data may be simplified by combining socio-demographic characteristics provided by existing services, such as social media services, with data collected from the user. An example of one feature vector for one space associated with one user may unfold to a structured list with items such as, but not limited to: type of space, owner of space, frequency of visits, variance of frequency of visits, average time spent on space, variation of times spent in the space, objects in the space (e.g., type, specific model, dimensions, etc.), dimension of the space, average context of user presence in the space (e.g., activity, social context, etc.), variation of objects between visits, and the like.
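A minimal sketch of one such per-space feature record is shown below; the field names simply mirror the structured list above and are assumptions, not a format prescribed by the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class SpaceFeatures:
    space_type: str                         # e.g. "kitchen"
    space_owner: str                        # e.g. "self", "friend"
    visit_frequency: float                  # visits per week
    visit_frequency_variance: float
    avg_time_in_space: float                # minutes per visit
    time_in_space_variance: float
    space_dimensions: Tuple[float, float, float] = (0.0, 0.0, 0.0)  # meters
    objects: List[dict] = field(default_factory=list)  # type, model, size
    avg_presence_context: str = ""          # activity / social context
    object_churn: float = 0.0               # object variation between visits
```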
[0048] For each user, feature vectors may be compiled for any or all spaces with which the user is associated. In the training phase, the machine learning algorithm may tune the predictive model using the labeled training data as an input. For the machine learning method, any suitable technique can be used, from logistic regression to multilayer neural networks, support vector machines (SVMs), and ensemble methods such as Gradient Boosted Decision Trees (GBDT), used by Pennacchiotti and Popescu (see Pennacchiotti and Popescu, "A Machine Learning Approach to Twitter User Classification.", ICWSM, 2011, 11.1 : 281-288) for predicting socio-demographic characteristics of users based on their Twitter messages, to complex deep learning approaches combining several methods in several steps.
[0049] After the training phase, part of the training data reserved for validation may be used to test the performance of the resulting predictive model. The training and testing may be performed iteratively comparing prediction accuracy between different methods and/or parameters used in the training. In some embodiments, different predictive models may be chosen to predict different aspects of the user characteristics, where specific models improve the prediction performance for specific aspects.
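The following sketch illustrates one possible training-and-validation round using gradient boosted decision trees, one of the method families named above; the synthetic data merely stands in for labeled per-space feature vectors and is not real training data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))            # 200 stand-in feature vectors
y = (X[:, 0] + X[:, 3] > 0).astype(int)   # stand-in demographic label

# Reserve part of the labeled data for validation, as described above.
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25,
                                            random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)
print("validation accuracy:", accuracy_score(y_val, model.predict(X_val)))
```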
[0050] Once the full predictive model or collection of predictive models is determined, the system may predict, for new users whose socio-demographic characteristics are unknown, the same socio-demographic classifications that were provided as desired outputs during training, such as gender, age, top-five personality traits, average income range, marital status, number of kids, employment status, and/or the like. In addition to the predicted classifications, the predictive model may estimate the confidence of the estimation. Estimation confidence, together with the estimated predictive model accuracy, may be provided for users of the system, so that an application using the data provided by the system may determine how to handle classifications with high uncertainty.
[0051] Any or all of the user characteristics predicted and collected during the process may be stored 346 to the database (such as to a user characteristics module 316) for later reference.
[0052] Use of Collected Data. Extracted data from the environment may be organized in a relational manner connecting user characteristics with the users, users with the environments, environments with the environment types, objects with the environments, objects with the 3D assets, and/or the like, so that information may be effectively queried using techniques analogous to those used for querying relational databases. The service may provide a query interface for the clients which can be local applications, applications running on external devices, or any other third-party service.
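Purely as an illustrative assumption, such a relational organization and query interface might resemble the following sketch; the table and column names are invented, and the query mimics the kind of third-party request discussed later.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE users(id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE environments(id INTEGER PRIMARY KEY, user_id INTEGER,
                          env_type TEXT, relation TEXT);
CREATE TABLE objects(id INTEGER PRIMARY KEY, env_id INTEGER, label TEXT);
""")
db.execute("INSERT INTO users VALUES (1, 'user_a')")
db.execute("INSERT INTO environments VALUES (1, 1, 'kitchen', 'own_home')")
db.execute("INSERT INTO objects VALUES (1, 1, 'espresso_maker')")

# Example query: users who have an espresso maker in their own kitchen.
rows = db.execute("""
    SELECT u.name
    FROM users u
    JOIN environments e ON e.user_id = u.id AND e.relation = 'own_home'
    JOIN objects o ON o.env_id = e.id
    WHERE e.env_type = 'kitchen' AND o.label = 'espresso_maker'
""").fetchall()
print(rows)  # [('user_a',)]
```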
[0053] Variations. In one embodiment of the systems and methods disclosed herein, the 3D reconstruction of the environment is performed on the client device instead of on the service side. Once 3D reconstruction on the client side is determined to be complete (or sufficiently complete), it may be sent to the service along with data from relevant sensors and context information.
[0054] Another embodiment of the systems and methods may work with individual objects. As full 3D reconstruction of environments and detection of the objects may be difficult, considering for example the limitations caused by sub-optimal network bandwidth, requirements of the systems and methods may be reduced by working at an individual object level. For example, object reconstruction can be performed on a client device and optimized reconstructions of the objects uploaded to the service side.
[0055] Example Use Case: User Assistance in Unfamiliar Environments. In some circumstances, the user has been providing environment data to the system for some time. Based on the environment classification and context information, the system has a comprehensive collection of environment descriptions of environments that are familiar to the user. An application dedicated to user assistance in unfamiliar environments may continuously query a current environment type from the system in order to detect when the user is in an unfamiliar environment, or in some embodiments the user may initiate an assistance application explicitly.
[0056] For many scenarios, simply examining the user's current GPS location and comparing it with the user's location history may be sufficient to identify when the user is in an unfamiliar environment. However, on many occasions when the user is in public or semi-public places, the GPS location alone may not fully reveal if the environment and objects in it are familiar or unfamiliar to the user. For example, subway stations exist at numerous locations, but a user who daily travels between two particular stations can also operate a ticket machine in any other station because the ticket machines are identical. In contrast, in a large office building for which (generally) only one GPS location is determined, a user may be familiar with objects in one break room, but may be unfamiliar with objects present in another break room on a different floor.
[0057] When a need for assistance is detected or initiated by the user, the assistance application may query the service for the type of the current environment, objects found in the environment, and object characteristics such as brands and language of manufactured products, and/or the like. Based on the environment and object description(s) received from the service, the assistance application may use augmented reality to replace or overlay unfamiliar objects with familiar virtual objects, replace labels on objects to provide all texts and icons localized for the user, search the internet and augment user guides and instructions for objects based on object identification, or, based on context information, predict what actions the user is performing and highlight objects relevant for that action in the environment.
[0058] Matching of Virtual Experiences with the Physical Environments. In this scenario, the systems and methods disclosed herein are used for finding suitable environments for virtual experiences or finding suitable virtual experiences that match the characteristics of the current environment. Matching may occur according to several characteristics.
[0059] For example, a virtual experience may require certain objects to be present; an augmented reality game may require a common household object to be available, such as a soda can, which is used for anchoring the virtual content or operates as a physical proxy object enabling user interaction and haptic feedback. A virtual experience may also benefit from the presence of certain shapes in addition to specific objects. For example, the application can query 3D reconstructions of objects found in some environment and compare them with a desired haptic feedback shape, such as a tabletop or similar surface where a virtual control surface is placed, or a shape which could be used for anchoring virtual content, such as a chair for a virtual character to sit on.
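A crude sketch of such shape matching is shown below; it compares object bounding-box dimensions against a desired proxy shape, with the tolerance chosen arbitrarily for illustration.

```python
def find_proxy(objects, desired_dims, tol=0.15):
    """objects: list of (label, (w, d, h)) bounding boxes in meters;
    desired_dims: (w, d, h) of the shape the experience needs.
    Returns the first object within the relative tolerance, else None."""
    for label, dims in objects:
        if all(abs(a - b) / b <= tol for a, b in zip(dims, desired_dims)):
            return label
    return None

# e.g. searching for a tabletop-like surface of roughly 1.2 x 0.6 x 0.75 m
print(find_proxy([("chair", (0.5, 0.5, 0.9)), ("desk", (1.3, 0.6, 0.74))],
                 (1.2, 0.6, 0.75)))  # -> "desk"
```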
[0060] Virtual Product Placement. In some scenarios, a user is profiled based on environments, and objects found in the environments, with which he or she has personal connections. A first step in the 3D reconstruction analysis is the detection of the space type and determining associations between the user and the space. Space type classification can be performed by analyzing the geometry of the 3D reconstruction, for example with an approach similar to that described in Swadzba and Wachsmuth. Reliability of the space type classification may be further improved by combining context information with the 3D reconstruction as a classification input. From context data, it is also possible to predict the association between the space and the user (e.g., the room is located in the location identified as the user's home or is part of the location identified as the user's workplace).
[0061] Some or all of the information extracted from the user and the 3D reconstruction may be combined after the object recognition phase into a feature vector, which can then be used as an input for a user characteristics classification prediction.
[0062] After the user characteristics classification, the application may use the information for selecting content most likely matching the user's interests, such as placing new virtual objects in the augmented environment with the possibility for the user to buy them, or completely replacing objects with similar virtual objects to provide a visualization of how the user's environment would look with alternative furniture or decoration objects.
[0063] FIG. 5 depicts one embodiment of the herein disclosed systems and methods. In some embodiments, a user (equipped with device sensors 505, AR/VR social media client 510, etc.) provides RGB-D data (or other sensor data) 530, and possibly other context data 535 to a social media service 515, and the social media service 515 creates, collects, and classifies 540 stored 3D reconstructions and performs further processing of the data in order to extract user specific information. In some embodiments, advertisers 520 may request targeted advertisements 545 to be sent either to a user or users that match a desired profile specified by the advertiser 560 (e.g., users who live in a certain area and who have product x in their living room) or a user connected with a first user fulfilling the profile specified by the advertiser 555 (e.g., friends of a user who owns product x). These targeted advertisements may be presented responsive to matching 550 carried out by the social media service 515.
[0064] For example, the user's HMD device may use sensors to capture data related to a current location of the user. Such sensor data may be communicated through the social media client, which may further incorporate additional context data (e.g., location, location's relation to user, activity, etc.). The service backend (e.g., social media service) may receive the sensor and context data, and use such data to create 3D reconstructions of the user's location/environment. The service may classify the 3D reconstruction based at least in part on the received sensor and context data. The 3D reconstruction may be segmented by the service to detect specific objects in the 3D reconstruction. The service may also extract user characteristics based on the detected objects, collected information, other information from the social media service, and/or the like. Such information may be stored in a database for use in responding to third party requests (e.g., from advertisers, etc.).
[0065] At some other time (or possibly pre-existing or concurrently), the service may receive a targeted advertising request from a third-party advertiser. The service may match user profiles, objects, and spaces with the targeted advertisement request. For example, the advertiser may communicate information identifying an advertisement-relevance-indicator object to the service. Advertisement-relevance-indicator objects may represent specific items detectable in a user's environment/location which may be "triggers" for an advertisement. As examples, a wok may serve as an advertisement-relevance-indicator object with respect to an advertisement for soy sauce, and an espresso maker may serve as an advertisement-relevance-indicator object with respect to an advertisement for gourmet coffee beans. Once a match is detected by the service between a user and the targeted advertisement request, the service may cause the relevant advertisement to be presented to the user or users who match the request.
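A minimal sketch of this matching step might look as follows; the advertisement catalog and object labels are illustrative assumptions only.

```python
# Each advertisement maps to the set of advertisement-relevance-indicator
# objects that must all be present in the user's environment.
AD_CATALOG = {
    "soy_sauce_ad": {"wok"},
    "gourmet_coffee_ad": {"espresso_maker"},
    "pie_ad": {"key_lime_pie"},
}

def match_ads(detected_objects):
    detected = set(detected_objects)
    return [ad for ad, indicators in AD_CATALOG.items()
            if indicators <= detected]

print(match_ads(["wok", "key_lime_pie"]))  # -> ['soy_sauce_ad', 'pie_ad']
```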
[0066] In some embodiments, advertisement requests may be structured such that the social media aspect of the service is of greater importance. For example, based on user profiles and data extracted from users' environments, the service may present advertisements to particular
"primary" users who directly match a profile set forth in the targeted advertisement request (e.g., primary user has product x in their living room and lives in a specified area), as well as "secondary" users who are connected with the primary users (e.g., friends of a user who owns a product x and live in the specified area). Such connected user advertisements may be beneficial by permitting advertisers to reach users who may not themselves be identifiable as useful advertising targets, but who because of their user connections may still be useful targets of an advertisement. For example, a secondary user who does not own a video game console may not match a targeted advertisement request for a video game on that console. But if the secondary user is connected with a primary user whose user profile indicates the presence in their living room of the video game console, the secondary user may still be a valuable target of the advertisement (e.g., even though secondary user cannot use video game at their home, they could use it at the primary user's home).
[0067] FIG. 6 depicts one embodiment of the herein disclosed methods. A user may operate
AR/VR equipment in their home environment (e.g., kitchen). The AR/VR equipment may generate a 3D reconstruction of the current environment and may receive context data from the equipment and/or a linked social media service 605. In the generated 3D reconstruction 610 of the home environment, classification 615 and segmentation 620 may be utilized to detect 625 specific objects in the user's environment. For example, in a kitchen environment, the 3D reconstruction segmentation may detect a wok, an espresso maker, and a key lime pie. These objects may also be classified and recognized 630. The context information received by the system may determine the environment type, here a kitchen, and the relation with the user, here that it is the user's kitchen.
The segmented objects and the environment type and relation with the user may be stored in a data storage by the system 635. At some other time, user profiling may occur, in response to a request by a third-party advertiser 640. In such a request, the stored data may be provided to and/or made accessible by the third-party advertiser. In one embodiment, the system could receive information identifying an advertisement-relevance-indicator object from the third party. Such an advertisement-relevance-indicator object may relate to information regarding an intended audience of a first advertisement. For example, the advertisement-relevance-indicator object may be useful for identifying users who should receive a specific advertisement. In the embodiment of FIG. 6, for example, the advertisement-relevance-indicator object may be a key lime pie (or any pie). In response to the presence of the pie in the user's environment, an advertisement for a particular brand or type of pie may be presented in the user's AR/VR experience 645. In one embodiment, the advertisement may be cached or otherwise retained until such time as the user is in a grocery shopping environment. In one embodiment, the advertisement may comprise an overlay of the user's view of a freezer section of a grocery store, highlighting the advertised brand or type of pie with a coupon or other discount. In other embodiments, a wok may be an advertisement-relevance-indicator object related to an advertisement for soy sauce. In other embodiments, an espresso maker may be an advertisement-relevance-indicator object related to an advertisement for gourmet coffee beans.
[0068] In one embodiment, there is a method of targeted advertising. The method may include receiving information or gathered data, such as from an HMD of a first user 705 at a first-user-controlled first location. The data may indicate the presence of a first object at the first-user-controlled first location 710, or this may be otherwise determined. Responsive to detecting that the first user is at a second location 715, a targeted advertisement for a second product may be selected 720, such as based on the HMD-gathered data. The selected targeted advertisement for the second product may then be presented or otherwise displayed 725, such as via the HMD, to the first user while the first user is at the second location.
[0069] In some alternative embodiments, the advertisement may be presented to the user at a social media site. In other alternative embodiments, the advertisement may be presented to the user within the user's current environment.
[0070] FIG. 7B illustrates another embodiment of the herein disclosed methods. In one embodiment, the method is related to presenting targeted advertising. The method may comprise receiving information derived from sensor data of a head-mounted display camera of a first user at a first location 750. Once received, the system may determine an identity of at least one object at the first location based at least in part on the received information derived from sensor data of the head-mounted display camera 755. The system may then receive, such as from a third-party advertiser, information regarding an intended audience for a first advertisement comprising information identifying at least one advertisement-relevance-indicator object 760. Responsive to a determination that the identified object at the first location is an instance of an advertisement-relevance-indicator object 765, the system may cause the first advertisement associated with the information regarding at least one advertisement-relevance-indicator object to be presented to the first user 770.
[0071] In some embodiments, the advertisement comprises a 3D model of a product. In some embodiments, the model is presented to the first user at the first location in the HMD. In some embodiments, the 3D model is presented as a virtual overlay of the determined at least one object. In some embodiments, the model is presented to the first user at a second location in the HMD. In some embodiments, the advertisement is presented to the first user at a social media site.
[0072] In some further embodiments, the method may require identification of at least two objects and association with at least two advertisement-relevance-indicator objects. In some embodiments, a presented advertisement may be for a third object distinct from the at least two identified objects.
[0073] In an embodiment, there is a method of presenting targeted advertising, comprising: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least one object at the first location based at least in part on the received information derived from sensor data of the head-mounted display camera; receiving information regarding an intended audience for a first advertisement comprising information regarding at least one advertisement-relevance-indicator object; and responsive to a determination that the identity of the at least one object at the first location matches the information regarding at least one advertisement-relevance-indicator object, causing the first advertisement associated with the information regarding at least one advertisement-relevance-indicator object to be presented to the first user. The method may include wherein the advertisement comprises a 3D model of a product. The method may include wherein the model is presented to the first user at the first location in the HMD. The method may include wherein the 3D model is presented as a virtual overlay of the determined at least one object. The method may include wherein the model is presented to the first user at a second location in the HMD. The method may include wherein the advertisement is presented to the first user at a social media site.
[0074] In an embodiment, there is a method of presenting targeted advertising, comprising: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least two objects at the first location based at least in part on the received information derived from sensor data of a head-mounted display camera; receiving information regarding an intended audience for a first advertisement comprising information regarding at least two advertisement-relevance-indicator objects; and responsive to a determination that the identity of the at least two objects at the first location match the information regarding at least two advertisement-relevance-indicator objects, causing the first advertisement associated with the information regarding the at least two advertisement-relevance-indicator objects to be presented to the first user. The method may include wherein the first advertisement is for a third object of a different kind than the at least two objects.
[0075] In an embodiment, there is a method of presenting targeted advertising, comprising: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least a first object at the first location based at least in part on the received information derived from sensor data of the head-mounted display camera; identifying a relationship between the first user and the first location; responsive to identifying the relationship as a private relationship, modifying a profile associated with the first user based on the identity of the at least one first object; receiving information regarding an intended audience for a first advertisement comprising audience profile information and information regarding at least one advertisement-relevance-indicator object; receiving information derived from sensor data of the head-mounted display camera of the first user at a second location; determining an identity of at least a second object at the second location based at least in part on the received information derived from sensor data of the head-mounted display camera; and responsive to a determination that the audience profile information matches the profile associated with the first user and that the identity of at least the second object at the second location matches the information regarding at least one advertisement-relevance-indicator object, causing the first advertisement to be presented to the first user within the second location. The method may include wherein the second location is either the first location or another location.
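The profile-gated variant above may be illustrated with the following hypothetical sketch; UserProfile, update_profile, and maybe_present are invented names, and treating the audience profile as a subset test over owned objects is an assumed simplification of profile matching.

```python
# Hypothetical sketch of the profile-gated flow: objects seen in a *private*
# space update the user profile, and the ad is shown only when both the
# audience profile and an indicator object at the second location match.

from dataclasses import dataclass, field

@dataclass
class UserProfile:
    owned_objects: set = field(default_factory=set)

def update_profile(profile: UserProfile, obj: str, relationship: str):
    # Only objects seen in a location with a private relationship
    # (e.g., the user's home) modify the profile.
    if relationship == "private":
        profile.owned_objects.add(obj)

def maybe_present(profile: UserProfile, audience_profile: set,
                  indicator_objects: set, objects_at_second_location: set) -> bool:
    # Both conditions must hold: the audience profile matches the user's
    # profile, and an indicator object is identified at the second location.
    profile_match = audience_profile <= profile.owned_objects
    indicator_match = bool(indicator_objects & objects_at_second_location)
    return profile_match and indicator_match

profile = UserProfile()
update_profile(profile, "espresso maker", relationship="private")
if maybe_present(profile,
                 audience_profile={"espresso maker"},
                 indicator_objects={"coffee shelf"},
                 objects_at_second_location={"coffee shelf", "checkout"}):
    print("Present the first advertisement within the second location")
```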
[0076] In an embodiment, there is a method of extracting information from the setup of an immersive experience, comprising: receiving information derived from sensor data of a head-mounted display camera of a user device of a first user at a first location; generating a first 3D reconstruction of the first location based at least in part on the received information derived from sensor data of the head-mounted display camera of the first user at the first location; comparing the first 3D reconstruction of the first location to a second 3D reconstruction of the first location, wherein the second 3D reconstruction was previously captured; classifying the first 3D reconstruction of the first location based at least in part upon the reconstruction geometry; identifying at least one object in the first 3D reconstruction of the first location based at least in part upon the received information derived from the sensor data of the head-mounted display camera of the first user; estimating at least one user characteristic at least in part from the received information; and providing a tailored immersive experience based at least in part on the first and second 3D reconstructions of the first location based at least in part on the at least one object identified and the at least one estimated user characteristic. The method may include wherein the received information further comprises non-camera sensor data. The method may include wherein a non-camera sensor of the user device comprises at least one of a GPS, an accelerometer, a digital compass, or a gyroscope. The method may further comprise receiving, in addition to the sensor data, non-sensor data collected from the user device. The method may include wherein non-sensor data comprises at least one of current browser history or active applications. The method may include wherein classifying the first 3D reconstruction is further based at least in part on other received sensor and context data. The method may include wherein comparing further comprises a direct comparison of the cumulative deviation of data points in the first and second 3D reconstructions, and using a threshold value to determine if the environment in each reconstruction is the same. The method may include wherein comparing further comprises: segmenting the first and second 3D reconstructions; separating static elements from dynamic elements in each of the first and second 3D reconstructions; and weighting a deviation between the first and second 3D reconstructions such that static elements need to match more accurately than dynamic elements in order for the 3D reconstructions to be from the same environment. The method may include wherein segmented elements of each 3D reconstruction are used for classification of objects in the 3D reconstruction space and for object recognition. The method may include wherein object recognition comprises comparing the segmented elements with a database consisting of specific reference product 3D models and existing 3D assets. The method may include wherein classification determines a space type of the first 3D reconstruction. The method may include wherein the space type comprises one of: kitchen, living room, bedroom, backyard, office, break room, cafe, dining room. The method may further comprise predicting an association between the first user and the space of the first 3D reconstruction based at least in part on the space type, at least some context data, and a frequency of user visits in the same space.
The method may include wherein the at least some context data comprises at least one of: location, activity, time of day, ambient sound characteristics, Bluetooth identifiers in the vicinity. The method may include wherein classification further comprises accumulating a plurality of unsuccessful classification events and examining them as a time series. The method may include wherein tailoring the immersive experience comprises at least one of: virtual product placement and advertisement; and augmentation of the environment to assist the user.
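The two comparison strategies described above (direct cumulative deviation against a threshold, and segmented comparison with heavier weighting of static elements) might look as follows. This is a sketch under assumed inputs: the reconstructions are taken to be pre-registered arrays of corresponding points (a real system would first register the point clouds, e.g., with ICP), and the weights and threshold are illustrative values, not values from the disclosure.

```python
# Sketch of the two reconstruction-comparison strategies. recon_a and
# recon_b are (N, 3) arrays of corresponding points.

import numpy as np

def same_environment_direct(recon_a: np.ndarray, recon_b: np.ndarray,
                            threshold: float = 0.05) -> bool:
    """Direct comparison: mean deviation of data points vs. a threshold."""
    deviation = np.linalg.norm(recon_a - recon_b, axis=1).mean()
    return deviation < threshold

def same_environment_weighted(static_a: np.ndarray, static_b: np.ndarray,
                              dynamic_a: np.ndarray, dynamic_b: np.ndarray,
                              static_weight: float = 4.0,
                              dynamic_weight: float = 1.0,
                              threshold: float = 0.05) -> bool:
    """Segmented comparison: static elements (walls, fixtures) must match
    more accurately than dynamic elements (movable objects) for the two
    reconstructions to be judged the same environment."""
    static_dev = np.linalg.norm(static_a - static_b, axis=1).mean()
    dynamic_dev = np.linalg.norm(dynamic_a - dynamic_b, axis=1).mean()
    weighted = static_weight * static_dev + dynamic_weight * dynamic_dev
    return weighted / (static_weight + dynamic_weight) < threshold
```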
[0077] In an embodiment, there is a method of extracting information from the setup of an immersive experience by: receiving sensor and context data from a user; performing a 3D reconstruction based on the received sensor data; comparing the 3D reconstruction with at least one previously stored 3D reconstruction; classifying the 3D reconstruction based at least on a reconstruction geometry and at least one other received sensor or context data element; segmenting the 3D reconstruction; detecting objects from the segmented 3D reconstruction; classifying and recognizing the detected objects from the 3D reconstruction; and extracting at least one user characteristic. The method may include wherein the sensor data comprises at least 3D image data from a camera of a head mounted display. The method may include wherein the extracted at least one user characteristic is used to tailor an immersive experience. The method may include wherein the extracted at least one user characteristic is used for precision targeting of overlay product placements. The method may include wherein the extracted at least one user characteristic is used for highly targeted advertisements for goods or services.
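The ordering of steps in this pipeline may be summarized in code. The sketch below is a runnable stand-in: only the sequence of operations reflects the method above, while every function body is a trivial placeholder and all names are invented for illustration.

```python
# Runnable stand-in for the extraction pipeline above.

def reconstruct_3d(sensor_data):
    # 3D reconstruction from HMD camera data (placeholder).
    return {"points": sensor_data.get("depth_frames", [])}

def matches_stored(recon, stored_reconstructions):
    # Comparison against previously stored reconstructions (placeholder).
    return any(recon["points"] == s["points"] for s in stored_reconstructions)

def classify_space(recon, context_data):
    # Classification from reconstruction geometry plus context data.
    return context_data.get("location_hint", "unknown")  # e.g., "kitchen"

def segment(recon):
    # Segmentation of the reconstruction into candidate elements.
    return list(recon["points"])

def detect_and_recognize(segments):
    # Object detection, classification, and recognition on segments.
    return segments

def extract_user_characteristics(sensor_data, context_data, stored_reconstructions):
    recon = reconstruct_3d(sensor_data)
    known = matches_stored(recon, stored_reconstructions)
    space = classify_space(recon, context_data)
    objects = detect_and_recognize(segment(recon))
    # Extracted characteristic: the space type and the objects the user owns.
    return {"space_type": space, "known_environment": known, "objects": objects}

print(extract_user_characteristics(
    {"depth_frames": ["wok", "stove"]},
    {"location_hint": "kitchen"},
    stored_reconstructions=[]))
```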
[0078] In an embodiment, there is a system comprising a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including: receiving information derived from sensor data of a head-mounted display camera of a user device of a first user at a first location; generating a first 3D reconstruction of the first location based at least in part on the received information derived from sensor data of the head-mounted display camera of the first user at the first location; comparing the first 3D reconstruction of the first location to a second 3D reconstruction of the first location, wherein the second 3D reconstruction was previously captured; classifying the first 3D reconstruction of the first location based at least in part upon the reconstruction geometry; identifying at least one object in the first 3D reconstruction of the first location based at least in part upon the received information derived from the sensor data of the head-mounted display camera of the first user; estimating at least one user characteristic at least in part from the received information; and providing a tailored immersive experience based at least in part on the first and second 3D reconstructions of the first location based at least in part on the at least one object identified and the at least one estimated user characteristic.
[0079] In an embodiment, there is a system comprising a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least a first object at the first location based at least in part on the received information derived from sensor data of the head-mounted display camera; identifying a relationship between the first user and the first location; responsive to identifying the relationship as a private relationship, modifying a profile associated with the first user based on the identity of the at least one first object; receiving information regarding an intended audience for a first advertisement comprising audience profile information and information regarding at least one advertisement-relevance-indicator object; receiving information derived from sensor data of the head-mounted display camera of the first user at a second location; determining an identity of at least a second object at the second location based at least in part on the received information derived from sensor data of the head-mounted display camera; and responsive to a determination that the audience profile information matches the profile associated with the first user and the identity of at least the second object at the second location matches the information regarding at least one advertisement-relevance-indicator object, causing the first advertisement to be presented to the first user within the second location.
[0080] In an embodiment, there is a system comprising a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least one object at the first location based at least in part on the received information derived from sensor data of the head-mounted display camera; receiving information regarding an intended audience for a first advertisement comprising information regarding at least one advertisement-relevance-indicator object; and responsive to a determination that the identity of the at least one object at the first location matches the information regarding at least one advertisement-relevance-indicator object, causing the first advertisement associated with the information regarding at least one advertisement-relevance-indicator object to be presented to the first user.
[0081] In an embodiment, there is a system comprising a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least two objects at the first location based at least in part on the received information derived from sensor data of a head-mounted display camera; receiving information regarding an intended audience for a first advertisement comprising information regarding at least two advertisement-relevance-indicator objects; and responsive to a determination that the identity of the at least two objects at the first location match the information regarding at least two advertisement-relevance-indicator objects, causing the first advertisement associated with the information regarding the at least two advertisement-relevance-indicator objects to be presented to the first user.
[0082] In an embodiment, there is a system comprising a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including: receiving sensor and context data from a user; performing a 3D reconstruction based on the received sensor data; comparing the 3D reconstruction with at least one previously stored 3D reconstruction; classifying the 3D reconstruction based at least on a reconstruction geometry and at least one other received sensor or context data element; segmenting the 3D reconstruction; detecting objects from the segmented 3D reconstruction; classifying and recognizing the detected objects from the 3D reconstruction; and extracting at least one user characteristic.
[0083] In an embodiment, there is a method of presenting targeted advertising, comprising: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least two objects at the first location based upon the received information derived from sensor data of a head-mounted display camera; receiving information regarding an intended audience for a first advertisement comprising information regarding at least two advertisement-relevance-indicator objects; responsive to a determination that the identity of the at least two objects at the first location determined based upon the received information derived from sensor data of a head-mounted display camera matches the information regarding at least two advertisement-relevance-indicator objects: causing the first advertisement associated with the information regarding the at least two advertisement-relevance-indicator objects to be presented to the first user. The method may include wherein the first advertisement is for a third object of a different kind than the at least two objects.
[0084] In an embodiment, there is a method of presenting targeted advertising, comprising: receiving information derived from sensor data of a head-mounted display camera of a first user at a first location; determining an identity of at least one object at the first location based upon the received information derived from sensor data of a head-mounted display camera; receiving information regarding an intended audience for a first advertisement comprising information regarding at least one advertisement-relevance-indicator object; responsive to a determination that the identity of the at least one object at the first location determined based upon the received information derived from sensor data of a head-mounted display camera matches the information regarding at least one advertisement-relevance-indicator object: causing the first advertisement associated with the information regarding at least one advertisement-relevance-indicator object to be presented to the first user. The method may include wherein the advertisement comprises a 3D model of a product and the model is presented to the first user at a first location in the HMD. The method may include wherein the 3D model is presented as a virtual overlay of the determined at least one object. The method may include wherein the advertisement is presented to the first user at a social media site.
[0085] Exemplary embodiments disclosed herein may be implemented using one or more wired and/or wireless network nodes, such as a wireless transmit/receive unit (WTRU) or other network entity.

[0086] FIG. 8 is a system diagram of an exemplary WTRU 102, which may be employed as a module of a head mounted display in embodiments described herein. As shown in FIG. 8, the WTRU 102 may include a processor 118, a communication interface 119 including a transceiver 120, a transmit/receive element 122, a speaker/microphone 124, a keypad 126, a display/touchpad 128, a non-removable memory 130, a removable memory 132, a power source 134, a global positioning system (GPS) chipset 136, and peripherals 138 (which may include sensors). It will be appreciated that the WTRU 102 may include any sub-combination of the foregoing elements while remaining consistent with an embodiment.
[0087] The processor 118 may be a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGAs) circuits, any other type of integrated circuit (IC), a state machine, and the like. The processor 118 may perform signal coding, data processing, power control, input/output processing, and/or any other functionality that enables the WTRU 102 to operate in a wireless environment. The processor 118 may be coupled to the transceiver 120, which may be coupled to the transmit/receive element 122. While FIG. 8 depicts the processor 118 and the transceiver 120 as separate components, it will be appreciated that the processor 118 and the transceiver 120 may be integrated together in an electronic package or chip.
[0088] The transmit/receive element 122 may be configured to transmit signals to, or receive signals from, a base station over the air interface 116. For example, in one embodiment, the transmit/receive element 122 may be an antenna configured to transmit and/or receive RF signals. In another embodiment, the transmit/receive element 122 may be an emitter/detector configured to transmit and/or receive IR, UV, or visible light signals, as examples. In yet another embodiment, the transmit/receive element 122 may be configured to transmit and receive both RF and light signals. It will be appreciated that the transmit/receive element 122 may be configured to transmit and/or receive any combination of wireless signals.
[0089] In addition, although the transmit/receive element 122 is depicted in FIG. 8 as a single element, the WTRU 102 may include any number of transmit/receive elements 122. More specifically, the WTRU 102 may employ MIMO technology. Thus, in one embodiment, the WTRU 102 may include two or more transmit/receive elements 122 (e.g., multiple antennas) for transmitting and receiving wireless signals over the air interface 116.
[0090] The transceiver 120 may be configured to modulate the signals that are to be transmitted by the transmit/receive element 122 and to demodulate the signals that are received by the transmit/receive element 122. As noted above, the WTRU 102 may have multi-mode capabilities. Thus, the transceiver 120 may include multiple transceivers for enabling the WTRU 102 to communicate via multiple RATs, such as UTRA and IEEE 802.11, as examples.
[0091] The processor 118 of the WTRU 102 may be coupled to, and may receive user input data from, the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128 (e.g., a liquid crystal display (LCD) display unit or organic light-emitting diode (OLED) display unit). The processor 118 may also output user data to the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128. In addition, the processor 118 may access information from, and store data in, any type of suitable memory, such as the non-removable memory 130 and/or the removable memory 132. The non-removable memory 130 may include random-access memory (RAM), read-only memory (ROM), a hard disk, or any other type of memory storage device. The removable memory 132 may include a subscriber identity module (SIM) card, a memory stick, a secure digital (SD) memory card, and the like. In other embodiments, the processor 118 may access information from, and store data in, memory that is not physically located on the WTRU 102, such as on a server or a home computer (not shown).
[0092] The processor 118 may receive power from the power source 134, and may be configured to distribute and/or control the power to the other components in the WTRU 102. The power source 134 may be any suitable device for powering the WTRU 102. As examples, the power source 134 may include one or more dry cell batteries (e.g., nickel-cadmium (NiCd), nickel-zinc (NiZn), nickel metal hydride (NiMH), lithium-ion (Li-ion), and the like), solar cells, fuel cells, and the like.
[0093] The processor 118 may also be coupled to the GPS chipset 136, which may be configured to provide location information (e.g., longitude and latitude) regarding the current location of the WTRU 102. In addition to, or in lieu of, the information from the GPS chipset 136, the WTRU 102 may receive location information over the air interface 116 from a base station and/or determine its location based on the timing of the signals being received from two or more nearby base stations. It will be appreciated that the WTRU 102 may acquire location information by way of any suitable location-determination method while remaining consistent with an embodiment.
[0094] The processor 118 may further be coupled to other peripherals 138, which may include one or more software and/or hardware modules that provide additional features, functionality and/or wired or wireless connectivity. For example, the peripherals 138 may include sensors such as an accelerometer, an e-compass, a satellite transceiver, a digital camera (for photographs or video), a universal serial bus (USB) port, a vibration device, a television transceiver, a hands free headset, a Bluetooth® module, a frequency modulated (FM) radio unit, a digital music player, a media player, a video game player module, an Internet browser, and the like.
[0095] FIG. 9 depicts an exemplary network entity 190 that may be used in embodiments of the present disclosure. As depicted in FIG. 9, network entity 190 includes a communication interface 192, a processor 194, and non-transitory data storage 196, all of which are communicatively linked by a bus, network, or other communication path 198.
[0096] Communication interface 192 may include one or more wired communication interfaces and/or one or more wireless-communication interfaces. With respect to wired communication, communication interface 192 may include one or more interfaces such as Ethernet interfaces, as an example. With respect to wireless communication, communication interface 192 may include components such as one or more antennae, one or more transceivers/chipsets designed and configured for one or more types of wireless (e.g., LTE) communication, and/or any other components deemed suitable by those of skill in the relevant art. And further with respect to wireless communication, communication interface 192 may be equipped at a scale and with a configuration appropriate for acting on the network side, as opposed to the client side, of wireless communications (e.g., LTE communications, Wi-Fi communications, and the like). Thus, communication interface 192 may include the appropriate equipment and circuitry (perhaps including multiple transceivers) for serving multiple mobile stations, UEs, or other access terminals in a coverage area.
[0097] Processor 194 may include one or more processors of any type deemed suitable by those of skill in the relevant art, some examples including a general-purpose microprocessor and a dedicated DSP.
[0098] Data storage 196 may take the form of any non-transitory computer-readable medium or combination of such media, some examples including flash memory, read-only memory (ROM), and random-access memory (RAM) to name but a few, as any one or more types of non-transitory data storage deemed suitable by those of skill in the relevant art could be used. As depicted in FIG. 9, data storage 196 contains program instructions 197 executable by processor 194 for carrying out various combinations of the various network-entity functions described herein.
[0099] Although features and elements are described above in particular combinations, one of ordinary skill in the art will appreciate that each feature or element can be used alone or in any combination with the other features and elements. In addition, the methods described herein may be implemented in a computer program, software, or firmware incorporated in a computer-readable medium for execution by a computer or processor. Examples of computer-readable storage media include, but are not limited to, a read only memory (ROM), a random access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks, and digital versatile disks (DVDs). A processor in association with software may be used to implement a radio frequency transceiver for use in a WTRU, UE, terminal, base station, RNC, or any host computer.

Claims

1. A method of targeted advertising, comprising:
receiving gathered data, from a head mounted display (HMD) of a first user, indicating the presence of a first object at a first-user-controlled first location; and
responsive to detecting that the first user is at a second location:
selecting a targeted advertisement for a second product based on the HMD-gathered data; and
displaying, via the HMD, the selected targeted advertisement for the second product to the first user while the first user is at the second location.
2. The method of claim 1, wherein the selected targeted advertisement for the second product comprises a 3D model of the second product.
3. The method of claim 2, wherein the 3D model is presented as a virtual overlay of the second product.
4. The method of any of claims 1-3, wherein the targeted advertisement for the second product is selected based at least in part on earlier received targeted-advertisement-campaign data.
5. The method of any of claims 1-4, further comprising receiving information regarding an intended audience for the targeted advertisement for the second product comprising audience profile information and information regarding at least one advertisement-relevance-indicator object.
6. The method of claim 5, further comprising determining an identity of at least a second object at the second location based at least in part on received gathered data from the HMD of the first user at the second location.
7. The method of claim 6, wherein selecting the targeted advertisement for the second product further comprises determining that the audience profile information matches a profile associated with the first user and that the identity of at least the second object at the second location matches the information regarding at least one advertisement-relevance-indicator object.
8. The method of any of claims 1-7, wherein the received gathered data further comprises non-sensor data collected by the HMD.
9. The method of claim 8, wherein non-sensor data comprises at least one of current browser history or active applications.
10. The method of any of claims 1-9, further comprising:
generating a first 3D reconstruction of the first-user-controlled first location based at least in part on the received HMD-gathered data; and
identifying at least one object in the first 3D reconstruction of the first-user-controlled first location based at least in part on the received HMD-gathered data.
11. The method of claim 10, further comprising estimating at least one user characteristic at least in part from the generated first 3D reconstruction and the identified at least one object.
12. The method of claim 11, wherein the selected targeted advertisement for the second product is selected based at least in part on the estimated at least one user characteristic.
13. The method of claim 10, wherein the selected targeted advertisement for the second product is selected based at least in part on the identified at least one object in the first 3D reconstruction.
14. A system comprising a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including:
receiving gathered data, from a head mounted display (HMD) of a first user, indicating the presence of a first object at a first-user-controlled first location; and
responsive to detecting that the first user is at a second location:
selecting a targeted advertisement for a second product based on the HMD-gathered data; and
displaying, via the HMD, the selected targeted advertisement for the second product to the first user while the first user is at the second location.