CN109782902A - Operation prompting method and glasses - Google Patents
- Publication number
- CN109782902A (application CN201811543901.XA)
- Authority
- CN
- China
- Prior art keywords
- user
- article
- image
- map
- operating procedure
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- User Interface Of Digital Computer (AREA)
- Processing Or Creating Images (AREA)
Abstract
The present invention provides an operation prompting method, apparatus, and glasses in the technical field of data processing. The method comprises: obtaining an image of the user's environment and constructing a 3D semantic map of that environment from the image; performing eye-movement recognition on the user to judge whether the user is gazing at an item included in the 3D semantic map; if so, obtaining the operation mode corresponding to the 3D semantic map, the operation mode including operating steps for one or more items; monitoring whether the user's operation of the item meets the requirements of the operating steps; and, if it does not, outputting the operation prompt corresponding to the operating step. The user no longer needs to enter anything manually or keep checking a screen, and learns of problems in their own operation promptly. The method thus matches tutorials to the user's operation intelligently and greatly improves the efficiency with which the user identifies operating errors.
Description
Technical field
The invention belongs to the technical field of data processing, and in particular relates to an operation prompting method and glasses.
Background
In everyday life and work, users often need to search the internet for tutorials to check whether what they are doing is correct. For example, a user may look up an illustrated cooking tutorial to see whether their cooking method is wrong, or look up an equipment operation tutorial to judge whether they are operating a device correctly. In the prior art, the user manually enters keywords on a device such as a computer or mobile phone, obtains the corresponding text, image, or video tutorials, and then compares their own actions with the tutorial step by step. All of this requires a large amount of manual input, and the user must keep actively checking the screen to identify mistakes, which is cumbersome and inefficient.
Summary of the invention
In view of this, embodiments of the present invention provide an operation prompting method and glasses, to solve the prior-art problem that a user must perform a large amount of manual checking and comparison to identify mistakes in their own operation, which is cumbersome and inefficient.
A first aspect of the embodiments of the present invention provides an operation prompting method, comprising:
Obtaining an image of the user's environment, and constructing a 3D semantic map of the user's environment based on the image;
Performing eye-movement recognition on the user, and judging whether the user is gazing at an item included in the 3D semantic map;
If the user is gazing at an item included in the 3D semantic map, obtaining the operation mode corresponding to the 3D semantic map, the operation mode including operating steps for one or more items;
Monitoring whether the user's operation of the item meets the requirements of the operating steps;
If the user's operation of the item does not meet the requirements of the operating steps, outputting the operation prompt corresponding to the operating step.
A second aspect of the embodiments of the present invention provides glasses comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the computer program, implements the steps of the operation prompting method described above.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects. A 3D semantic map of the user's environment is constructed, and when gazing behavior toward an item is recognized, that is, when the user is determined to need operation prompts, the operation mode the user needs (the tutorial for operating the item) is identified from the actual content of the 3D semantic map. The user's actual operation of the item is then monitored and prompts are issued accordingly. The user no longer needs to enter anything manually or keep checking a screen, and learns of problems in their own operation promptly. Tutorials are thus matched to the user's operation intelligently, which greatly improves the efficiency with which the user becomes aware of operating errors.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the operation prompting method provided by Embodiment One of the present invention;
Fig. 2 is a schematic flowchart of the operation prompting method provided by Embodiment Two of the present invention;
Fig. 3A and Fig. 3B are schematic flowcharts of the operation prompting method provided by Embodiment Three of the present invention;
Fig. 4A and Fig. 4B are schematic flowcharts of the operation prompting method provided by Embodiment Four of the present invention;
Fig. 5 is a schematic flowchart of the operation prompting method provided by Embodiment Five of the present invention;
Fig. 6 is a schematic flowchart of the operation prompting method provided by Embodiment Six of the present invention;
Fig. 7 is a structural schematic diagram of the operation prompting device provided by Embodiment Seven of the present invention;
Fig. 8 is a schematic diagram of the glasses provided by Embodiment Eight of the present invention.
Detailed description of the embodiments
In the following description, specific details such as particular system structures and techniques are set out for the purpose of illustration rather than limitation, so that the embodiments of the present invention can be thoroughly understood. However, it will be clear to those skilled in the art that the present invention can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary detail does not obscure the description of the present invention.
To illustrate the technical solutions of the present invention, specific embodiments are described below.
To aid understanding, the embodiments of the present invention are first described briefly. To help the user discover problems in their own operation promptly, the embodiments first construct a 3D semantic map from images of the user's environment, thereby determining the environment the user is actually in and the objects it contains. In practice, however, being in a given environment does not necessarily mean the user needs to perform an operation: the user may only be passing through, or may be in the place without needing to operate anything. To prevent false triggering, the embodiments therefore also perform eye-movement analysis to judge whether the user is gazing at an item, and determine that the user needs operation prompts only when gazing behavior is present. The tutorial the user needs is then matched intelligently according to the 3D semantic map of the user's actual surroundings. Finally, the user's real-time operation of the item is analyzed, and when it does not meet the requirements of the tutorial, a prompt with the correct operating step is issued. The user can thus obtain the required tutorial in time, learn of problems in their own operation promptly, and correct them, without cumbersome manual operation.
It should be understood that the executing subject in the embodiments of the present invention may differ according to the needs of the actual application. It may be a smart device such as a wearable device (for example, smart glasses), or a device such as a server. When the executing subject is a wearable device, all data acquisition, processing, and output operations in the embodiments are performed by the wearable device. When the executing subject is a device, such as a server, that cannot directly acquire data from or output data to the user, the acquisition and output are performed by other devices. In that case the direct counterpart for data acquisition and output is not the user but another device capable of acquiring data from and outputting data to the user, through which the user is prompted. For example, smart glasses collect the user's data and send it to the server for processing, the server sends the resulting prompt back to the smart glasses, and the smart glasses output it to the user. The executing subject can be chosen and designed by developers according to actual development conditions and needs. The description below takes smart glasses as the executing subject, as follows:
Fig. 1 shows the implementation flowchart of the operation prompting method provided by Embodiment One of the present invention, detailed as follows:
S101: obtain an image of the user's environment, and construct a 3D semantic map of the user's environment based on the image.
To accurately identify what the user needs, the embodiments of the present invention first capture images of the user's environment and construct a 3D semantic map. A 3D semantic map is a three-dimensional map of the surroundings that includes semantic information. In the embodiments, a wide-angle camera can be provided in the smart glasses to capture images of the environment. After the images are obtained, a three-dimensional map of the surroundings is built from them, and the items in it and the attribute data associated with each kind of item are identified. The method of constructing the 3D semantic map is not limited here: it includes, but is not limited to, a stereo-vision-based 3D semantic mapping algorithm or other algorithms, and can be chosen or designed by technicians according to actual needs, or can follow the other related embodiments of the present invention.
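As a minimal sketch of what such a map might hold once items have been recognized, the following Python data structure stores each item with a semantic label, an axis-aligned bounding box, and free-form attribute data. The class names, field names, and the box representation are illustrative assumptions and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class MapItem:
    """One recognized item in the 3D semantic map (hypothetical schema)."""
    label: str                      # semantic class, e.g. "kettle"
    center: Vec3                    # box center in map coordinates, metres
    size: Vec3                      # box extents along x, y, z, metres
    attributes: Dict[str, str] = field(default_factory=dict)

    def contains(self, point: Vec3) -> bool:
        """True if the 3D point lies inside this item's axis-aligned box."""
        return all(abs(p - c) <= s / 2
                   for p, c, s in zip(point, self.center, self.size))


@dataclass
class SemanticMap:
    """A 3D map of the surroundings annotated with recognized items."""
    items: List[MapItem] = field(default_factory=list)

    def item_at(self, point: Vec3) -> Optional[MapItem]:
        """Return the first item whose box contains the point, if any."""
        for item in self.items:
            if item.contains(point):
                return item
        return None
```

A lookup such as `item_at` is one way the later gaze step could ask which item, if any, lies at the 3D point the user is looking at.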
S102: perform eye-movement recognition on the user, and judge whether the user is gazing at an item included in the 3D semantic map.
The mere fact that the user is in a given environment does not show that the user needs tutorial guidance: the user may only be passing through, or may be in the place without needing to operate anything. If tutorial matching and prompting were driven directly by the user's environment, the user could receive spurious prompts. When the user is in an environment and keeps gazing at certain items, however, the user very likely needs to operate them. The embodiments of the present invention therefore identify whether the user is gazing at an item, and carry out subsequent operations such as tutorial matching only when gazing behavior is confirmed. Specifically, the smart glasses have a built-in device that films the user's eye movement, such as an eye tracker. This device captures images of the user's eyes, and eye-movement analysis is performed on the captured images to determine the user's eye-movement state, for example whether the user is blinking, where the gaze region is, and whether the user is gazing at an item. The eye-movement analysis and tracking method can be chosen by technicians: an existing algorithm can be used, a custom one can be designed as needed, or the methods of the other related embodiments of the present invention can be followed.
The embodiments of the present invention need to confirm whether the user is gazing at an item. That is, eye-movement recognition must not only identify the item the user is looking at, but also confirm whether the user keeps looking at it (sustained gazing at the item). Therefore, once the item the user is looking at has been determined, the time the user spends looking at it is also measured, and whether the user is gazing at the item is judged on that basis. See Embodiment Three of the present invention for details.
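The dwell-time idea can be sketched as follows. This is only an illustration under stated assumptions: the gaze samples are assumed to be already resolved to an item identifier (or `None` when no item is looked at), and the 1.0 second threshold is a placeholder, since the actual criterion is the subject of Embodiment Three.

```python
def detect_gazed_item(samples, min_dwell_s=1.0):
    """Return the first item looked at continuously for min_dwell_s seconds.

    samples: time-ordered iterable of (timestamp_s, item_id_or_None).
    The threshold and the sample format are assumptions for illustration.
    """
    current, start = None, None
    for timestamp, item in samples:
        if item != current:
            # Gaze moved to a different item (or away): restart the timer.
            current, start = item, timestamp
        if current is not None and timestamp - start >= min_dwell_s:
            return current
    return None


gaze = [(0.0, "pan"), (0.3, "pan"), (0.6, None),
        (0.9, "kettle"), (1.4, "kettle"), (2.0, "kettle")]
gazed = detect_gazed_item(gaze)
```

In this example the brief look at the pan is ignored and only the sustained look at the kettle counts as gazing, which is the distinction the paragraph above draws between seeing an item and gazing at it.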
S103: if the user is gazing at an item included in the 3D semantic map, obtain the operation mode corresponding to the 3D semantic map, the operation mode including operating steps for one or more items.
When the user is in an environment and keeps gazing at certain items, the user very likely needs to operate them. The embodiments of the present invention then analyze the 3D semantic map of the user's environment to determine the actual environment, for example a kitchen or an equipment cabinet, and from the environment estimate the operations the user may perform and the tutorial required. In the embodiments, multiple tutorials may be preset by technicians and one selected according to, for example, the scene type of the 3D semantic map; alternatively, technicians may preset tutorial generation rules, and the tutorial is generated according to, for example, the scene type of the 3D semantic map.
As one embodiment of the present invention, determining the operation mode comprises:
Identifying the scene type corresponding to the 3D semantic map.
Obtaining the corresponding operation mode based on the scene type and the items included in the 3D semantic map.
The embodiments of the present invention take into account that users have different needs in different real scenes, so the user's likely needs can be judged from both the scene type of the 3D semantic map and the items it actually contains. For example, if the user is in a kitchen that contains plenty of food, the user is very likely cooking, and corresponding cooking tutorials can be generated. The specific scene type recognition method and item recognition method are not limited here and can be set by technicians according to actual needs.
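One simple way to picture the preset-tutorial variant is a lookup that keeps only the tutorials whose scene type matches and whose required items are all present in the map. The tutorial names, the catalogue layout, and the matching rule below are hypothetical and serve only to illustrate the selection by scene type and items.

```python
# Hypothetical preset tutorial catalogue: each entry names the scene it
# applies to and the items that must be present in the 3D semantic map.
TUTORIALS = [
    {"name": "boil_noodles", "scene": "kitchen", "requires": {"pot", "noodles"}},
    {"name": "fry_egg", "scene": "kitchen", "requires": {"pan", "egg"}},
    {"name": "reset_router", "scene": "equipment_cabinet", "requires": {"router"}},
]


def match_operation_modes(scene_type, item_labels, tutorials=TUTORIALS):
    """Return names of tutorials usable in this scene with these items."""
    present = set(item_labels)
    return [t["name"] for t in tutorials
            if t["scene"] == scene_type and t["requires"] <= present]


candidates = match_operation_modes("kitchen", ["pan", "egg", "pot"])
```

The second embodiment described next, which also uses user data, could be layered on top of such a lookup by filtering or ranking the candidates, for example by stored taste preferences.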
As another embodiment of the present invention, determining the operation mode comprises:
Identifying the scene type corresponding to the 3D semantic map, and obtaining user data of the user.
Obtaining the corresponding operation mode based on the scene type, the items included in the 3D semantic map, and the user data.
Even in the same scene with the same items, the actual needs of different users may differ. For example, even when users are in a kitchen that contains plenty of food, their tastes may differ, and the corresponding cooking tutorials will differ considerably. The scene type of the user's environment and the items it contains are therefore sometimes insufficient for accurately identifying the tutorial the user needs. In the embodiments of the present invention, the user's actual operating needs are determined from the scene type of the environment, the items actually present, and user data together, and the corresponding tutorial is selected. The content of the user data can be set by technicians or by the user, and includes, but is not limited to, the user's personal information and taste data, or operating needs preset by the user, such as the user's own dietary requirements.
S104: monitor whether the user's operation of the item meets the requirements of the operating steps.
After the tutorial the user needs has been obtained, the embodiments of the present invention begin monitoring and recognizing the user's actual operation and compare it with the operating steps in the tutorial, judging whether the user's actual operation meets the tutorial's requirements and thereby identifying the user's operating errors. To monitor and recognize the user's operation, the camera in the smart glasses films the user's behavior, and behavior analysis is performed on the resulting images or video, that is, the user's actions and their order are analyzed to see whether they meet the requirements of the tutorial.
S105: if the user's operation of the item does not meet the requirements of the operating steps, output the operation prompt corresponding to the operating step.
When the user performs an operation that does not meet the requirements of the operating steps in the tutorial, the user has made an operating error. The embodiments of the present invention then prompt the user, that is, inform the user that the current operation is wrong, and may also inform the user of the correct operating step. For example, suppose that operating a device requires pressing the power button, then selecting the device mode with the mode key, and only then pressing the start button. If the user presses the start button directly after pressing the power button, the embodiments of the present invention find that the user's operation does not meet the tutorial's requirements and prompt the user that the start button should not be pressed yet and that the device mode must first be selected with the mode key. The output form of the operation prompt includes, but is not limited to, audio, video, or text prompts, and can be set by technicians.
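The power button example can be sketched as a comparison between the ordered steps of a tutorial and the actions recognized so far. This is a simplified illustration: it assumes that behavior analysis has already turned the video into a list of action labels, and both the labels and the prompt wording are hypothetical.

```python
def check_operation(expected_steps, observed_actions):
    """Compare observed actions with the tutorial's ordered steps.

    Returns None while the user is on track, otherwise a prompt string
    describing the first deviation. Labels and wording are illustrative.
    """
    for index, action in enumerate(observed_actions):
        if index >= len(expected_steps):
            return f"Unexpected action '{action}': the procedure is already complete."
        expected = expected_steps[index]
        if action != expected:
            return f"Do not perform '{action}' yet: step {index + 1} is '{expected}'."
    return None


steps = ["press power", "select mode", "press start"]
prompt = check_operation(steps, ["press power", "press start"])
```

Here `prompt` tells the user that the start button was pressed too early and names the mode selection step that should come first; the resulting text could then be delivered as audio, video, or on-screen text.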
As a specific implementation of prompt output in Embodiment One of the present invention, the method comprises:
Identifying the user's gaze region in the 3D semantic map, and outputting the operation prompt in augmented reality based on the gaze region.
In the embodiments of the present invention, the prompt information can be output using augmented reality (AR) technology, so that the user can see their own problem and the corresponding correct operation more intuitively.
The embodiments of the present invention construct a 3D semantic map of the user's environment, and when gazing behavior toward an item is recognized, that is, when the user is determined to need operation prompts, identify the operation mode the user needs (the tutorial for operating the item) from the actual content of the 3D semantic map. The user's actual operation of the item is then monitored and prompts are issued accordingly. The user no longer needs to enter anything manually or keep checking a screen, and learns of problems in their own operation promptly. Tutorials are thus matched to the user's operation intelligently, which greatly improves the efficiency with which the user becomes aware of operating errors.
As a specific implementation of 3D semantic map construction in Embodiment One of the present invention, the environment images to be captured include a color image and a depth image of the environment. As shown in Fig. 2, Embodiment Two of the present invention comprises:
S201: based on the color image and the depth image, obtain the user's position information and pose information, and the position information and item information of the items in the user's environment.
In the embodiments of the present invention, the smart glasses use an RGB-D camera as the sensor to obtain color and depth images, use a visual SLAM algorithm to perform autonomous localization and pose estimation and optimization for the smart glasses (that is, to obtain the user's position and pose information), and at the same time perform item detection to obtain the semantic information of the environment. An RGB-D segmentation algorithm then segments out the detected items, and the 3D semantic map of the environment is constructed. Specifically, the basic visual SLAM framework consists of four parts: visual odometry, back-end optimization, loop closure detection, and 3D mapping. Visual odometry estimates the motion between two adjacent frames and gives a rough estimate of the camera's current pose. Back-end optimization performs a globally consistent optimization of the visual odometry estimates to eliminate noise, and additionally uses loop closure constraints to refine the poses, making localization and pose estimation more accurate. Loop closure detection eliminates the error accumulated along the way when the camera returns to a previously visited position. Mapping creates the three-dimensional map of the environment from the motion and poses estimated by the first three parts.
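To make the notion of accumulated error concrete, the following sketch chains frame-to-frame motion estimates into a trajectory and measures the gap that remains when the camera returns to its starting point. It is a deliberately simplified illustration: poses are planar (x, y, heading) rather than full 6-DoF, the function names are hypothetical, and no back-end optimization is performed; it only shows the drift that loop closure is meant to correct.

```python
import math


def compose(pose, delta):
    """Apply a relative motion (dx, dy, dtheta), expressed in the current
    camera frame, to a global planar pose (x, y, theta)."""
    x, y, theta = pose
    dx, dy, dtheta = delta
    return (
        x + dx * math.cos(theta) - dy * math.sin(theta),
        y + dx * math.sin(theta) + dy * math.cos(theta),
        theta + dtheta,
    )


def integrate(deltas, start=(0.0, 0.0, 0.0)):
    """Chain per-frame motion estimates into a list of global poses."""
    poses = [start]
    for delta in deltas:
        poses.append(compose(poses[-1], delta))
    return poses


def loop_closure_gap(poses):
    """Distance between the first and last position of a closed loop."""
    (x0, y0, _), (x1, y1, _) = poses[0], poses[-1]
    return math.hypot(x1 - x0, y1 - y0)


# Walking a 1 m square: exact estimates versus estimates with a small
# heading bias of 0.02 rad per step (an arbitrary value for illustration).
exact = integrate([(1.0, 0.0, math.pi / 2)] * 4)
drifting = integrate([(1.0, 0.0, math.pi / 2 + 0.02)] * 4)
```

With exact estimates the loop closes on itself, whereas the biased estimates leave a visible gap between start and end. Recognizing that the two positions are the same place is what provides the constraint used to redistribute this error over the trajectory.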
The visual SLAM algorithm as a whole is designed to produce a globally consistent estimate of the current position and pose from the sensor input, that is, to determine the device's own position and motion. It is carried out jointly by three parts: visual odometry, back-end optimization, and loop closure detection.
Visual odometry extracts ORB feature points, matches the ORB features of two adjacent frames with the FLANN algorithm, and from the matches produces a rough estimate of the position and pose of the smart glasses using the PnP algorithm combined with the RANSAC algorithm. The task of visual odometry is to estimate the pose change of the smart glasses between two frames, and hence their pose and motion trajectory over a period of time. It consists of the following processes:
1. Feature extraction
Image features are extracted using ORB. An ORB feature consists of a FAST keypoint and a BRIEF descriptor, to which rotation invariance and scale invariance are added.
The feature extraction process is:
1) Coarse extraction. For a point p with pixel value Ip, consider the 16 points on the circle of radius 3 centered on p. If 12 or more of these points have pixel values that differ from Ip by more than a threshold, p is taken as a candidate FAST keypoint.
2) Decision-tree screening. The information entropy of each subset is calculated. Using information gain as the evaluation criterion, the pixel with the highest value is set as the root node of the decision tree, and its subsets are processed iteratively until the nature of the point is determined, i.e., FAST keypoint or non-FAST keypoint. This produces an ID3 decision tree, which is used to filter out the best FAST keypoints.
3) Non-maximum suppression. Within a local range, the FAST keypoint with the highest score is kept and the lower-scoring FAST keypoints are deleted. One pass over the image completes the screening.
4) Assigning scale and rotation invariance. Scale invariance is realized with an image pyramid: the image is downsampled to obtain the pyramid, and the feature extraction steps above are applied to every layer, which gives the FAST keypoints scale invariance. Rotation invariance is realized by the gray-scale centroid method: the centroid of the image block U centered on the keypoint is calculated, and the vector from the center to the centroid is defined as the direction of the keypoint, which gives the FAST keypoints rotation invariance.
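The coarse extraction in step 1) can be sketched as a segment test. This is a minimal illustration, not the patented implementation: the function name, the default threshold of 20, and the requirement that the 12 differing points be contiguous on the circle (taken from the commonly published FAST test, not stated in the text) are assumptions.

```python
# Offsets (dx, dy) of the 16 pixels on a radius-3 circle, listed in circular order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def is_fast_candidate(img, x, y, threshold=20, n_required=12):
    """Return True if pixel (x, y) of a 2D list of gray values passes the test.

    Assumption: the n_required differing pixels must be contiguous on the
    circle and all brighter or all darker than the center.
    """
    ip = img[y][x]
    # Classify every circle pixel: +1 brighter, -1 darker, 0 similar.
    states = []
    for dx, dy in CIRCLE:
        value = img[y + dy][x + dx]
        if value > ip + threshold:
            states.append(1)
        elif value < ip - threshold:
            states.append(-1)
        else:
            states.append(0)
    # Duplicate the list so that runs wrapping past the last pixel are found.
    doubled = states + states
    for sign in (1, -1):
        run = 0
        for state in doubled:
            run = run + 1 if state == sign else 0
            if run >= n_required:
                return True
    return False
```

The caller is expected to keep (x, y) at least 3 pixels away from the image border.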
Descriptor: improvements are made on the basis of BRIEF. First, all points in the 31 × 31 neighborhood of the keypoint are considered; after Gaussian smoothing of the image, the mean gray value of a 5 × 5 neighborhood is used in place of a single pixel's gray value, which gives strong noise immunity. Second, a greedy search selects 5 × 5 neighborhood pairs that are uncorrelated and whose mean is close to 0.5, which keeps the descriptor representative and unique, and therefore discriminative.
2. Feature matching
The fast approximate nearest neighbor (FLANN) algorithm is selected. Its core idea is to use an index structure to restrict the search to the neighborhood of a feature and to complete the feature matching within that neighborhood, which effectively speeds up matching. Because a BRIEF descriptor consists of 0s and 1s, locality-sensitive hashing is used as the index structure. All features are projected into the hash space in the same way: two features that were originally adjacent remain adjacent in the hash space with high probability after projection, while two features that were not adjacent are adjacent in the hash space with very low probability. Matching can therefore be carried out within a neighborhood of the hash space, which effectively narrows the search range.
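The hashing idea above can be illustrated with a single-table bit-sampling sketch for binary descriptors. It is not FLANN's actual implementation; the function names, the representation of descriptors as Python integers, and the `max_dist` parameter are assumptions made for the example.

```python
from collections import defaultdict


def hash_key(descriptor, bit_positions):
    """Project a binary descriptor onto the sampled bit positions."""
    return tuple((descriptor >> b) & 1 for b in bit_positions)


def build_lsh_table(descriptors, bit_positions):
    """Bucket descriptor indices by their hash key."""
    table = defaultdict(list)
    for idx, d in enumerate(descriptors):
        table[hash_key(d, bit_positions)].append(idx)
    return table


def hamming(a, b):
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match(query, descriptors, table, bit_positions, max_dist=64):
    """Return (index, distance) of the nearest descriptor in the query's bucket."""
    best = None
    for idx in table.get(hash_key(query, bit_positions), []):
        dist = hamming(query, descriptors[idx])
        if dist <= max_dist and (best is None or dist < best[1]):
            best = (idx, dist)
    return best
```

Only descriptors that fall in the same bucket as the query are compared, which is what narrows the search range; practical systems usually use several tables to reduce missed matches.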
3. Pose estimation algorithm design
After the features of two adjacent frames have been extracted and matched, the matching relationship is used to estimate the motion and posture of the smart glasses. To estimate the motion and pose quantitatively, the geometric relationship between the imaging of the smart glasses and points in space must first be understood. The imaging process is also called the observation process: a point in three-dimensional space reflects or emits light, which passes through the optical center of the smart glasses and is projected onto the imaging plane of the smart glasses.
The present invention uses the PnP algorithm to obtain a rough estimate of the pose between two frames, and then uses the RANSAC algorithm to optimize the inter-frame consistency of that pose, so that mismatched features do not seriously affect the pose estimate.
First the ORB feature points of the image are extracted, then the features of two adjacent frames are matched with the fast nearest neighbor algorithm, and finally the camera motion and posture are estimated with the PnP algorithm and the RANSAC algorithm.
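How RANSAC keeps mismatched features from corrupting the estimate can be sketched on a deliberately simplified motion model. The assumption here, made only for brevity, is that the motion between frames is a pure 2D translation estimated from one match; in the method described above the model fitted inside the loop would be the PnP pose. All names and parameters are hypothetical.

```python
import random


def ransac_translation(matches, tol=1.0, iterations=50, seed=0):
    """Estimate a 2D translation from point matches while rejecting outliers.

    matches: list of ((x0, y0), (x1, y1)) pairs from two adjacent frames.
    Returns the refined translation and the list of inlier matches.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: one match fully determines a translation hypothesis.
        (x0, y0), (x1, y1) = rng.choice(matches)
        tx, ty = x1 - x0, y1 - y0
        # Consensus set: matches that agree with the hypothesis within tol.
        inliers = [m for m in matches
                   if abs(m[1][0] - m[0][0] - tx) <= tol
                   and abs(m[1][1] - m[0][1] - ty) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit the model on the largest consensus set.
    n = len(best_inliers)
    tx = sum(b[0] - a[0] for a, b in best_inliers) / n
    ty = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (tx, ty), best_inliers
```

The same hypothesize-and-verify loop applies when the hypothesis is a camera pose computed from a minimal set of 3D-2D correspondences.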
4. Back-end optimization
Data noise, mismatches, calculation errors, and similar factors introduce error into the pose estimate. Over long runs the error accumulates and can seriously degrade system performance. Back-end optimization performs a globally consistent optimization of the visual odometry estimates to remove noise and make the pose estimate more accurate. In addition, when the system detects a loop closure, the information is passed to the back end to eliminate the accumulated error.
The embodiment of the present invention introduces a keyframe mechanism: representative image frames are selected for pose optimization, which reduces unnecessary computation. For local pose optimization, a bundle adjustment algorithm is used to optimize the current camera posture and the feature points. When the system detects a loop closure, the back end uses pose graph optimization to obtain a globally consistent trajectory and pose. Back-end optimization receives the poses and feature points passed from visual odometry and optimizes them by bundle adjustment. The workflow is: check the queue; process the keyframe; cull, generate, and merge map points; optimize the pose with local bundle adjustment.
An image is set as a keyframe when it satisfies the following four criteria:
(1) Because the camera acquires data at a high frame rate, a certain frame interval must have passed between the current keyframe and the previous keyframe.
(2) The back-end optimization part is not currently running.
(3) The co-visible region between the current keyframe and all previously selected keyframes is below a certain range.
(4) The current frame has enough feature points and matches, which guarantees the richness of features.
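The four criteria can be combined into a single predicate, sketched here. The field names and every default threshold are hypothetical; the text leaves the actual values open.

```python
from dataclasses import dataclass


@dataclass
class FrameStats:
    frames_since_last_keyframe: int  # criterion (1)
    backend_busy: bool               # criterion (2)
    covisibility_ratio: float        # criterion (3), share of view in common with earlier keyframes
    matched_features: int            # criterion (4)


def is_keyframe(stats, min_gap=20, max_covisibility=0.9, min_matches=50):
    """Return True only when all four keyframe criteria hold (thresholds assumed)."""
    return (stats.frames_since_last_keyframe >= min_gap
            and not stats.backend_busy
            and stats.covisibility_ratio < max_covisibility
            and stats.matched_features >= min_matches)
```

For example, `is_keyframe(FrameStats(25, False, 0.5, 120))` accepts the frame, while the same frame is rejected if the back end is still busy.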
5. Loop closure detection
Loop closure detection mainly addresses the gradual accumulation of error in the estimated pose of the smart glasses. When the camera returns to a place it has passed before, this part confirms the revisit, establishes the connection between the current pose and the historical pose, and passes it to the back end for optimization. This eliminates the error accumulated over long system runs and yields a globally consistent trajectory and pose. In addition, loop closure detection associates the current data with all historical data, so when visual odometry loses track of the features, loop closure detection can be used for relocalization, which makes pose estimation more robust.
Loop closure detection runs continuously while the system operates, and eliminates the accumulated error of the smart glasses' pose estimate through the constraints of detected loops. The closed-loop constraint obtained when the camera returns to a previously visited position is passed to the back end for pose graph optimization. The workflow of loop closure detection includes the following steps. Step 1: detect loop candidate frames. Step 2: establish associations with previous keyframes. Step 3: determine whether a loop has occurred; if not, return to step 1; if so, go to step 4. Step 4: pose graph optimization.
5.1 Building the bag-of-words model
Features are treated as individual words. A dictionary containing all feature types is trained in advance, and for each image a set of corresponding words, i.e., a bag of words, is generated from its features according to the dictionary. Judging the similarity of two images then only requires comparing their bags of words, which greatly speeds up loop closure detection. Features are clustered with the unsupervised machine learning algorithm K-means++, and a K-d tree structure is used to improve search efficiency.
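Comparing two images through their bags of words can be sketched as follows. The sketch assumes each feature has already been mapped to an integer word ID by the dictionary, and it uses a normalized histogram with an L1-based score; both the representation and the score are illustrative choices, not taken from the text.

```python
from collections import Counter


def bag_of_words(word_ids):
    """Turn the word IDs of one image into a normalized word histogram."""
    counts = Counter(word_ids)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def similarity(bow_a, bow_b):
    """Score in [0, 1]: 1 for identical histograms, 0 for disjoint ones."""
    words = set(bow_a) | set(bow_b)
    l1 = sum(abs(bow_a.get(w, 0.0) - bow_b.get(w, 0.0)) for w in words)
    return 1.0 - 0.5 * l1
```

Because only the compact histograms are compared, no descriptor-to-descriptor matching is needed when searching for loop candidates.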
The training process of the dictionary is:
1) At the root node, all samples are divided into k classes with the K-means++ algorithm above, which gives the first layer.
2) For each node of each layer, the samples belonging to that node are again clustered into k classes with K-means++, which gives the next layer.
3) And so on, down to the last layer of leaf nodes. The leaf layer contains the words corresponding to the features.
5.2 Similarity calculation method
The TF-IDF algorithm is introduced. If the similarity between the current frame and some earlier keyframe exceeds 3 times the similarity between the current frame and the previous keyframe, a loop closure may have occurred. A verification step is still needed, so a caching mechanism is set up for loop closure detection: a single high similarity is not enough to declare a loop, and a loop is confirmed only when the similarity stays high over consecutive frames. After a loop is confirmed, the loop closure detection part sends this information to the back end, which optimizes the pose by graph optimization, eliminates the accumulated error, and obtains a globally consistent trajectory and pose.
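The two-stage decision above (the 3-times similarity rule followed by confirmation over consecutive frames) can be sketched as a small state machine. The class name and the choice of three consecutive frames are assumptions; the text does not state how many frames are required.

```python
class LoopDetector:
    """Confirm a loop only after several consecutive frames pass the ratio test."""

    def __init__(self, ratio=3.0, required_consecutive=3):
        self.ratio = ratio                    # the factor of 3 from the text
        self.required = required_consecutive  # assumed cache length
        self.streak = 0

    def update(self, sim_to_candidate, sim_to_previous_keyframe):
        """Feed one frame's similarities; return True once a loop is confirmed."""
        if sim_to_candidate > self.ratio * sim_to_previous_keyframe:
            self.streak += 1
        else:
            self.streak = 0  # a single miss resets the cached evidence
        return self.streak >= self.required
```

Each call corresponds to one incoming frame, so an isolated high score never triggers the back-end optimization by itself.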
Summary of the overall flow of the visual SLAM algorithm:
The visual SLAM algorithm design consists of three parts: visual odometry, back-end optimization, and loop closure detection. The visual odometry part first extracts the ORB features of the sensor input frame, then matches the features of two adjacent frames with the fast nearest neighbor algorithm, and estimates the camera pose with the PnP algorithm combined with the RANSAC algorithm. The back-end part optimizes the poses and feature points passed from visual odometry by bundle adjustment. Loop closure detection uses the bag-of-words model and the trained dictionary to detect whether the camera has returned to a previously visited position. If a loop has occurred, this constraint is sent to the back end, which optimizes the pose by graph optimization and eliminates the accumulated error, ensuring the global consistency of the camera trajectory and pose.
To obtain the item information and location information of the articles, the embodiment of the present invention first performs article detection on the color image, where the detection algorithm includes but is not limited to object detection algorithms such as YOLOv3 and is not restricted here, and then performs article segmentation on the color image, as detailed below:
The depth map of the corresponding pixels, i.e., the three-dimensional point cloud, is obtained together with the color image. Therefore, before the semantic map is established, the detected objects must be segmented out, and the camera pose estimated and optimized by the visual SLAM module is then used to project the pixels to their positions in space, thereby constructing the three-dimensional semantic map. Target RGB-D segmentation is realized with an improved GrabCut algorithm, which uses the geometric plane information from CPF (Constrained Plane Fitting) segmentation to improve the GrabCut result. The image is first segmented with the GrabCut algorithm, the three-dimensional point cloud is then segmented with the CPF algorithm, and finally the point cloud segmentation result is used as a filter to reject the pixels in the image segmentation result that do not satisfy the spatial geometric relationship of the object, which completes the RGB-D segmentation of the target.
S202: construct the 3D semantic map according to the location information and posture information of the user and the item information and location information of the articles.
At this point all the information needed to construct the semantic map has been obtained: the optimized position and posture of the smart glasses at each keyframe (i.e., the location information and posture information of the user), the categories of the articles in the keyframes (i.e., the item information) and their positions, and the three-dimensional segmentation of the detected target objects. What remains is to integrate this information into a globally consistent three-dimensional semantic map. The process has roughly three steps: first perform data association and update the target object models, then construct the three-dimensional map of the environment, and finally fuse the semantic information into the three-dimensional map to obtain a three-dimensional semantic map rich in information.
Research and design of the semantic map construction algorithm. First, a target RGB-D segmentation algorithm based on improved GrabCut is designed. It combines GrabCut segmentation of the color image with CPF segmentation of the depth map to segment out the detected targets, completing the target extraction needed for object-level semantic mapping. The targets are labeled by object category, and data association of target objects is then used to update the object models, which avoids modeling the same target more than once. A colored octree map structure is then used to construct and store the three-dimensional semantic map rich in information.
1. Data association and model update algorithm design
The role of data association is to determine, once a target has been obtained from RGB-D segmentation, whether that target is already in the map, i.e., whether a new object must be added or an existing object maintained. This avoids modeling the same target repeatedly, which would produce ghosting in the map.
For each detection, a group of candidate target landmarks is first selected based on the Euclidean distance between the centroids of the segmented point clouds. A nearest neighbor search is then performed between the three-dimensional points of the existing landmarks in the map and those of the current target, and the Euclidean distance of each pair of corresponding points is calculated. The Euclidean distance between two three-dimensional points is the 2-norm of their difference.
If more than half of the three-dimensional points of the target are closer than a certain threshold to an existing target in the map, the target is considered to be the same as that existing target, and the current target information is associated with it so that the target model is maintained jointly. In addition, when the local environment is complex, the nearest neighbor search over three-dimensional points is accelerated with a k-d tree structure. To ensure that data association receives the latest information, an object is segmented with the improved RGB-D target segmentation algorithm designed above whenever it is detected. Each object in the map thus retains three pieces of information: the target model obtained by data association, the poses of the keyframes that observed the target, and the per-category probabilities provided by the object detection module. The probability of a target object in the map is updated according to the probability values provided by the object detection module. Suppose C categories of objects are detected in total, Sc denotes the vector of the target's per-category probabilities, and n is the number of keyframes in which the target is detected. The target detection probability is updated as:
The category of the target in the map is then the one corresponding to max(Sc), with confidence p = max(Sc)/n, which provides the information for labeling target category and probability in the semantic map.
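The association rule and the labeling step can be sketched as follows. Two points are assumptions rather than statements of the text: the update formula is not reproduced above, so the sketch accumulates the detector's per-category probabilities by simple addition, and the confidence is read as max(Sc) divided by n. The brute-force nearest neighbor search stands in for the k-d tree, and all names and the 0.05 threshold are hypothetical.

```python
import math


def is_same_object(new_points, landmark_points, dist_threshold=0.05):
    """True if more than half of the new 3D points lie near the landmark."""
    close = 0
    for p in new_points:
        # Brute-force nearest neighbor; a k-d tree would replace this search.
        nearest = min(math.dist(p, q) for q in landmark_points)
        if nearest < dist_threshold:
            close += 1
    return close > len(new_points) / 2


def update_class_scores(scores, detection_probs):
    """Assumed update: add the detector's probabilities for one keyframe."""
    return [s + p for s, p in zip(scores, detection_probs)]


def label_and_confidence(scores, n_keyframes):
    """Return (category index of the largest score, max(Sc) / n)."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best] / n_keyframes
```

With this reading, a target seen in two keyframes with class-0 probabilities 0.5 and 0.75 ends up labeled as class 0 with confidence 0.625.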
2. Construction and storage form of the semantic map
The present invention uses a map representation that is flexible, has a small memory footprint, and supports real-time updates: the octree map.
The entire space is taken as the root node and divided into eight child nodes according to the spatial coordinate system. Each child node is in turn divided into eight child nodes, and so on until the required resolution is reached, i.e., the leaf nodes. The octree map differs from the voxel model of a point cloud map in that when all points in a cube are occupied, or all are unoccupied, the node need not be expanded, so the memory used is comparatively very small. Searching for a leaf node is also fast: the time complexity for an octree of d layers is O(d). In addition, an octree map can store a color for each node, giving the colored octree map representation, and supports updating and correcting information at any time, which makes it well suited to constructing three-dimensional semantic maps. The colored octree map is therefore selected here as the construction and storage form of the three-dimensional semantic map.
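The O(d) leaf lookup and the lazy expansion of nodes can be illustrated with a toy colored octree. This is a sketch under stated assumptions, not the map library used by the invention: nodes are plain dictionaries, only visited children are created, and the class and function names are hypothetical.

```python
def octree_path(point, origin, size, depth):
    """Return the child index (0-7) chosen at each of `depth` levels."""
    x0, y0, z0 = origin
    path = []
    for _ in range(depth):
        size /= 2.0  # each level halves the cube edge
        ix = int(point[0] >= x0 + size)
        iy = int(point[1] >= y0 + size)
        iz = int(point[2] >= z0 + size)
        path.append(ix | (iy << 1) | (iz << 2))
        x0 += ix * size
        y0 += iy * size
        z0 += iz * size
    return path


class ColorOctree:
    """Sparse octree: a child node exists only if something was inserted in it."""

    def __init__(self, origin, size, depth):
        self.origin, self.size, self.depth = origin, size, depth
        self.root = {}

    def insert(self, point, color):
        node = self.root
        for child in octree_path(point, self.origin, self.size, self.depth):
            node = node.setdefault(child, {})
        node["color"] = color  # overwriting supports later corrections

    def query(self, point):
        node = self.root
        for child in octree_path(point, self.origin, self.size, self.depth):
            if child not in node:
                return None  # unexpanded region: nothing stored here
            node = node[child]
        return node.get("color")
```

Both `insert` and `query` visit exactly one node per level, which is the O(d) behavior described above.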
To construct the three-dimensional semantic map, the three-dimensional map of the environment is first established and continuously updated, and the semantic information is then fused into it in real time, which yields a three-dimensional semantic map rich in information. Information is acquired and processed continuously as the camera moves, so the semantic map is continuously updated.
Three-dimensional mapping takes the positions and postures of the keyframes, as estimated and optimized by the visual SLAM algorithm, and maps the depth information captured by the RGB-D camera into three-dimensional space to establish the three-dimensional map of the environment. Because the RGB-D camera used by the present invention provides the depth of every pixel in the field of view, dense mapping can be performed directly from the depth map: according to the optimized camera pose, each depth map is converted into a point cloud and the point clouds are stitched together to obtain the three-dimensional map.
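Converting a depth map into a point cloud in the world frame can be sketched with the pinhole camera model. The intrinsic parameters fx, fy, cx, cy, the convention that a depth of zero means no reading, and the pose given as a 3 × 3 rotation plus a translation are assumptions made for this example.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Pixel (u, v) with depth -> 3D point in the camera frame (pinhole model)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def to_world(p_cam, rotation, translation):
    """Apply the optimized camera pose: p_world = R * p_cam + t."""
    return tuple(
        sum(rotation[i][j] * p_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def depth_to_cloud(depth_map, intrinsics, rotation, translation):
    """Map every valid depth pixel of one keyframe into the world frame."""
    fx, fy, cx, cy = intrinsics
    cloud = []
    for v, row in enumerate(depth_map):
        for u, d in enumerate(row):
            if d > 0:  # assumed convention: zero means no depth reading
                p_cam = backproject(u, v, d, fx, fy, cx, cy)
                cloud.append(to_world(p_cam, rotation, translation))
    return cloud
```

Stitching then amounts to concatenating the clouds of all keyframes, since each one is already expressed in the common world frame.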
A semantic map is a three-dimensional map that contains semantic information, i.e., the semantic information of the environment is annotated when the octree map is established. By fusing the semantic information into the octree structure while the three-dimensional octree map is established, a three-dimensional semantic map rich in information is obtained.
The semantic mapping system of the present invention uses an RGB-D camera as the visual sensor to capture color information and depth information. The visual SLAM algorithm performs autonomous localization of the AR smart glasses and pose estimation and optimization, yielding a globally consistent trajectory and pose. At the same time, object detection is performed with the convolutional neural network model YOLOv3, which detects the category, probability, and position of the objects appearing in the keyframes and so obtains the semantic information of the environment. The perceived objects are then segmented out with the RGB-D segmentation algorithm, and the octree map representation is used to construct the three-dimensional map of the environment. Finally, the semantic information is fused into the octree map, completing the construction of the three-dimensional semantic map of the environment.
The embodiment of the present invention designs a visual SLAM algorithm that uses an RGB-D camera as the sensor. ORB features are selected as the basis of the algorithm: they provide the spatial geometric relationships for pose estimation in visual odometry and also serve as the criterion for judging image similarity in loop closure detection, which gives the system a degree of uniformity. In the pose estimation algorithm designed in the embodiment, the PnP algorithm replaces the ICP algorithm. PnP calculates the pose from the camera coordinates optimized in the previous frame and the pixel coordinates of the current frame, which avoids interference from camera measurement error. The back-end optimization part uses an algorithm targeted to each task: the poses and map points passed from visual odometry are optimized with local bundle adjustment, and the closed-loop constraints passed from loop closure detection are optimized with graph optimization.
The overall structure of the semantic mapping system is designed. A target RGB-D segmentation algorithm based on improved GrabCut is designed here: the CPF segmentation result, which incorporates depth point cloud information, corrects the GrabCut segmentation, which uses only color image information, so that the two complement each other. The objects found by object detection are segmented out and labeled by object category, and the camera pose obtained by the visual SLAM algorithm is combined with the detection and segmentation results of the targets to construct and store the three-dimensional semantic map of the environment in a colored octree map structure. When the system was run in a laboratory environment, it constructed a readable and accurate three-dimensional semantic map while performing self-localization, posture estimation, and semantic perception, which demonstrates the feasibility of the semantic mapping scheme of the present invention.
As a specific implementation of identifying the user's gaze behavior in Embodiment 1 of the present invention, as shown in Fig. 3A, Embodiment 3 of the present invention includes:
S301: obtain an eye image of the user, perform pupil positioning on the eye image, and determine the user's gaze region in the 3D semantic map based on the obtained pupil position information.
In the embodiment of the present invention, the user's line of sight can be tracked based on the user's pupil and Purkinje image, and the region at which the user is gazing can be determined; the position of the pupil in the eye therefore needs to be determined first. The specific pupil recognition algorithm can be set by the technician as required. It includes but is not limited to training a neural network model on sample pupil images and using it to identify the pupil in the eye image, or performing the identification with reference to Embodiment 4 of the present invention. Because gaze tracking based on the pupil and Purkinje image is a relatively mature technology, it is not described in detail here.
S302: identify the articles contained in the gaze region, and count how long each article contained in the gaze region has been continuously present in the gaze region.
S303: if any continuous presence duration is greater than a preset duration threshold, determine that the user is gazing at an article contained in the 3D semantic map.
After the user's gaze region has been determined, the articles contained in the gaze region must also be determined, together with the continuous duration for which each article is present in the gaze region (i.e., how long the user has continuously gazed at the article). If the duration is long, the user is exhibiting the behavior of gazing at a certain article. The specific value of the preset duration threshold can be set by the technician.
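Steps S302 and S303 can be sketched as a dwell-time check over timestamped gaze samples. The input format (a list of timestamps paired with the set of article identifiers currently in the gaze region) and the function name are assumptions for illustration.

```python
def gazed_items(samples, min_duration):
    """Return the articles whose continuous presence exceeds min_duration.

    samples: list of (timestamp_in_seconds, set_of_article_ids_in_gaze_region),
    ordered by time.
    """
    start = {}     # article -> time at which its current continuous presence began
    gazed = set()
    for t, items in samples:
        # An article that left the gaze region loses its accumulated duration.
        for item in list(start):
            if item not in items:
                del start[item]
        for item in items:
            start.setdefault(item, t)
            if t - start[item] > min_duration:
                gazed.add(item)
    return gazed
```

An article that drops out of the gaze region even once starts again from zero, which matches the requirement that the presence be continuous.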
As an embodiment of the present invention, the user's head may move, so the line of sight can change while the gaze tracking described above is performed. To achieve more accurate gaze tracking, the embodiment of the present invention includes:
Natural head movement includes two basic movements: pitching in the vertical direction and left-right movement in the horizontal direction. The usual mapping model based on a quadratic polynomial is obtained by multi-point calibration while the user keeps the head still; when the head position changes, the error of the gaze point estimated by that model increases greatly. On this basis, a head-movement solving algorithm based on polynomial mapping is proposed. The algorithm uses a head motion tracking device to obtain head movement information in real time. When the head moves, the polynomial mapping model obtained at calibration is first used to make a preliminary estimate of the gaze point position. The coordinates of this estimated point are combined with the initial eye position to establish a three-dimensional gaze direction, the head movement information is then used to apply rotation and translation compensation to that gaze direction, and the intersection of the compensated gaze direction with the screen is taken as the final estimated gaze point coordinates. Furthermore, because the influence of head movement actually comes from the change it causes in the position of the eyes, the line of sight can be compensated as long as the change in eye position is known.
Based on the points above, the gaze estimation model established under natural head movement is shown in Fig. 3B, which is a side plan view of the gaze estimation principle. O1 denotes the initial eye position and O2 the position of the eye after movement. At the initial position, gazing at point S1 on the screen yields the pupil-cornea vector pccr1. After the head rotates to the right about the Y axis by the angle α, the eye moves to position O2. Suppose pccr2 equals pccr1, i.e., the eye image features do not change at all; the corresponding line of sight then deflects to the right, and its intersection with the screen, which we take to be S2, is the gaze point at that moment. However, because the pupil-cornea vector has not changed, estimating the gaze with pccr2 gives a large error. The point estimated this way is taken as the preliminary estimate; the gaze direction g1 is established from this point and the initial eye position, the angle information of the head movement is used to correct g1, and the intersection of the corrected gaze direction with the screen is the current estimated gaze point.
As an implementation of pupil positioning in Embodiment 3 of the present invention, as shown in Fig. 4A, Embodiment 4 of the present invention includes:
S401: divide the eye image into N × M region images and apply gray-scale binarization to all region images to obtain the corresponding N × M eye gray values, where N and M are positive integers.
As shown in Fig. 4B, the eye features in the embodiment of the present invention are as follows. The basic rectangular blocks in the figure are all the same size. A, B, C, and D are the most primitive rectangular features; E consists of 3 basic rectangles, F of 9 rectangles, G is a single rectangle, H and I each consist of 4 basic rectangles, J of 12 rectangles, and K and L of 4 rectangles each. Each rectangular feature is calculated as the pixel sum of the black part of the figure minus the pixel sum of the white part; feature G is a single rectangular feature, so only the pixel sum within the rectangle is calculated. The eye features are designed around the structure of the eye. The brightness of the eye corner differs from that of its surroundings, the corner being darker than the surrounding pixels, and feature F captures this well. The center of the eyeball is essentially black, so the simple rectangle G represents this property of the eyeball. In the horizontal direction of the eye there are two obvious pixel transitions, from sclera to iris and then from iris to sclera, and feature H reflects this change. Similarly, there is a comparable gray-level change in the vertical direction of the eye, and feature I is generated from it. Together with C and E, these features strengthen the description of the horizontal and vertical gray-level changes of the eye. J represents the gray-level change of the part between the eye corner and the eyeball, and K and L represent the edge information of the eye corner. Adding the new features reduces the number of eye features needed for classification and makes eye detection easier.
Pupil positioning in the embodiment of the present invention uses a frame-independent design. Close inspection of eye screenshots shows that many parts of the eye image signal vary with the incident light, with the person, and with specular reflection from the eye; moreover, as the rotation angle of the face changes, the relative position at which they appear in the screenshot also changes, so direct classification training gives unsatisfactory results. After many comparisons, the pupil region was found to be the more stable image information in eye screenshots, and its features are distinct when the eyes are open. The pupil is therefore located first, which reduces the complexity of the problem.
In the embodiment of the present invention, N and M are positive integers whose specific values are set by the technician. Taking M = N = 10 as an example, the eye image is divided into 10 × 10 region images and the gray value of each region image is calculated, giving 10 × 10 eye gray values.
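The division into region images in S401 can be sketched as a block-mean computation. This sketch covers only the gridding and takes the mean gray level of each block as its eye gray value; the binarization mentioned in S401 is not modeled, and the function name and the assumption that the image dimensions divide evenly by N and M are illustrative.

```python
def region_gray_values(image, n, m):
    """Split a 2D gray image (list of rows) into n x m blocks; return block means.

    Assumes the image height is divisible by n and the width by m.
    """
    height, width = len(image), len(image[0])
    block_h, block_w = height // n, width // m
    means = []
    for r in range(n):
        row = []
        for c in range(m):
            block = [image[y][x]
                     for y in range(r * block_h, (r + 1) * block_h)
                     for x in range(c * block_w, (c + 1) * block_w)]
            row.append(sum(block) / len(block))
        means.append(row)
    return means
```

With N = M = 10, the result is the 10 × 10 grid of eye gray values used in the later steps.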
S402: obtain a skin image of the user, and calculate the average gray value of the skin image after gray-scale binarization.
To achieve a trade-off between the skin gray level and the minimum gray level, the embodiment of the present invention also acquires a skin image of the user and calculates its average gray value. The skin image is preferably an image of the skin around the eyes.
S403: in ascending order of absolute difference, select region images from among those whose absolute difference between the eye gray value and the average gray value is less than a preset gray threshold, until the number of pixels contained in the selected region images falls within a preset quantity threshold range; the selected region images correspond to the pupil, which determines the position of the pupil in the eye image.
After the eye gray value of each region image and the average gray value of the skin have been obtained, the embodiment of the present invention calculates, for each region image, the absolute difference between its eye gray value and the average gray value of the skin, and selects the region images that satisfy the preset gray threshold one at a time in ascending order. Each time a region image is selected, the total number of pixels contained in all selected region images is counted, until that number falls within the preset quantity range, which ensures that the pupil is identified accurately and reliably. The specific values of the preset gray threshold and the preset quantity threshold can be set by the technician.
As a specific method of performing scene recognition on the 3D semantic map in Embodiment 1 of the present invention, the embodiment of the present invention includes:
To classify each picture individually, the Places205-AlexNet network model is used here; its performance on various benchmark datasets exceeds that of other methods. The Places205-AlexNet network model follows the same network architecture as AlexNet but is trained specifically for the scene classification task. The training dataset contains about 2.5 million pictures in 205 semantic categories, with at least 5,000 sample pictures per category. The images come from various Internet sources such as Google Images, Bing, and Flickr, and are annotated by category. The number and variety of samples in the training dataset ensure that the resulting classifier generalizes well, so that it needs no retraining or fine-tuning when applied in environments it was not trained on. This makes the semantic mapping system described here portable and usable by users operating in a variety of settings.
The input to the Places205-AlexNet network model is resized to an RGB picture of 224 × 224 × 3 pixels, regardless of the original size. The Places205-AlexNet convolutional neural network has 8 layers in total: 5 convolutional layers followed by 3 fully connected layers. Given a picture I_t of the current scene, the soft-max output layer of the network outputs a discrete probability distribution p(o_t | I_t) over the 205 known scene classes. The classifier here uses the output of the fc7 layer of the Places205-AlexNet network model as a feature vector; fc7 is the last general (i.e. class-independent) fully connected layer in the network. Because the Places205-AlexNet network model is designed to recognize the 205 scene categories in the Places205 scene dataset, the final fc8 layer and the prob layer each have 205 output neurons.
Assume that, for a given current image I_t, the prob output layer of the network outputs a discrete probability distribution p(o_t | I_t) over the 205 known classes C_i. A combined vector of the known scene class labels is introduced, and the corresponding combined likelihood is defined as:
where each term denotes the probability that the image I_t at time t belongs to scene class C_i, the classes C_i being mutually independent. Because two adjacent pictures acquired by the camera are continuous in time, recursive Bayesian filtering can be applied. Here the robot scene classification problem is formulated as a probability estimation problem: estimate the discrete probability distribution over all possible scene labels C_i, given all observed pictures from the past up to the present. Assuming the first-order Markov property holds, the following Bayesian filter formula is obtained:
Once scene classification is treated as a Bayesian estimation problem, other sources of information can be integrated naturally. For example, because an indoor service robot works indoors, the outdoor scenes among the 205 scene classes are essentially never observed.
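The recursive estimate described above can be illustrated with a standard discrete Bayes filter under a first-order Markov assumption. This is a textbook sketch and not necessarily the exact formula used in this embodiment: the transition matrix, the function name, and the fallback for an all-zero posterior are assumptions made for the example.

```python
import numpy as np

def bayes_filter_step(prior_belief, transition, likelihood):
    """One predict/update step over discrete scene labels.

    prior_belief: p(c_{t-1} | I_{1:t-1}), shape (K,)
    transition:   transition[i, j] = p(c_t = j | c_{t-1} = i), shape (K, K)
                  (hypothetical; e.g. mostly diagonal for slow scene changes)
    likelihood:   per-class classifier output for the current image, shape (K,)
    """
    prior_belief = np.asarray(prior_belief, dtype=float)
    transition = np.asarray(transition, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)

    # Prediction: propagate the previous belief through the Markov model.
    predicted = transition.T @ prior_belief
    # Update: weight the prediction by the current observation likelihood.
    posterior = likelihood * predicted
    total = posterior.sum()
    if total == 0.0:
        # Observation rules out every class: fall back to the prediction.
        return predicted / predicted.sum()
    return posterior / total
```

Prior knowledge such as "outdoor classes are never observed indoors" could be folded in by setting the likelihood entries of those classes to zero before the update.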
As embodiment five of the present invention, in practice there may be more than one operation tutorial that satisfies the scene type, article conditions, and user data. To ensure that the operation tutorial finally determined is the one the user actually needs, as shown in Fig. 5, the embodiment of the present invention includes:
S501: obtain multiple operation modes.
S502: obtain a mode selection instruction from the user and, based on the mode selection instruction, select one operation mode from the multiple operation modes.
When several operation tutorials meet the requirements, the embodiment of the present invention lets the user choose the one needed for the subsequent operation. The mode selection instruction may be input in ways including, but not limited to, voice, eye movement, or a change of head posture. For example, information about the candidate operation tutorials is displayed in the smart glasses, and the user inputs the mode selection instruction by voice, by an eye movement operation (such as blinking or gazing), or by a change of head posture (such as shaking the head to switch tutorials and nodding to confirm). The specific instruction interaction methods are set by the technician according to actual needs.
As an implementation of eye-movement-based information interaction between the user and the smart glasses in the embodiments of the present invention, the user can perform a variety of eye movement operations to interact with the smart glasses, including:
1) Determining whether the behavior is reading:
This marks the main fixation points of reading behavior (the fixation target has been confirmed and no further recognition time is needed), whose duration falls within a preset range, preferably 600 to 1100 milliseconds.
2) Determining regression (look-back) behavior:
A regression can be defined from saccade data: the fixation point coordinates fall within a circle centered on one of the preceding 5 fixation points with a radius of 1 gaze region, where the immediately preceding fixation point is not counted.
3) Determining whether the behavior is a global focus change:
A global focus change is a saccade longer than 3 gaze regions following the previous main fixation point (one whose duration is at least the preset duration, preferably 1100 milliseconds or more).
4) Determining a local focus change:
A local focus change occurs when, following the previous main fixation point (one whose duration is at least the preset duration, preferably 1100 milliseconds or more), several saccades each span no more than 3 gaze regions, but the total change in position exceeds 3 gaze regions.
5) Determining search behavior:
Search: when a global focus change or a local focus change occurs, the user is considered to be searching. A global focus change is typical search behavior, whereas a local focus change indicates that the user believes the approximate region has been found and is continuing to look for a specific target.
Sustained search: when saccades of 10 gaze regions or more occur continuously following the previous main fixation point (one whose duration is at least the preset duration, preferably 1100 milliseconds or more), the user is considered to be thinking, distracted, or resting.
In the embodiments of the present invention, the user can control the smart glasses by eye movement using the above methods, so as to carry out the various human-computer interactions required by other embodiments of the present invention.
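The eye movement rules above can be sketched as simple threshold checks. This is an illustrative sketch only: the function names, the inclusive or exclusive handling of boundary values, the order in which the rules are tested, and the requirement of at least two saccades for "continuous" saccades are assumptions, while the numeric thresholds are the preferred values given in the text.

```python
import math

READING_MS = (600, 1100)   # preferred reading fixation duration range
MAIN_FIXATION_MS = 1100    # preferred minimum duration of a main fixation
FOCUS_REGIONS = 3          # saccade length threshold, in gaze regions
SUSTAINED_REGIONS = 10     # threshold for sustained search, in gaze regions

def is_reading_fixation(duration_ms):
    """Rule 1: fixation duration inside the reading range."""
    return READING_MS[0] <= duration_ms <= READING_MS[1]

def is_regression(point, previous_fixations, region_radius):
    """Rule 2: point lies within one gaze-region radius of one of the
    preceding 5 fixations, not counting the immediately preceding one.
    previous_fixations is ordered oldest to newest."""
    window = previous_fixations[-5:-1]
    return any(math.dist(point, p) <= region_radius for p in window)

def classify_after_fixation(fixation_ms, saccade_lengths, net_displacement):
    """Rules 3 to 5: classify the saccades that follow a fixation.
    saccade_lengths and net_displacement are measured in gaze regions."""
    if fixation_ms < MAIN_FIXATION_MS or not saccade_lengths:
        return "none"
    if len(saccade_lengths) > 1 and all(
            s >= SUSTAINED_REGIONS for s in saccade_lengths):
        return "sustained_search"      # thinking, distracted, or resting
    if any(s > FOCUS_REGIONS for s in saccade_lengths):
        return "global_focus_change"   # typical search behavior
    if len(saccade_lengths) > 1 and net_displacement > FOCUS_REGIONS:
        return "local_focus_change"    # refining within a found region
    return "none"
```

For example, `classify_after_fixation(1200, [2, 2], 4)` yields a local focus change, because each saccade stays within 3 gaze regions while the total displacement exceeds 3.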
As a specific implementation of user operation monitoring in embodiment one of the present invention, as shown in Fig. 6, embodiment six of the present invention includes:
S601: identify the user's behavior when operating the article, and judge whether the behavior meets the requirements of the operating step.
S602: obtain the preset attribute threshold corresponding to the article involved in the operation mode, and identify whether the article attribute data of the article meets the preset attribute threshold while the user operates the article.
S603: if the behavior does not meet the requirements of the operating step and/or the article attribute data does not meet the preset attribute threshold, determine that the user's operation of the article does not meet the requirements of the operating step.
Even when the user's operation behavior appears correct, it is sometimes hard to tell whether every individual step is entirely accurate. For example, the user may indeed power on, then select a mode, then start, yet select the wrong mode; the overall operation still goes wrong. Therefore, to detect the user's operating problems in time, the embodiment of the present invention monitors not only the user's operation behavior but also the attribute data of the article being operated, so that errors of this kind do not go unrecognized.
Specifically, in the embodiment of the present invention, a corresponding attribute threshold can be set for each article involved in the operation tutorial. For example, in a cooking tutorial, the oven's cooking time is set to 1 hour and its temperature to 120 degrees. During the user's operation, the attribute data of the article is identified, and it is judged whether the change in article attribute data caused by the operation falls within the threshold requirement. In this example, it is judged whether the cooking time and temperature set by the user on the oven are 1 hour and 120 degrees (these can be identified from images captured by the smart glasses while the user sets the oven).
Either an error in the user's behavior (such as pressing the wrong button) or article attribute data caused by the user's operation that does not meet the requirements (such as a wrong oven temperature setting) indicates that the user's operation is wrong. In both cases, the embodiment of the present invention determines that the user's operation of the article does not meet the requirements of the operating step in the operation tutorial.
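The combined check of steps S601 to S603 can be sketched as below. This is a minimal illustration under stated assumptions: the function name, the dictionary representation of attribute data, and the `tolerance` parameter are hypothetical, and how the behavior judgment and attribute values are obtained from the glasses' images is outside the scope of the sketch.

```python
def operation_meets_step(behavior_ok, observed, expected, tolerance=0.0):
    """Return (ok, mismatched) for one operating step.

    behavior_ok: result of the behavior check of S601.
    observed:    attribute values read from the article (S602).
    expected:    preset attribute thresholds for the step.
    mismatched maps each failing attribute to (observed value, expected value).
    """
    mismatched = {}
    for name, target in expected.items():
        value = observed.get(name)
        # A missing reading or a value outside the tolerance is a mismatch.
        if value is None or abs(value - target) > tolerance:
            mismatched[name] = (value, target)
    # S603: the step fails if the behavior or any attribute check fails.
    return behavior_ok and not mismatched, mismatched


# Oven example from the text: 1 hour at 120 degrees.
expected = {"cook_time_h": 1, "temperature": 120}
ok, diff = operation_meets_step(
    True, {"cook_time_h": 1, "temperature": 150}, expected)
```

In the usage lines, the behavior looks correct but the temperature setting is wrong, so `ok` is false and `diff` names the offending attribute, which is the information an operation prompt would need.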
Corresponding to the methods of the foregoing embodiments, Fig. 7 shows a structural block diagram of the operation prompting device provided by an embodiment of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown. The operation prompting device illustrated in Fig. 7 may be the executing body of the operation prompting method provided in embodiment one.
Referring to Fig. 7, the operation prompting device includes:
Map construction module 71, configured to obtain an image of the environment in which the user is located and to construct a 3D semantic map of that environment based on the image.
Gaze identification module 72, configured to perform eye movement recognition on the user and judge whether the user is gazing at an article contained in the 3D semantic map.
Mode acquisition module 73, configured to obtain, if the user is gazing at an article contained in the 3D semantic map, the operation mode corresponding to the 3D semantic map, the operation mode containing operating steps for one or more articles.
Operation monitoring module 74, configured to monitor whether the user's operation of the article meets the requirements of the operating step.
Operation prompting module 75, configured to output the operation prompt corresponding to the operating step if the user's operation of the article does not meet the requirements of the operating step.
Further, the image includes a color image and a depth image, and map construction module 71 is configured to:
Obtain, based on the color image and the depth image, the position information and posture information of the user, and the position information and article information of the articles in the environment in which the user is located.
Construct the 3D semantic map according to the position information and posture information of the user and the article information and position information of the articles.
Further, gaze identification module 72 includes:
Pupil locating module, configured to obtain an eye image of the user, locate the pupil in the eye image, and determine the user's gaze region in the 3D semantic map based on the obtained pupil position information.
Duration statistics module, configured to identify the articles contained in the gaze region and to count how long each article contained in the gaze region remains continuously present in the gaze region.
Gaze confirmation module, configured to determine that the user is gazing at an article contained in the 3D semantic map if any of the continuous presence durations is greater than a preset duration threshold.
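The dwell-time logic of the duration statistics and gaze confirmation modules can be sketched as follows. This is a hedged illustration: the sample format (a timestamp paired with the set of article identifiers currently inside the gaze region) and the function name are assumptions, and article recognition itself is taken as given.

```python
def gazed_articles(samples, dwell_threshold_s):
    """Return the articles whose continuous presence in the gaze region
    exceeds the preset duration threshold.

    samples: time-ordered list of (timestamp_s, set_of_article_ids).
    """
    start = {}        # article -> time its current continuous presence began
    confirmed = set()
    for t, articles in samples:
        # An article that left the gaze region loses its accumulated time.
        for article in list(start):
            if article not in articles:
                del start[article]
        for article in articles:
            start.setdefault(article, t)
            if t - start[article] > dwell_threshold_s:
                confirmed.add(article)
    return confirmed


samples = [(0.0, {"oven"}), (0.5, {"oven", "cup"}),
           (1.0, {"oven"}), (1.5, {"oven", "cup"})]
result = gazed_articles(samples, 1.2)
```

In this usage example the oven stays in the gaze region for 1.5 seconds and is confirmed, while the cup leaves and re-enters, so its duration restarts and it is not confirmed.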
Further, the pupil locating module is configured to:
Divide the eye image into N × M area images, and perform gray-value binarization on all the area images to obtain the corresponding N × M eye gray values, where N and M are positive integers.
Obtain a skin image of the user, and calculate the average gray value of the skin image after gray-value binarization.
In ascending order of absolute difference, select area images from among those whose absolute difference between the corresponding eye gray value and the average gray value is less than a preset gray threshold, until the number of pixels contained in the selected area images falls within a preset quantity threshold range, thereby obtaining the area images corresponding to the pupil and determining the position of the pupil in the eye image.
Further, mode acquisition module 73 is configured to:
Identify the scene type corresponding to the 3D semantic map.
Obtain the corresponding operation mode based on the scene type and the articles contained in the 3D semantic map.
Further, mode acquisition module 73 also includes:
Scene recognition module, configured to identify the scene type corresponding to the 3D semantic map and to obtain the user data of the user.
Mode decision module, configured to obtain the corresponding operation mode based on the scene type, the articles contained in the 3D semantic map, and the user data.
Further, the mode decision module is configured to:
Obtain multiple operation modes.
Obtain a mode selection instruction from the user and, based on the mode selection instruction, select one operation mode from the multiple operation modes.
Further, operation monitoring module 74 is configured to:
Identify the user's behavior when operating the article, and judge whether the behavior meets the requirements of the operating step.
Obtain the preset attribute threshold corresponding to the article involved in the operation mode, and identify whether the article attribute data of the article meets the preset attribute threshold while the user operates the article.
If the behavior does not meet the requirements of the operating step and/or the article attribute data does not meet the preset attribute threshold, determine that the user's operation of the article does not meet the requirements of the operating step.
Further, operation prompting module 75 is configured to:
Identify the user's gaze region in the 3D semantic map, and output the operation prompt in augmented reality based on the gaze region.
For the process by which each module of the operation prompting device provided by the embodiment of the present invention implements its function, refer to the description of embodiment one shown in Fig. 1 above; details are not repeated here.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
It should also be understood that although the terms "first", "second", and so on are used in some embodiments of the present invention to describe various elements, these elements should not be limited by these terms. The terms are used only to distinguish one element from another. For example, a first table could be named a second table, and similarly a second table could be named a first table, without departing from the scope of the described embodiments. The first table and the second table are both tables, but they are not the same table.
Fig. 8 is a schematic diagram of the glasses provided by an embodiment of the present invention. As shown in Fig. 8, the glasses 8 of this embodiment include a processor 80 and a memory 81, the memory 81 storing a computer program 82 that can run on the processor 80. When executing the computer program 82, the processor 80 implements the steps of the operation prompting method embodiments above, for example steps 101 to 105 shown in Fig. 1. Alternatively, when executing the computer program 82, the processor 80 implements the functions of the modules/units in the device embodiments above, for example the functions of modules 71 to 75 shown in Fig. 7.
The processor 80 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 81 may be an internal storage unit of the glasses 8, such as a hard disk or internal memory of the glasses 8. The memory 81 may also be an external storage device of the glasses 8, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card) fitted to the glasses 8. Further, the memory 81 may include both an internal storage unit of the glasses 8 and an external storage device. The memory 81 is used to store the computer program and other programs and data required by the glasses. The memory 81 may also be used to temporarily store data that has been sent or is to be sent.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.
If the integrated module/unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes of the above method embodiments may also be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each method embodiment above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on.
The embodiments described above are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention.
Claims (10)
1. An operation prompting method, characterized by comprising:
obtaining an image of the environment in which a user is located, and constructing a 3D semantic map of that environment based on the image;
performing eye movement recognition on the user, and judging whether the user is gazing at an article contained in the 3D semantic map;
if the user is gazing at an article contained in the 3D semantic map, obtaining the operation mode corresponding to the 3D semantic map, the operation mode containing operating steps for one or more articles;
monitoring whether the user's operation of the article meets the requirements of the operating step;
if the user's operation of the article does not meet the requirements of the operating step, outputting the operation prompt corresponding to the operating step.
2. The operation prompting method according to claim 1, characterized in that the image includes a color image and a depth image, and obtaining the image of the environment in which the user is located and constructing the 3D semantic map of that environment based on the image comprises:
obtaining, based on the color image and the depth image, the position information and posture information of the user, and the position information and article information of the articles in the environment in which the user is located;
constructing the 3D semantic map according to the position information and posture information of the user and the article information and position information of the articles.
3. The operation prompting method according to claim 1, characterized in that performing eye movement recognition on the user and judging whether the user is gazing at an article contained in the 3D semantic map comprises:
obtaining an eye image of the user, locating the pupil in the eye image, and determining the user's gaze region in the 3D semantic map based on the obtained pupil position information;
identifying the articles contained in the gaze region, and counting how long each article contained in the gaze region remains continuously present in the gaze region;
if any of the continuous presence durations is greater than a preset duration threshold, determining that the user is gazing at an article contained in the 3D semantic map.
4. The operation prompting method according to claim 3, characterized in that locating the pupil in the eye image comprises:
dividing the eye image into N × M area images, and performing gray-value binarization on all the area images to obtain the corresponding N × M eye gray values, where N and M are positive integers;
obtaining a skin image of the user, and calculating the average gray value of the skin image after gray-value binarization;
in ascending order of absolute difference, selecting area images from among those whose absolute difference between the corresponding eye gray value and the average gray value is less than a preset gray threshold, until the number of pixels contained in the selected area images falls within a preset quantity threshold range, thereby obtaining the area images corresponding to the pupil and determining the position of the pupil in the eye image.
5. The operation prompting method according to claim 1, characterized in that obtaining the operation mode corresponding to the 3D semantic map comprises:
identifying the scene type corresponding to the 3D semantic map;
obtaining the corresponding operation mode based on the scene type and the articles contained in the 3D semantic map.
6. The operation prompting method according to claim 1, characterized in that obtaining the operation mode corresponding to the 3D semantic map comprises:
identifying the scene type corresponding to the 3D semantic map, and obtaining the user data of the user;
obtaining the corresponding operation mode based on the scene type, the articles contained in the 3D semantic map, and the user data.
7. The operation prompting method according to claim 6, characterized in that obtaining the operation mode comprises:
obtaining multiple operation modes;
obtaining a mode selection instruction from the user and, based on the mode selection instruction, selecting one operation mode from the multiple operation modes.
8. The operation prompting method according to claim 1, characterized in that monitoring whether the user's operation of the article meets the requirements of the operating step comprises:
identifying the user's behavior when operating the article, and judging whether the behavior meets the requirements of the operating step;
obtaining the preset attribute threshold corresponding to the article involved in the operation mode, and identifying whether the article attribute data of the article meets the preset attribute threshold while the user operates the article;
if the behavior does not meet the requirements of the operating step and/or the article attribute data does not meet the preset attribute threshold, determining that the user's operation of the article does not meet the requirements of the operating step.
9. The operation prompting method according to claim 1, characterized in that outputting the operation prompt corresponding to the operating step comprises:
identifying the user's gaze region in the 3D semantic map, and outputting the operation prompt in augmented reality based on the gaze region.
10. Glasses, characterized in that the glasses include a memory and a processor, the memory storing a computer program that can run on the processor, and the processor, when executing the computer program, implementing the steps of the method according to any one of claims 1 to 9.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811543901.XA CN109782902A (en) | 2018-12-17 | 2018-12-17 | A kind of operation indicating method and glasses |
PCT/CN2019/124352 WO2020125499A1 (en) | 2018-12-17 | 2019-12-10 | Operation prompting method and glasses |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811543901.XA CN109782902A (en) | 2018-12-17 | 2018-12-17 | A kind of operation indicating method and glasses |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109782902A true CN109782902A (en) | 2019-05-21 |
Family
ID=66497395
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811543901.XA Pending CN109782902A (en) | 2018-12-17 | 2018-12-17 | A kind of operation indicating method and glasses |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109782902A (en) |
WO (1) | WO2020125499A1 (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110689622A (en) * | 2019-07-05 | 2020-01-14 | 电子科技大学 | Synchronous positioning and composition algorithm based on point cloud segmentation matching closed-loop correction |
CN110908768A (en) * | 2019-10-31 | 2020-03-24 | 北京浪潮数据技术有限公司 | Operation and maintenance method, device and equipment for virtualization platform and readable storage medium |
WO2020125499A1 (en) * | 2018-12-17 | 2020-06-25 | 中国科学院深圳先进技术研究院 | Operation prompting method and glasses |
CN111507195A (en) * | 2020-03-20 | 2020-08-07 | 北京万里红科技股份有限公司 | Iris segmentation neural network model training method, iris segmentation method and device |
CN111652155A (en) * | 2020-06-04 | 2020-09-11 | 北京航空航天大学 | Human body movement intention identification method and system |
CN113728328A (en) * | 2020-03-26 | 2021-11-30 | 艾思益信息应用技术股份公司 | Information processing apparatus, information processing method, and computer program |
CN115097903A (en) * | 2022-05-19 | 2022-09-23 | 深圳智华科技发展有限公司 | MR glasses control method and device, MR glasses and storage medium |
US11934446B2 (en) | 2019-03-29 | 2024-03-19 | Information System Engineering Inc. | Information providing system |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112263217B (en) * | 2020-08-27 | 2023-07-18 | 上海大学 | Improved convolutional neural network-based non-melanoma skin cancer pathological image lesion area detection method |
SE2051359A1 (en) * | 2020-11-20 | 2022-05-21 | Wiretronic Ab | Method and system for compliance determination |
CN114565815B (en) * | 2022-02-25 | 2023-11-03 | 包头市迪迦科技有限公司 | Video intelligent fusion method and system based on three-dimensional model |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120075343A1 (en) * | 2010-09-25 | 2012-03-29 | Teledyne Scientific & Imaging, Llc | Augmented reality (ar) system and method for tracking parts and visually cueing a user to identify and locate parts in a scene |
CN103838378A (en) * | 2014-03-13 | 2014-06-04 | 广东石油化工学院 | Head wearing type eye control system based on pupil recognition positioning |
CN106202269A (en) * | 2016-06-28 | 2016-12-07 | 广东欧珀移动通信有限公司 | A kind of obtain the method for augmented reality Operating Guideline, device and mobile terminal |
CN106774876A (en) * | 2016-12-12 | 2017-05-31 | 大连文森特软件科技有限公司 | Based on the culinary art accessory system that AR augmented realities and menu are generated |
CN107066507A (en) * | 2017-01-10 | 2017-08-18 | 中国人民解放军国防科学技术大学 | A kind of semantic map constructing method that cloud framework is mixed based on cloud robot |
CN107564012A (en) * | 2017-08-01 | 2018-01-09 | 中国科学院自动化研究所 | Towards the augmented reality method and device of circumstances not known |
CN107656505A (en) * | 2017-08-21 | 2018-02-02 | 杭州太若科技有限公司 | Use the methods, devices and systems of augmented reality equipment control man-machine collaboration |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9851803B2 (en) * | 2013-03-15 | 2017-12-26 | Eyecam, LLC | Autonomous computing and telecommunications head-up displays glasses |
CN104965314A (en) * | 2015-08-03 | 2015-10-07 | 陈丽晓 | Intelligent glasses |
US9829976B2 (en) * | 2015-08-07 | 2017-11-28 | Tobii Ab | Gaze direction mapping |
CN106557168A (en) * | 2016-11-23 | 2017-04-05 | 上海擎感智能科技有限公司 | Intelligent glasses and its control method, control device |
CN109782902A (en) * | 2018-12-17 | 2019-05-21 | 中国科学院深圳先进技术研究院 | A kind of operation indicating method and glasses |
-
2018
- 2018-12-17 CN CN201811543901.XA patent/CN109782902A/en active Pending
-
2019
- 2019-12-10 WO PCT/CN2019/124352 patent/WO2020125499A1/en active Application Filing
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020125499A1 (en) * | 2018-12-17 | 2020-06-25 | 中国科学院深圳先进技术研究院 | Operation prompting method and glasses |
US11934446B2 (en) | 2019-03-29 | 2024-03-19 | Information System Engineering Inc. | Information providing system |
CN110689622A (en) * | 2019-07-05 | 2020-01-14 | 电子科技大学 | Simultaneous localization and mapping algorithm based on point cloud segmentation matching and closed-loop correction |
CN110689622B (en) * | 2019-07-05 | 2021-08-27 | 电子科技大学 | Simultaneous localization and mapping method based on point cloud segmentation matching and closed-loop correction |
CN110908768B (en) * | 2019-10-31 | 2022-07-05 | 北京浪潮数据技术有限公司 | Operation and maintenance method, device and equipment for virtualization platform and readable storage medium |
CN110908768A (en) * | 2019-10-31 | 2020-03-24 | 北京浪潮数据技术有限公司 | Operation and maintenance method, device and equipment for virtualization platform and readable storage medium |
CN111507195B (en) * | 2020-03-20 | 2023-10-03 | 北京万里红科技有限公司 | Iris segmentation neural network model training method, iris segmentation method and device |
CN111507195A (en) * | 2020-03-20 | 2020-08-07 | 北京万里红科技股份有限公司 | Iris segmentation neural network model training method, iris segmentation method and device |
CN113728328A (en) * | 2020-03-26 | 2021-11-30 | 艾思益信息应用技术股份公司 | Information processing apparatus, information processing method, and computer program |
CN113728328B (en) * | 2020-03-26 | 2024-04-12 | 艾思益信息应用技术股份公司 | Information processing apparatus and information processing method |
CN111652155A (en) * | 2020-06-04 | 2020-09-11 | 北京航空航天大学 | Human body movement intention identification method and system |
CN115097903A (en) * | 2022-05-19 | 2022-09-23 | 深圳智华科技发展有限公司 | MR glasses control method and device, MR glasses and storage medium |
CN115097903B (en) * | 2022-05-19 | 2024-04-05 | 深圳智华科技发展有限公司 | MR glasses control method and device, MR glasses and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2020125499A9 (en) | 2020-12-17 |
WO2020125499A1 (en) | 2020-06-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109782902A (en) | | A kind of operation indicating method and glasses |
US20210073953A1 (en) | | Method for applying bokeh effect to image and recording medium |
CN106897670B (en) | | Computer-vision-based method for identifying violent sorting of express parcels |
CN104049754B (en) | | Real time hand tracking, posture classification and interface control |
CN108427503B (en) | | Human eye tracking method and human eye tracking device |
CN109948447B (en) | | Character network relation discovery and evolution presentation method based on video image recognition |
CN108985210A (en) | | Gaze tracking method and system based on geometric features of the human eye |
CN100440246C (en) | | Method for locating facial feature points |
JPH11175246A (en) | | Sight line detector and method therefor |
CN108470354A (en) | | Video target tracking method, device and implementation device |
CN106951870B (en) | | Intelligent detection and early warning method for active visual attention of significant events of surveillance video |
CN106133750A (en) | | 3D image analyzer for determining gaze direction |
CN112732071B (en) | | Calibration-free eye movement tracking system and application |
MX2012010602A (en) | | Face recognizing apparatus, and face recognizing method. |
CN111460976B (en) | | Data-driven real-time hand motion assessment method based on RGB video |
CN113592911B (en) | | Apparent enhanced depth target tracking method |
CN106599785A (en) | | Method and device for building human body 3D feature identity information database |
CN109325408A (en) | | Gesture judgment method and storage medium |
CN111860091A (en) | | Face image evaluation method and system, server and computer readable storage medium |
CN113435236A (en) | | Home old man posture detection method, system, storage medium, equipment and application |
KR102160128B1 (en) | | Method and apparatus for creating smart albums based on artificial intelligence |
CN110929570B (en) | | Iris rapid positioning device and positioning method thereof |
CN116682140A (en) | | Three-dimensional human body posture estimation algorithm based on attention mechanism multi-mode fusion |
CN116030519A (en) | | Learning attention detection and assessment method for live broadcast teaching platform |
CN117541994A (en) | | Abnormal behavior detection model and detection method in dense multi-person scene |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190521 |