WO2017182523A1 - A method and a system for real-time remote support with use of computer vision and augmented reality

A method and a system for real-time remote support with use of computer vision and augmented reality

Info

Publication number
WO2017182523A1
Authority
WO
WIPO (PCT)
Prior art keywords
support
augmented reality
electronic device
computer vision
video
Prior art date
Application number
PCT/EP2017/059290
Other languages
French (fr)
Inventor
Dante MOCCETTI
Fabio Rezzonico
Pietro VERAGOUTH
Antonino TRAMONTE
Lorenzo CAMPO
Jacopo BOSIO
Antonio Leonardo Jacopo MURCIANO
Original Assignee
Newbiquity Sagl
Priority date
Filing date
Publication date
Priority claimed from ITUA2016A002756A external-priority patent/ITUA20162756A1/en
Priority claimed from CH00526/16A external-priority patent/CH712380A2/en
Application filed by Newbiquity Sagl filed Critical Newbiquity Sagl
Publication of WO2017182523A1 publication Critical patent/WO2017182523A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/131Protocols for games, networked simulations or virtual reality
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/50Network services
    • H04L67/75Indicating network or usage conditions on the user display
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/141Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/147Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A method for real-time remote support with use of computer vision and augmented reality, comprising the following steps: providing an augmented reality engine having computer vision programs and residing partly in an electronic device of a user requesting support and partly in a cloud computing data network; and wherein the following steps are carried out in real time: acquiring, with a camera of the electronic device of the user requesting support, a video of a work environment; transmitting and displaying said video on an electronic device of a support provider; affixing, by said support provider, a graphic marker in augmented reality on a dot that is to be tagged in said video; running said computer vision programs so as to display said video containing said graphic marker on the electronic device of said user requesting support, the spatial coordinates of the graphic marker being recalculated so that, in the sequence of images of the video displayed on said electronic device of said user requesting support, the graphic marker permanently points on the tagged dot.

Description

A METHOD AND A SYSTEM FOR REAL-TIME REMOTE SUPPORT WITH USE OF COMPUTER VISION AND AUGMENTED REALITY
DESCRIPTION
The present invention relates to a method and a system for real-time remote support with use of computer vision and augmented reality.
In more detail, the present invention relates to a method for remote maintenance/user support that uses computer vision to enable an expert user to enrich, in real time, the contents of the environment of a user requesting support.
The computer vision mechanisms in use enable the extraction of characteristic features from the environment so as to recognise objects, textures and parts of objects within an image.
The contents affixed to the environment captured by the camera of the user requesting support are "tags", which appear as graphic symbols in augmented reality.
By augmented reality (AR) or processor-mediated reality, is meant the enrichment of human sensory perception by means of information, usually manipulated and conveyed electronically, which would not be perceptible using only the five senses.
In 2009, thanks to technological improvement, augmented reality, already used in very specific fields such as the military, medicine or research, was presented to the wider public both in the form of communication campaigns, i.e. "augmented advertising" published in newspapers or on the web, and through a progressively growing number of smartphone apps for entertainment (games) or for intensifying an experience by enriching the contents associated with the environment.
Maintenance/support systems are known that use augmented reality.
These software systems enable a non-specialised operator who does not know the environment and the object he/she is to operate on to receive a quantity of augmented reality information so as to be able to carry out the requested operation. In general, for maintenance/support with augmented reality techniques, information is used that the system already stores in its memory.
In this case, there is no real-time relationship between the expert operator and the operator requesting support.
In these cases, recognition of the environment is based on recognition of a specific object and the support provided by the system is therefore limited to a particular object on which the system has been "trained" and a rigid sequence of contents.
In these cases, there is no analysis of a variable environment and the recognition of the object can take place either via physical markers affixed to the object or via specific image recognition for which the system is trained.
There also exist "live" systems that envisage an interaction between the user requesting support and the external operator offering that support.
These systems generally involve only the sending of static images.
In other words, the expert operator works on a static frame and the operator requesting support receives a picture on which "illustrative" contents have been affixed.
Lastly, systems can be hypothesised in which a video-call is used to offer live video support.
In these cases, however, the calculations are carried out entirely locally on the processor of the device which either captures the video or adds augmented reality contents. This approach might be effective where the device carrying out the calculations has a powerful processor. On mobile devices, however, two types of problem arise, due to the fact that computer vision calculations used in parallel to a video stream take up much of the CPU.
Firstly, a temporal deviation between the moment when the environment in which the non-expert operator has to operate is captured and the moment when the response information arrives makes it very difficult for the non-expert operator to interpret correctly what to do and where to do it. Secondly, the use of fast - though not very sophisticated - algorithms generates a large number of false positives and false negatives (in other words, erroneous detections).
These problems would emerge very strongly should the user requesting support modify his or her point of observation with respect to the point from which the environment and the object to operate on were detected.
This is because the calculations in this case have to flow more rapidly and must at the same time offer robust results.
The technical task of the invention is to obviate the above-described drawbacks in the prior art, while still providing a system that operates in real time.
In the scope of this technical task, an object of the present invention is to provide a method and a system for real-time remote support with use of computer vision and augmented reality, which ensure efficient and reliable support in real time.
This and other aims of the present invention are attained by a real-time remote support method with use of computer vision and augmented reality, characterised in that it comprises the following steps:
- providing a cloud computing data network;
- providing an augmented reality engine having computer vision programs residing in a mobile electronic device of a user requesting support and in said cloud computing data network;
and wherein the following steps are carried out in real time:
- acquiring, with a camera of the electronic device of the user requesting support, a video of a work environment;
- transmitting and displaying said video on an electronic device of a support provider;
- affixing, by said support provider, a graphic marker in augmented reality on a dot that is to be tagged in said video;
- running said computer vision programs so as to display said video containing said graphic marker on the electronic device of said user requesting support, the spatial coordinates of the graphic marker being recalculated so that, in the sequence of images of the video displayed on said electronic device of said user requesting support, the graphic marker permanently points on the tagged dot.
In a preferred embodiment of the invention, said programs residing in said cloud computing data network operate in an on-demand mode when said programs residing in said electronic device of said user requesting support do not provide a reliable outcome.
In a preferred embodiment of the invention said cloud computing network comprises a streaming server.
In a preferred embodiment of the invention said programs comprise a program for extracting images from a video and transforming the images into dot matrices.
In a preferred embodiment of the invention said programs comprise an augmented reality filter.
In a preferred embodiment of the invention, said augmented reality filter operates on said dot matrices so as to recalculate the spatial coordinates of the graphic marker.
In a preferred embodiment of the invention, said video acquired by said electronic device of said user requesting support is transmitted to said streaming server from which it is in turn transmitted to said electronic device of said support provider which affixes said graphic marker and returns the spatial coordinates of the tagged dot to said streaming server.
In a preferred embodiment of the invention said spatial coordinates are made available to said augmented reality filter in order to be processed.
In a preferred embodiment of the invention said processed spatial coordinates are made available to said electronic device requesting support and to said electronic device providing support.
The present invention also relates to a system for real-time remote support with use of computer vision and augmented reality, characterised in that it comprises the following components: - an electronic device of a user requesting support provided with a camera and display screen and an electronic device of a support provider provided with a display screen;
- a cloud computing data network comprising at least one streaming server;
- an augmented reality engine having computer vision programs residing in said electronic device of said user requesting support and in said cloud computing data network; said augmented reality engine being configured so as to display on said electronic device of the user requesting support a video acquired by said user requesting support to which the support provider has affixed in augmented reality a graphic marker on a dot, the spatial coordinates of which are recalculated so that in the sequence of images of the video the graphic marker points permanently on the tagged dot.
The augmented reality engine thus has computer vision programs divided into two different parts, a first part which draws the augmented reality contents, and a second part, different from the first part, which performs the computer vision calculations necessary for establishing where to draw the augmented reality contents.
The remote support method of the invention is based on a detection of the environment, which is done in duplicate mode on the electronic device of the user requesting support and in the cloud. This system, by maintaining a real-time modality in the interaction between the support provider and the user requesting support, has been shown to be more reliable, including in cases in which the scene changes by changing the position of the user requesting support, and has also been demonstrated to be faster than known systems.
These advantages originate from transferring part of the calculation (the heaviest and most complex part) onto a cloud computing network and therefore using, in parallel, both distributed calculation resources, i.e. local resources in the electronic device of the user requesting support, and central resources, i.e. resources of the cloud computing network. The complex calculations, which use heavy computer vision algorithms, instead of being "locally" performed are performed in the cloud in an on-demand mode when the locally-performed calculations give results that are not substantially reliable, while the lighter calculations are performed locally in a continuous manner.
Therefore, by performing the more laborious calculations in the cloud, integration of the service into third-party applications becomes easier and faster.
At the same time, as the calculations in the cloud are performed only on demand, when the probability of error in the local calculations proves too high, the architecture makes it possible to limit the infrastructural costs of the cloud.
Further, as the calculations are performed in the cloud, it becomes simpler to refine the quality of the positioning process of the graphic marker.
This is for two different reasons: on the one hand, there are no stringent limitations to the calculating resources and various algorithms can be used in parallel; on the other hand, the cloud algorithms are used when the "faster and lighter" algorithms locally used fail.
This improves the reliability and precision of the system.
Part of the augmented reality engine can be improved and further developed without requiring users to install updates or new software versions, given that the software updates can be made directly in the cloud.
These and other aspects of the invention will be more fully clarified upon reading of the following description of a preferred embodiment thereof, in which:
- figure 1 shows a block diagram of the functioning of the support system.
The system for real-time remote support with use of computer vision and augmented reality comprises an electronic device 1 of a user requesting support, an electronic device 2 of a support provider, a cloud computing data network 3 comprising at least one streaming server 4, and an augmented reality engine 5, 6, 7, 8, 9 having computer vision programs residing in the electronic device 1 of the user requesting support and in the cloud computing data network 3;
The electronic devices 1, 2 to which reference will be made in the following are in particular portable devices such as smartphones, though it is understood that any electronic device of another type suitable for the purpose can be used.
The augmented reality engine 5, 6, 7, 8, 9 therefore includes a part 7, 8, 9 that locally resides on the smartphone 1 of the user requesting support and a part 5, 6 residing in the cloud computing network 3.
The support method is articulated in the following steps.
The user requesting support, using his or her smartphone, contacts the support provider on his or her smartphone.
When the support provider responds to the request for support, a one-way video call is activated in which the user requesting support sends the video, captured by the camera of his or her smartphone, to the smartphone of the support provider.
The user requesting support frames on his/her smartphone the object on which he/she desires to receive support.
The support provider receives the video and uses a special interface to affix tags on the object in augmented reality.
The tags function as references which indicate the specific parts of the scene with respect to which support is being given.
The tags are kept in the correct position even when the user requesting support moves his/her smartphone.
This is made possible by the support of the above-mentioned computer vision programs, for example but not necessarily programs made available by the open source library OpenCV. The computer vision programs are activated as follows.
The support provider, when placing a tag on the video received from the smartphone of the user requesting support, sends the x and y coordinates of the dot to be "tagged" via the streaming server 4.
Therefore, these coordinates tag a dot on an image.
The computer vision algorithms can thus analyse a part of the scene, extracting characteristic features that will serve, as the images of the video flow, to recalculate the x and y coordinates of the dot to be tagged; in this way the algorithms can move the tag as the situation evolves, so as to maintain it on the object or part of the object to be tagged.
Therefore, the computer vision algorithms used serve to collect the characteristic features that can be used for denoting a part of significant information within the images which are flowing in the video streaming.
These characteristic features thus enable the algorithms to calculate and progressively recalculate the correct position in which the tags must be drawn and inserted with an augmented reality method.
Specifically, the algorithms used perform "proximity" operations applied to an image, to specific structures of the image itself, to dots or lines in an image, or even to complex structures such as objects in images.
The characteristic features can also relate to a sequence of images in movement, forms defined in terms of curves or borders between different regions of the image or specific properties of a region of the image.
Some algorithms are very fast and "light" in terms of calculation resources, such as for example algorithms based on geometric transformations.
These algorithms can therefore be run on the smartphone of the user requesting support. Other algorithms are instead slower and take up greater resources. These heavier algorithms are run in the cloud and are therefore more reliable.
It is worthy of note that the faster and lighter algorithms tend to have greater margins of error in specific conditions, such as for example in cases in which the scene completely changes and then returns to the original scene.
In other words, the support method includes using efficient and economical algorithms where resources are limited, i.e. on the smartphone of the user requesting support, and more reliable and powerful algorithms where the resources are scalable, i.e. in a cloud network.
In this way results are rapidly available and these results, when considered inadequate, are adjusted by more robust calculations.
At the same time, where the fast calculations fail, more complex calculations can be attempted, which can instead give results.
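By way of illustration, the following Python sketch shows one possible form of the "light and rapid" local calculations, using Shi-Tomasi feature extraction and Lucas-Kanade optical flow from the open source OpenCV library mentioned further on. The choice of these particular algorithms, the function names and the parameter values are assumptions made for illustration only; the invention does not mandate a specific tracker.

    import cv2
    import numpy as np

    def init_features(gray, tag_xy, radius=40, max_corners=50):
        """Extract characteristic features in a neighbourhood of the tagged dot."""
        mask = np.zeros_like(gray)
        cv2.circle(mask, (int(tag_xy[0]), int(tag_xy[1])), radius, 255, -1)
        # Shi-Tomasi corners, restricted to the region around the tag
        return cv2.goodFeaturesToTrack(gray, max_corners, 0.01, 5, mask=mask)

    def track_tag(prev_gray, gray, prev_pts, tag_xy):
        """Recalculate the x and y coordinates of the tagged dot on the next image."""
        next_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, prev_pts, None)
        alive = status.ravel() == 1
        if not alive.any():
            return tag_xy, prev_pts, 0.0                  # tracking lost on this image
        shift = np.median(next_pts[alive] - prev_pts[alive], axis=0).ravel()
        new_xy = (tag_xy[0] + float(shift[0]), tag_xy[1] + float(shift[1]))
        confidence = float(alive.sum()) / len(prev_pts)   # fraction of surviving features
        return new_xy, next_pts[alive].reshape(-1, 1, 2), confidence

Trackers of this kind are cheap because they only follow a small set of dots from one matrix to the next, which is precisely why they degrade when the scene changes abruptly.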
In the following, a more detailed description of the functioning of the support system is given.
The smartphone 1 of the user requesting support sends the video stream captured by its camera to the streaming server 4, which is based, for example, on the proprietary Wowza software.
The streaming server 4 makes the video flow available to the smartphone 2 of the support provider which displays it.
The smartphone 2 of the support provider is appropriately provided with a graphic interface and commands that enable the support provider to position a graphic marker on a dot to be tagged on the screen. Once the graphic marker is positioned, metadata representing the spatial coordinates x and y of the dot to be tagged is sent to the streaming server 4.
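The description does not specify the wire format of this metadata. The following sketch merely illustrates one plausible layout; the field names, the normalisation of the coordinates and the timestamp are assumptions.

    import json
    import time

    def make_tag_metadata(x, y, frame_w, frame_h):
        # Normalised coordinates let devices with different resolutions agree
        # on the same tagged dot; the timestamp helps match metadata to frames.
        return json.dumps({
            "x": x / frame_w,
            "y": y / frame_h,
            "ts": time.time(),
        })

    payload = make_tag_metadata(412, 233, 1280, 720)
    # 'payload' would travel to the streaming server 4 alongside the video flow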
The metadata (spatial coordinates x and y) is made available to and used by the programs 7, 8 of the augmented reality engine residing in the smartphone 1 of the user requesting support in order to perform "light and rapid" positioning and computer vision calculations. These calculations make it possible to reposition the dot to be tagged in the flow of video images.
This mechanism makes it possible to "tie" a graphic marker to elements, or parts of elements, present in the scene as captured by the smartphone 1 of the user requesting support.
In practice, a program 7 of the augmented reality engine residing in the smartphone 1 of the user requesting support deconstructs the video into images and transforms the images into dot matrices, and a program 8 of the augmented reality engine, again residing in the smartphone 1 of the user requesting support, performs the calculations on the dot matrices so as to recalculate the spatial coordinates x and y to be tagged on the images of the video.
Lastly, a program 9 of the augmented reality engine, again residing in the smartphone 1 of the user requesting support, uses the spatial coordinates x and y, recalculated in this way, to draw the tag in augmented reality on the images of the video.
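A minimal sketch of how the three local programs could cooperate is given below, reusing init_features and track_tag from the earlier sketch; the camera index, the initial coordinates and the display loop are illustrative assumptions, not part of the claimed method.

    import cv2

    # init_features and track_tag are the functions sketched earlier
    cap = cv2.VideoCapture(0)                             # camera of the requesting user
    ok, frame = cap.read()
    prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # program 7: image -> dot matrix
    tag_xy = (320.0, 240.0)                               # coordinates received as metadata
    pts = init_features(prev_gray, tag_xy)

    while ok and pts is not None:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)               # program 7
        tag_xy, pts, conf = track_tag(prev_gray, gray, pts, tag_xy)  # program 8
        cv2.drawMarker(frame, (int(tag_xy[0]), int(tag_xy[1])),      # program 9
                       (0, 0, 255), cv2.MARKER_CROSS, 24, 2)
        cv2.imshow("augmented video", frame)
        prev_gray = gray
        if cv2.waitKey(1) & 0xFF == 27:                   # ESC stops the loop
            break
    cap.release()
    cv2.destroyAllWindows()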
The calculations used locally by the smartphone 1 of the user requesting support for the positioning are calculations based on dot matrices and on transformations thereof in the sequence of the video images and require a relatively moderate extraction of features characteristic of the image.
This choice enables great rapidity and low consumption of calculating resources.
However, in unusual or complex situations it is susceptible to an increase of false positives or false negatives regarding recognition.
For this reason, the augmented reality engine also has support programs residing in the cloud computing network, which is not subject to stringent limitations on calculating resources and intervenes when the augmented reality engine programs residing locally in the smartphone 1 of the user requesting support give evidence of a recognition that is not sufficiently reliable.
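As an illustration of this on-demand intervention, the sketch below gates the hand-over on the confidence score returned by the local tracker sketched earlier; the threshold value and the request_cloud_recompute() helper are hypothetical, since the description does not state how unreliability is detected or how the cloud is invoked.

    CONFIDENCE_THRESHOLD = 0.5       # assumed cut-off, to be tuned empirically

    def update_tag(prev_gray, gray, pts, tag_xy):
        new_xy, new_pts, conf = track_tag(prev_gray, gray, pts, tag_xy)
        if conf < CONFIDENCE_THRESHOLD:
            # Local recognition is not sufficiently reliable: ask the support
            # programs 5 and 6 in the cloud for new coordinates.
            new_xy = request_cloud_recompute(gray, tag_xy)   # hypothetical remote call
            new_pts = init_features(gray, new_xy)            # re-seed the local tracker
        return new_xy, new_pts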
As they have available both the video images which arrive via the streaming server 4 from the smartphone 1 of the user requesting support and the metadata relative to the spatial coordinates x, y of the dot to be tagged which arrive via the streaming server 4 from the smartphone 2 of the support provider, the support programs residing in the cloud computing network 3 possess all the data necessary for them to function.
There are at least two support programs residing in the cloud computing network 3.
The first support program 5 residing in the cloud computing network has the task of deconstructing, from the video flow sent from the smartphone 1 of the user requesting support, the single images which will be transformed into matrices.
The second support program 6 residing in the cloud computing network, for example written in Java and based on the Computer Vision library known as OpenCV, has the task of analysing the matrices created starting from the video images and, using the metadata sent from the smartphone 2 of the support provider, performing the calculations necessary for providing new metadata, i.e. the spatial coordinates x and y of the dot to be tagged for each video image.
The first support program 5 becomes necessary as the computer vision algorithms analyse images (i.e. frames) transformed into matrices.
The matrices describe relations between images of a same scene.
Given the projection of a dot of the scene in one of the images, it becomes possible to search for the corresponding dot in the other image.
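One standard way of performing such a search, sketched below on the assumption that the framed scene is approximately planar, is to estimate the homography relating the two images and transfer the tagged dot through it; ORB features, brute-force matching and RANSAC are illustrative choices, not requirements of the invention.

    import cv2
    import numpy as np

    def remap_tag(ref_gray, cur_gray, tag_xy):
        """Search in cur_gray for the dot whose projection in ref_gray is tag_xy."""
        orb = cv2.ORB_create(1000)
        kp1, des1 = orb.detectAndCompute(ref_gray, None)
        kp2, des2 = orb.detectAndCompute(cur_gray, None)
        if des1 is None or des2 is None:
            return None
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:100]
        if len(matches) < 4:                     # a homography needs 4 correspondences
            return None
        src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
        H, _inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        if H is None:
            return None                          # relation between images not found
        new_pt = cv2.perspectiveTransform(np.float32([[tag_xy]]), H)
        return float(new_pt[0, 0, 0]), float(new_pt[0, 0, 1])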
The normal flow of the streaming through the streaming server 4 would not in itself make it possible to perform calculation operations on dot matrices.
In fact, the streaming server 4, in itself, serves only for receiving a video flow from a source and making it available to an audience.
The first support program 5, relying on a function made available by the streaming server 4 to access the video images, transforms the images into dot matrices which at this point can be processed by the second support program 6.
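The access function offered by the streaming server is not detailed in the description; the sketch below assumes it simply exposes the flow at a URL that OpenCV's FFmpeg backend can read, which is one common arrangement.

    import cv2

    def frames_as_matrices(stream_url):
        """Pull single images out of the video flow and yield them as dot matrices."""
        cap = cv2.VideoCapture(stream_url)               # e.g. an RTMP or HLS endpoint
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            yield cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # matrix for program 6
        cap.release()

    # for matrix in frames_as_matrices("rtmp://streaming-server/live/support"):
    #     ...hand each matrix to the second support program 6 (URL is hypothetical)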
Therefore, the second support program 6, by using various computer vision algorithms based on the image recognition concept, can remap the x and y coordinates positioned by the smartphone 2 of the support provider onto each subsequent image of the video flow.
The second support program 6 makes new metadata available when the calculations performed locally are considered to be not sufficiently reliable.
In this way both the smartphones 1, 2 receive, from the streaming server 4, new x and y coordinates which can then be used for drawing the markers on the video images.
The method and system for real-time remote support with use of computer vision and augmented reality as they are conceived are susceptible to numerous modifications and variants, all falling within the inventive concept described and claimed.

Claims

1. A method for real-time remote support with use of computer vision and augmented reality, characterised in that it comprises the following steps:
- providing a cloud computing data network;
- providing an augmented reality engine having computer vision programs residing in an electronic device of a user requesting support and in said cloud computing data network; and wherein the following steps are carried out in real time:
- acquiring, with a camera of the electronic device of the user requesting support, a video of a work environment;
- transmitting and displaying said video on an electronic device of a support provider;
- affixing, by said support provider, a graphic marker in augmented reality on a dot that is to be tagged in said video;
- running said computer vision programs so as to display said video containing said graphic marker on the electronic device of said user requesting support, the spatial coordinates of the graphic marker being recalculated so that, in the sequence of images of the video displayed on said electronic device of said user requesting support, the graphic marker permanently points on the tagged dot.
2. The method for real-time remote support with use of computer vision and augmented reality according to the preceding claim, characterised in that said programs residing in said cloud computing data network operate in an on-demand mode when said programs residing in said electronic device of said user requesting support do not provide a reliable outcome.
3. The method for real-time remote support with use of computer vision and augmented reality according to claim 1, characterised in that said cloud computing network comprises a streaming server.
4. The method for real-time remote support with use of computer vision and augmented reality according to any preceding claim, characterised in that said programs comprise a program for extracting images from a video and transforming the images into dot matrices.
5. The method for real-time remote support with use of computer vision and augmented reality according to any preceding claim, characterised in that said programs comprise an augmented reality filter.
6. The method for real-time remote support with use of computer vision and augmented reality according to the preceding claim, characterised in that said augmented reality filter operates on said dot matrices so as to recalculate the spatial coordinates of the graphic marker.
7. The method for real-time remote support with use of computer vision and augmented reality according to any preceding claim, characterised in that said video acquired by said electronic device of said user requesting support is transmitted to said streaming server from which it is in turn transmitted to said electronic device of said support provider which affixes said graphic marker and returns the spatial coordinates of the tagged dot to said streaming server.
8. The method for real-time remote support with use of computer vision and augmented reality according to the preceding claim, characterised in that said spatial coordinates are made available to said augmented reality filter in order to be processed.
9. The method for real-time remote support with use of computer vision and augmented reality according to the preceding claim, characterised in that said processed spatial coordinates are made available to said electronic device requesting support and to said electronic device providing support.
10. The system for real-time remote support with use of computer vision and augmented reality, characterised in that it comprises the following components:
- an electronic device of a user requesting support provided with a camera and display screen and an electronic device of a support provider provided with a display screen;
- a cloud computing data network comprising at least one streaming server;
- an augmented reality engine having computer vision programs and residing in said electronic device of said user requesting support and in said cloud computing data network; said augmented reality engine being configured so as to display on said electronic device of the user requesting support a video acquired by said user requesting support to which the support provider has affixed in augmented reality a graphic marker on a dot, the spatial coordinates of which are recalculated so that in the sequence of images of the video the graphic marker points permanently on the tagged dot.
PCT/EP2017/059290 2016-04-20 2017-04-19 A method and a system for real-time remote support with use of computer vision and augmented reality WO2017182523A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
IT102016000040879 2016-04-20
ITUA2016A002756A ITUA20162756A1 (en) 2016-04-20 2016-04-20 METHOD AND SYSTEM FOR REAL-TIME REMOTE SUPPORT WITH THE USE OF COMPUTER VISION AND AUGMENTED REALITY
CH00526/16 2016-04-20
CH00526/16A CH712380A2 (en) 2016-04-20 2016-04-20 Real-time remote assistance method and system using computer vision and augmented reality.

Publications (1)

Publication Number Publication Date
WO2017182523A1 2017-10-26

Family

ID=58638843

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2017/059290 WO2017182523A1 (en) 2016-04-20 2017-04-19 A method and a system for real-time remote support with use of computer vision and augmented reality

Country Status (1)

Country Link
WO (1) WO2017182523A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007066166A1 (en) * 2005-12-08 2007-06-14 Abb Research Ltd Method and system for processing and displaying maintenance or control instructions
WO2009036782A1 (en) * 2007-09-18 2009-03-26 Vrmedia S.R.L. Information processing apparatus and method for remote technical assistance
US20130120449A1 (en) * 2010-04-28 2013-05-16 Noboru IHARA Information processing system, information processing method and program
WO2015101393A1 (en) * 2013-12-30 2015-07-09 Telecom Italia S.P.A. Augmented reality for supporting intervention of a network apparatus by a human operator

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107872655A (en) * 2017-11-21 2018-04-03 北京恒华伟业科技股份有限公司 A kind of method and system for determining hidden danger point
CN113711260A (en) * 2019-04-12 2021-11-26 脸谱公司 Automated visual suggestion, generation and evaluation using computer vision detection
WO2022145655A1 (en) * 2020-12-29 2022-07-07 주식회사 딥파인 Augmented reality system
DE102022130357A1 (en) 2022-11-16 2024-05-16 Rheinisch-Westfälische Technische Hochschule Aachen, abgekürzt RWTH Aachen, Körperschaft des öffentlichen Rechts XR-based wireless control and management
WO2024104739A1 (en) 2022-11-16 2024-05-23 Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen, Körperschaft des öffentlichen Rechts Xr-based wireless control and management

Similar Documents

Publication Publication Date Title
US11710279B2 (en) Contextual local image recognition dataset
CN109584276B (en) Key point detection method, device, equipment and readable medium
US8917908B2 (en) Distributed object tracking for augmented reality application
WO2017182523A1 (en) A method and a system for real-time remote support with use of computer vision and augmented reality
CN109743626B (en) Image display method, image processing method and related equipment
JP2021508123A (en) Remote sensing Image recognition methods, devices, storage media and electronic devices
US11961271B2 (en) Multi-angle object recognition
US11700417B2 (en) Method and apparatus for processing video
EP2972950B1 (en) Segmentation of content delivery
WO2013088637A2 (en) Information processing device, information processing method and program
WO2016089502A1 (en) Automatic processing of images using adjustment parameters determined based on semantic data and a reference image
US9600720B1 (en) Using available data to assist in object recognition
US20210352343A1 (en) Information insertion method, apparatus, and device, and computer storage medium
CN110598139A (en) Web browser augmented reality real-time positioning method based on 5G cloud computing
CN114390368B (en) Live video data processing method and device, equipment and readable medium
KR20210121515A (en) Method, system, and computer program for extracting and providing text color and background color in image
CN111489284B (en) Image processing method and device for image processing
CN109871465B (en) Time axis calculation method and device, electronic equipment and storage medium
CN112585957A (en) Station monitoring system and station monitoring method
CN110942056A (en) Clothing key point positioning method and device, electronic equipment and medium
US11683453B2 (en) Overlaying metadata on video streams on demand for intelligent video analysis
KR20040006612A (en) Video geographic information system
US11436826B2 (en) Augmented reality experience for shopping
KR101525409B1 (en) Augmented method of contents using image-cognition modules
US10281294B2 (en) Navigation system and navigation method

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17719829

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17719829

Country of ref document: EP

Kind code of ref document: A1