WO2013174231A1 - Implementation method and system for augmented reality interaction - Google Patents
Implementation method and system for augmented reality interaction
- Publication number
- WO2013174231A1 (PCT/CN2013/075784)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- template image
- feature points
- template
- augmented reality
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/20—Scenes; Scene-specific elements in augmented reality scenes
Definitions
- The present invention relates to simulation technologies, and in particular, to a method and system for implementing augmented reality interaction.
- Augmented reality is a simulation technology that applies virtual information to the real world: images of the real environment and a virtual environment are superimposed into the same picture in real time.
- Applications using augmented reality technology can immerse users in a virtual environment; traditionally, the augmented reality interaction process is implemented by various clients running on terminal devices.
- For example, a particular poster serves as a marker: the camera captures an image of the marker, a recognition result is obtained, and a movie trailer related to the content of that poster is retrieved according to the recognition result and played.
- However, the recognition logic involved in the augmented reality interaction process is very complicated and involves many kinds of files, which makes the client too large; the interactive augmented reality applications implemented by such clients are therefore constrained.
- Each type of interactive application can only correspond to a single marker and is implemented by a dedicated client; that is, each client can only handle one type of marker, and a corresponding client has to be developed for each application.
- A single client thus cannot implement the augmented reality interaction process for multiple markers, which in turn forces users to repeatedly download and install multiple clients, and this lacks flexibility.
- Traditional augmented reality interaction can also be realized by a host connected to a large indoor or outdoor screen.
- For example, an outdoor large screen plays a video image of a viewer learning to dance with, or being photographed with, a virtual star character; or an indoor large-screen advertisement is launched in a museum, where viewers in a specific area can see video images of dinosaurs or astronauts passing by on the indoor screen.
- The host connected to the large indoor or outdoor screen has stronger background computing capability than a client running on a terminal device and can handle the complex logic of the augmented reality interaction process; however, because of its usage constraints, augmented reality interaction implemented through a large screen and its host is likewise limited to a single marker and lacks flexibility.
- A method for implementing augmented reality interaction includes the following steps: a frame image is collected and uploaded; recognition is performed to obtain a template image that matches the frame image, and the template image is returned; a marked area of the frame image is detected according to the template image; and the media data corresponding to the template image is superimposed on the marked area, and the superimposed image is displayed.
- An implementation system for augmented reality interaction comprising a client and a server; the client includes an acquisition module, a detection module, and a presentation processing module;
- the acquisition module is configured to collect a frame image and upload the frame image;
- the server is configured to identify a template image that matches the frame image, and return the template image;
- the detecting module is configured to detect a marked area of the frame image according to the template image
- the presentation processing module is configured to superimpose the media data corresponding to the template image onto the marked area and display the superimposed image.
- In the above method and system for implementing augmented reality interaction, a frame image is collected and uploaded, a template image matching the uploaded frame image is identified and returned, the marked area is detected according to the returned template image, and the media data is then superimposed on the marked area and the superimposed image is displayed.
- Because the frame image is uploaded to a remote server, which performs the recognition and matching against template images, the relatively complicated recognition and matching process does not need to be completed locally; this greatly enhances the recognition capability of the augmented reality interaction, allowing a matching template image to be identified for any marker and thereby greatly improving flexibility.
- FIG. 1 is a flow chart of an implementation method of an augmented reality interaction in an embodiment
- FIG. 2 is a flow chart of a method for identifying a template image that matches a frame image and returning a template image in FIG. 1;
- FIG. 3 is a flow chart of a method for detecting a marked area of a frame image according to a template image in FIG. 1;
- FIG. 4 is a flow chart of a method for implementing an augmented reality interaction in another embodiment
- FIG. 5 is a flow chart of a method for implementing an augmented reality interaction in another embodiment
- FIG. 6 is a flow chart of a method for implementing an augmented reality interaction in another embodiment
- FIG. 7 is a schematic structural diagram of an implementation system of an augmented reality interaction in an embodiment
- FIG. 8 is a schematic structural diagram of a server in FIG. 7;
- FIG. 9 is a schematic structural view of the detecting module of FIG. 7;
- FIG. 10 is a schematic structural diagram of a client in an embodiment
- FIG. 11 is a schematic structural diagram of a server in another embodiment
- FIG. 12 is a schematic structural diagram of an implementation system of an augmented reality interaction in another embodiment.
- an implementation method of an augmented reality interaction includes the following steps:
- Step S110: a frame image is collected and uploaded.
- In this embodiment, image acquisition is performed to obtain a frame image; the frame image may be in two-dimensional or three-dimensional form and is an image in the image sequence corresponding to the video stream obtained during acquisition.
- Image acquisition is performed continuously to obtain a video stream, and the video stream is formed by a sequence of images; that is, the image sequence includes a plurality of frame images, and the frame image collected and uploaded to the server is the currently acquired image in the image sequence.
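- As an illustration of step S110, a minimal client-side sketch in Python (using the OpenCV and requests libraries) might look like the following; the endpoint URL and upload field name are assumptions, since the patent does not prescribe a transport protocol.

```python
# Illustrative sketch of step S110: grab the current frame from the camera
# and upload it. The URL and the "frame" field name are assumptions.
import cv2
import requests

def capture_and_upload(url="https://ar-server.example.com/recognize"):
    cap = cv2.VideoCapture(0)              # default camera of the terminal device
    ok, frame = cap.read()                 # currently acquired image in the sequence
    cap.release()
    if not ok:
        raise RuntimeError("camera capture failed")
    ok, buf = cv2.imencode(".jpg", frame)  # encode the frame for upload
    return requests.post(url, files={"frame": ("frame.jpg", buf.tobytes(), "image/jpeg")})
```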
- Step S130: recognition is performed to obtain a template image that matches the frame image, and the template image is returned.
- In this embodiment, a template image matching the uploaded frame image is identified among the pre-stored template images; a recognition algorithm, for example a SIFT-based pattern recognition algorithm, may be used to identify the template image from the frame image.
- For example, if the frame image is a poster image of movie XX and the pre-stored template images include hundreds of movie poster images, recognition against the stored template images yields the poster image of movie XX; the poster image obtained by this recognition is the template image that matches the frame image.
- After identifying the template image that matches the frame image, the server returns the identified template image to the client that uploaded the frame image.
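- As a concrete illustration of the recognition in step S130, the sketch below uses OpenCV's SIFT implementation with Lowe's ratio test; the patent names SIFT-based pattern recognition but does not mandate this exact procedure, and the 0.75 ratio and 30-match acceptance threshold are assumptions.

```python
# Illustrative sketch of the server-side recognition: score each stored
# template by the number of ratio-test-passing SIFT matches against the frame.
import cv2

def find_matching_template(frame, templates, min_good=30):
    sift = cv2.SIFT_create()
    kp_f, des_f = sift.detectAndCompute(frame, None)
    if des_f is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    best, best_count = None, 0
    for template in templates:
        kp_t, des_t = sift.detectAndCompute(template, None)
        if des_t is None:
            continue
        pairs = matcher.knnMatch(des_f, des_t, k=2)
        good = [p[0] for p in pairs
                if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
        if len(good) > best_count:
            best, best_count = template, len(good)
    return best if best_count >= min_good else None
```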
- Step S150: a marked area of the frame image is detected according to the template image.
- In this embodiment, a marker is captured during image acquisition to obtain a frame image containing the marker; the region formed by the marker in the frame image is the marked area.
- The template image, which also contains an image of the marker, is used to detect the marked area in the frame image.
- The marked area of the frame image can be obtained by comparing the template image with the frame image; alternatively, the points forming the marked area in the template image can be recorded in advance, and these recorded points allow the marked area in the frame image to be obtained more quickly.
- Step S170: the media data corresponding to the template image is superimposed on the marked area, and the superimposed image is displayed.
- In this embodiment, the media data corresponds to the template image and may be a video stream or a three-dimensional video model.
- For example, if the template image is a movie poster, the media data is a playable file of the movie.
- The media data is superimposed on the marked area: playback of the media data constitutes the virtual environment, while the series of frame images outside the marked area constitutes the real environment, realizing the augmented reality effect.
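- A minimal sketch of the superposition in step S170, assuming the marked area has been located as a quadrilateral of four corner points: one media frame is warped onto the marked area with a homography and composited over the camera frame. This is one common way to realize the described effect, not necessarily the patented one.

```python
# Illustrative sketch of step S170: the media forms the virtual environment
# inside the marked area; the rest of the frame stays the real environment.
import cv2
import numpy as np

def overlay_media(frame, media_frame, marked_quad):
    """marked_quad: 4x2 array of the marked area's corners in the frame."""
    h, w = media_frame.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    H, _ = cv2.findHomography(src, np.float32(marked_quad))
    warped = cv2.warpPerspective(media_frame, H, (frame.shape[1], frame.shape[0]))
    mask = np.zeros(frame.shape[:2], dtype=np.uint8)
    cv2.fillConvexPoly(mask, np.int32(marked_quad).reshape(-1, 2), 255)
    out = frame.copy()
    out[mask == 255] = warped[mask == 255]  # media inside, real scene outside
    return out
```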
- step S130 includes:
- Step S131: attribute information of the uploaded frame image is acquired.
- In this embodiment, the attribute information of the uploaded frame image records description information related to the frame image.
- The attribute information includes user information and device information: the user information is the registered personal identity information, such as gender, age, educational background, and hobbies; the device information is information returned by the hardware device used when the user uploads the frame image.
- For example, if the user uploads the frame image to the server using a certain mobile terminal, the device information includes GPS geographic information, the device manufacturer, and the network environment.
- Step S133: a matching range is defined within the stored template images according to the attribute information.
- In this embodiment, the stored template images are restricted to a range based on the attribute information.
- For example, suppose the attribute information records that the user uploading the frame image is female and the GPS geographic information is Beijing.
- The matching range is then limited to template images related to women and to Beijing, for example cosmetics advertisement images and Beijing concert images. Defining the matching range helps quickly obtain a template image that matches the frame image and improves matching accuracy.
- Step S135: template images within the matching range are searched, and it is determined whether the frame image matches a found template image; if so, the process proceeds to step S137, and if not, it returns to step S110.
- In this embodiment, the template images in the matching range are searched one by one to obtain the template image matching the frame image, and the found template image is returned to the user who uploaded the frame image.
- Step S137: the found template image is returned.
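- A sketch of how steps S131 to S137 could narrow the search before image matching, assuming the stored templates carry metadata tags; the field names (gender, city, tags, image) are illustrative only, and find_matching_template is the recognition sketch shown earlier.

```python
# Illustrative sketch of steps S131-S137: restrict the stored templates by
# attribute information before running the expensive image matching.
def define_matching_range(templates, attrs):
    wanted = {v for v in (attrs.get("gender"), attrs.get("city")) if v}
    if not wanted:
        return templates  # no attribute narrows the range: search everything
    return [t for t in templates if wanted & set(t.get("tags", ()))]

def recognize(frame, templates, attrs):
    candidates = define_matching_range(templates, attrs)           # step S133
    # step S135: search only within the matching range
    return find_matching_template(frame, [t["image"] for t in candidates])
```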
- step S150 includes:
- Step S151: feature points in the frame image are obtained according to the training data corresponding to the template image.
- In this embodiment, the training data records the feature points of the marked area in the template image, so the marked area in the template image can be identified by a series of feature points. Since the template image matches the frame image, the feature points that identify the marked area in the frame image are obtained from the feature points recorded in the training data; that is, the feature points recorded in the training data and the feature points in the frame image form matching feature point pairs.
- Step S153: the contour position of the marked area in the frame image is acquired through the feature points.
- In this embodiment, the contour position of the marked area in the frame image is obtained from the series of feature points in the frame image, and the contour of the marked area and its coordinates in the frame image are in turn obtained from the contour position.
- In this embodiment, the process of obtaining the marked area is performed on the client side, but it is not limited to this; it can also be performed on the server side.
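- A sketch of steps S151 and S153, assuming the training data stores the template's keypoint coordinates and descriptors: matched feature point pairs yield a homography, and projecting the template outline through it gives the contour position of the marked area. The descriptor matching and RANSAC parameters are assumptions.

```python
# Illustrative sketch of steps S151-S153: training-data keypoints matched
# against frame keypoints give point pairs; a homography then projects the
# template outline into the frame as the marked area's contour.
import cv2
import numpy as np

def locate_marked_area(train_pts, train_des, frame, template_size):
    sift = cv2.SIFT_create()
    kp_f, des_f = sift.detectAndCompute(frame, None)
    if des_f is None:
        return None
    pairs = cv2.BFMatcher(cv2.NORM_L2).knnMatch(train_des, des_f, k=2)
    good = [p[0] for p in pairs
            if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
    if len(good) < 4:
        return None  # a homography needs at least four matched point pairs
    src = np.float32([train_pts[m.queryIdx] for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_f[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if H is None:
        return None
    w, h = template_size
    outline = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(outline, H)  # contour position in the frame
```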
- the method further includes:
- Step S210: it is determined whether the training data and media data corresponding to the template image exist in a local file; if not, the process proceeds to step S230, and if so, it proceeds to step S250.
- In this embodiment, the local file is a file stored locally on the client. After the marked area of the frame image is obtained, it is determined whether the training data and media data corresponding to the template image exist locally on the client; if not, the training data and media data need to be downloaded from the server, and if they exist locally, they are loaded directly.
- Step S230: the training data and media data are downloaded.
- In this embodiment, detection of the marked area and superposition and playback of the media data may be performed after the download is completed; alternatively, when the training data and media data are transmitted as streaming data, subsequent processing may be performed while the transmission is still in progress.
- Step S250: the training data and media data are loaded.
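- The cache logic of steps S210 to S250 might be sketched as follows; the file names, directory layout, and URL scheme are assumptions.

```python
# Illustrative sketch of steps S210-S250: check the client's local files for
# the template's training data and media data, downloading them when absent.
import os
import urllib.request

def ensure_local(template_id, base_url, cache_dir="ar_cache"):
    paths = {}
    for name in ("training.dat", "media.mp4"):
        path = os.path.join(cache_dir, template_id, name)
        if not os.path.exists(path):            # step S230: download when missing
            os.makedirs(os.path.dirname(path), exist_ok=True)
            urllib.request.urlretrieve(f"{base_url}/{template_id}/{name}", path)
        paths[name] = path                       # step S250: load from local file
    return paths
```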
- the method further includes:
- Step S310: a stored template image is detected to obtain feature points, and it is determined whether the number of feature points is less than a threshold; if not, the process proceeds to step S330, and if so, the process ends.
- In this embodiment, the feature points in the frame image are obtained from the feature points corresponding to the template image.
- The template image is an image collected and stored by the server, or an image uploaded and saved by a user. For an image newly stored as a template image, no corresponding training data or media data exists in the data stored by the server; at this point, the template image needs to be trained to obtain training data, and the correspondence between the template image and media data needs to be established.
- Training of the template image can be performed on the server side or on the client side; however, it is preferably performed on the server side so as to realize a lightweight client.
- The image stored as a template image is detected by a feature point detection algorithm to obtain the feature points in the image.
- The feature point detection algorithm may be the FAST feature point detection algorithm, the SURF feature point detection algorithm, or another feature point detection algorithm, which are not enumerated here.
- In one embodiment, the selected threshold is 100.
- Step S330: sample images corresponding to the template image are acquired, and feature points in the sample images are detected.
- In this embodiment, a plurality of sample images corresponding to the template image are acquired for feature point detection, thereby ensuring the robustness of the feature points.
- Feature point detection is performed on each sample image.
- Step S350: the feature points in the template image and the sample images are processed to generate training data recording the feature points.
- In this embodiment, the feature points in the template image and the sample images are merged to form training data in which the feature points are recorded. Specifically, identical feature points in the template image and the sample images are merged into one feature point, and the positions of the feature points are then recorded to obtain the training data.
- The feature points are also trimmed to ensure their accuracy.
- Some feature points recur at a very low frequency across the multiple sample images; such poorly reproducible feature points are more likely to be detection errors and would interfere with the detection of the marked area in subsequent frame images. They should therefore be trimmed, that is, eliminated.
- The specific process of merging and trimming the feature points in the template image and the sample images is as follows: random noise and blurring are added to the template image and the sample images, and feature point detection is performed again on the noised and blurred images to obtain corresponding feature points; it is then determined whether each feature point of the template image and the sample images exists among the feature points of the noised and blurred images; if not, the feature point is trimmed, and if so, it is merged.
- The number of recurrences of each feature point is also determined: if the number of recurrences is greater than a recurrence threshold, the feature point is recorded, and if not, the feature point is eliminated, which more effectively guarantees the accuracy of the feature points.
- In this embodiment, the above training data generation process is performed on the server side, but it is not limited thereto and can also be performed in the client.
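- A simplified sketch of the training flow of steps S310 to S350, using FAST detection on noised and blurred variants of the template; the variant count, noise level, and recurrence threshold are illustrative, while the 100-point minimum comes from the embodiment above.

```python
# Illustrative sketch of steps S310-S350: detect FAST keypoints on a template
# (single-channel uint8 image), re-detect them on noised/blurred variants, and
# record only points that recur often enough; the rest are trimmed.
import cv2
import numpy as np

def train_template(template, num_variants=20, recur_threshold=10, min_points=100):
    fast = cv2.FastFeatureDetector_create()
    base = [kp.pt for kp in fast.detect(template, None)]
    if len(base) < min_points:
        return None  # step S310: too few feature points, template rejected
    counts = {pt: 0 for pt in base}
    rng = np.random.default_rng(0)
    for _ in range(num_variants):
        noisy = template.astype(np.float32) + rng.normal(0.0, 8.0, template.shape)
        variant = cv2.GaussianBlur(np.clip(noisy, 0, 255).astype(np.uint8), (5, 5), 0)
        found = {(round(kp.pt[0]), round(kp.pt[1])) for kp in fast.detect(variant, None)}
        for pt in base:
            if (round(pt[0]), round(pt[1])) in found:
                counts[pt] += 1  # the point recurred on this variant
    # step S350: keep (merge) recurring points, trim poorly reproducible ones
    return [pt for pt, c in counts.items() if c > recur_threshold]
```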
- the method further includes:
- Step S410: a template image and corresponding media data are selected.
- In this embodiment, the user may also select the template image and the corresponding media data, so as to implement personalized augmented reality interaction.
- The template image may be an image captured by the user, or an image obtained in another way; the media data may be a video stream or three-dimensional video model captured by the user, or a video stream or three-dimensional video model that the user obtained from the Internet and then edited.
- For example, the user can change the background music in a downloaded video stream and replace it with his or her own voice.
- Step S430: it is determined, according to an upload operation of the login user, whether the selected template image and corresponding media data are to be shared; if so, the process goes to step S450, and if not, it goes to step S470.
- In this embodiment, the user information needs to be verified for the user to enter the login state.
- The upload operation of the login user is obtained; the upload operation includes an upload instruction and/or a sharing instruction triggered by the user, so users can choose whether to share according to their needs.
- Step S450: the selected template image and corresponding media data are uploaded and stored in a public storage space.
- In this embodiment, if the selected template image and corresponding media data are to be shared, they are uploaded and stored in the public storage space, where other users can also use the template image and media data uploaded by the login user.
- Step S470: the selected template image and corresponding media data are uploaded and stored in the storage space corresponding to the login user.
- In this embodiment, if they are not to be shared, the uploaded template image and corresponding media data are stored in the storage space corresponding to the login user.
- In addition, the storage space corresponding to the login user has a higher priority than the public storage space.
- The priorities of the login user's storage space and of the public storage space determine the priority of the template images stored in them. In other words, during identification of a template image matching the frame image, if two matching template images are identified, stored respectively in the login user's storage space and in the public storage space, then, because the login user's storage space has the higher priority, the template image stored in the login user's storage space is preferentially adopted and returned to the login user.
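- The priority rule can be expressed compactly; the sketch below assumes matching template images have already been gathered per storage space.

```python
# Illustrative sketch of the priority rule: a match from the login user's own
# storage space wins over a match from the public storage space.
def pick_match(user_space_matches, public_space_matches):
    if user_space_matches:           # higher-priority storage space
        return user_space_matches[0]
    if public_space_matches:
        return public_space_matches[0]
    return None
```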
- A system for implementing augmented reality interaction includes a client 10 and a server 30, where the client 10 includes an acquisition module 110, a detection module 130, and a presentation processing module 150.
- In this embodiment, the client is installed in a terminal device and, according to the type of terminal device, is divided into a computer client, a mobile client, and a web client: the computer client is installed in a computer, the mobile client is installed in a mobile terminal, and the web client is implemented based on a browser.
- The acquisition module 110 is configured to collect a frame image and upload the frame image.
- In this embodiment, the acquisition module 110 performs image acquisition to obtain a frame image; the frame image may be in two-dimensional or three-dimensional form and is an image in the image sequence corresponding to the video stream obtained by the acquisition module 110.
- The acquisition module 110 continuously performs image acquisition to obtain a video stream, and the video stream is formed by a sequence of images; that is, the image sequence includes a plurality of frame images, and the frame image collected and uploaded to the server is the currently acquired image in the image sequence.
- The acquisition module 110 can be a camera in the terminal device.
- The server 30 is configured to perform recognition to obtain a template image that matches the frame image, and to return the template image.
- In this embodiment, a plurality of different template images are pre-stored in the server 30, and a template image matching the uploaded frame image is identified among the pre-stored template images; a recognition algorithm, for example a SIFT-based pattern recognition algorithm, may be used to identify the template image from the frame image.
- For example, if the frame image is a poster image of movie XX and the template images pre-stored by the server 30 include hundreds of movie poster images, recognition against the stored template images yields the poster image of movie XX; the poster image obtained by this recognition is the template image that matches the frame image.
- After identifying the template image that matches the frame image, the server 30 returns the identified template image to the client 10 that uploaded the frame image.
- The detection module 130 is configured to detect a marked area of the frame image according to the template image.
- In this embodiment, the acquisition module 110 captures a marker to obtain a frame image containing the marker, and the region formed by the marker in the frame image is the marked area.
- The template image, which also contains an image of the marker, is used to detect the marked area in the frame image.
- The marked area of the frame image can be obtained by comparing the template image with the frame image; alternatively, the points forming the marked area in the template image can be recorded in advance, and these recorded points allow the marked area in the frame image to be obtained more quickly.
- The presentation processing module 150 is configured to superimpose the media data corresponding to the template image onto the marked area, and to display the superimposed image.
- In this embodiment, the media data corresponds to the template image and may be a video stream or a three-dimensional video model.
- For example, if the template image is a movie poster, the media data is a playable file of the movie.
- The media data is superimposed on the marked area: playback of the media data constitutes the virtual environment, while the series of frame images outside the marked area constitutes the real environment, realizing the augmented reality effect.
- In one embodiment, the server 30 includes an attribute acquisition module 310, a range defining module 330, and a searching module 350.
- The attribute acquisition module 310 is configured to acquire attribute information of the uploaded frame image.
- In this embodiment, the attribute information of the uploaded frame image records description information related to the frame image.
- The attribute information includes user information and device information: the user information is the registered personal identity information, such as gender, age, educational background, and hobbies; the device information is information returned by the hardware device used when the user uploads the frame image.
- For example, if the user uploads the frame image to the server using a certain mobile terminal, the device information includes GPS geographic information, the device manufacturer, and the network environment.
- The range defining module 330 is configured to define a matching range within the stored template images according to the attribute information.
- In this embodiment, the range defining module 330 restricts the stored template images to a range based on the attribute information.
- For example, suppose the attribute information records that the user uploading the frame image is female and the GPS geographic information is Beijing.
- The matching range is then limited to template images related to women and to Beijing.
- For example, the template images in the matching range are cosmetics advertisement images and Beijing concert images. Defining the matching range helps quickly obtain a template image that matches the frame image and improves matching accuracy.
- The searching module 350 is configured to search for template images within the matching range, determine whether the frame image matches a found template image, and if so, return the template image to the client 10; if not, it notifies the acquisition module 110.
- In this embodiment, the searching module 350 searches the template images in the matching range one by one to obtain the template image matching the frame image, and returns the found template image to the user who uploaded the frame image.
- The detection module 130 described above includes a feature detection unit 131 and a contour acquisition unit 133.
- The feature detection unit 131 is configured to obtain feature points in the frame image according to the training data corresponding to the template image.
- In this embodiment, the training data records the feature points of the marked area in the template image, so the marked area in the template image can be identified by a series of feature points. Since the template image matches the frame image, the feature detection unit 131 obtains the feature points that identify the marked area in the frame image from the feature points recorded in the training data; that is, the feature points recorded in the training data and the feature points in the frame image form matching feature point pairs.
- The contour acquisition unit 133 is configured to acquire the contour position of the marked area in the frame image through the feature points.
- In this embodiment, the contour acquisition unit 133 obtains the contour position of the marked area in the frame image from the series of feature points in the frame image, and further obtains the contour of the marked area and its coordinates in the frame image from the contour position.
- Besides being provided in the client 10, the detection module 130 described above may also be provided in the server 30.
- the client 10 further includes a data acquisition module 170.
- the data obtaining module 170 is configured to determine whether the training data and the media data corresponding to the template image exist in the local file, and if not, download the training data and the media data, and if so, load the training data and the media data.
- the local file is a file stored locally on the client.
- In this embodiment, the data obtaining module 170 determines whether the training data and media data corresponding to the template image exist locally on the client; if not, the training data and media data need to be downloaded, and if they exist locally, they are loaded directly.
- Detection of the marked area and superposition and playback of the media data may be performed after the download is completed; alternatively, when the training data and media data are transmitted as streaming data, subsequent processing may be performed while the transmission is still in progress.
- In one embodiment, the server 30 further includes a feature processing module 370 and a training data generating module 390.
- The feature processing module 370 is configured to detect a stored template image to obtain feature points and determine whether the number of feature points is less than a threshold; if not, it acquires sample images corresponding to the template image and detects the feature points of the sample images, and if so, the process ends.
- In this embodiment, the feature processing module 370 obtains the feature points in the frame image from the feature points corresponding to the template image.
- The template image is an image collected and stored by the server, or an image uploaded and saved by a user. For an image newly stored as a template image, no corresponding training data or media data exists in the data stored by the server; at this point, the template image needs to be trained to obtain training data, and the correspondence between the template image and media data needs to be established.
- Training of the template image can be performed at the server or at the client; however, it is preferably performed at the server so as to realize a lightweight client.
- The feature processing module 370 detects the image stored as a template image by a feature point detection algorithm to obtain the feature points in the image.
- The feature point detection algorithm may be the FAST feature point detection algorithm, the SURF feature point detection algorithm, or another feature point detection algorithm, which are not enumerated here.
- The feature processing module 370 also needs to determine whether the number of feature points is sufficient for detecting the marked area of the frame image, so as to ensure the validity of the template image.
- In one embodiment, the selected threshold is 100.
- The training data generating module 390 is configured to process the feature points in the template image and the sample images to generate training data recording the feature points.
- In this embodiment, the training data generating module 390 acquires a plurality of sample images corresponding to the template image for feature point detection, thereby ensuring the robustness of the feature points.
- The sample images are images of the template image at various rotation angles and/or scales.
- Feature point detection is performed on each of these rotated and scaled sample images.
- The training data generating module 390 also trims the feature points to ensure their accuracy. Some feature points recur at a very low frequency across the multiple sample images; such poorly reproducible feature points are more likely to be detection errors and would interfere with the detection of the marked area in subsequent frame images. The training data generating module 390 therefore trims them, that is, eliminates these feature points.
- Specifically, the training data generating module 390 adds random noise and blurring to the template image and the sample images, and then performs feature point detection again on the noised and blurred images to obtain corresponding feature points; it determines whether each feature point of the template image and the sample images exists among the feature points of the noised and blurred images; if not, the feature point is trimmed, and if so, it is merged.
- If the training data generating module 390 determines that a feature point of the template image and the sample images still exists among the feature points of the noised and blurred image, the feature point is reproducible; if a certain feature point is not present among the feature points of the noised and blurred image, it is considered poorly reproducible.
- The training data generating module 390 also determines the number of recurrences of each feature point: if the number of recurrences is greater than a recurrence threshold, the feature point is recorded, and if not, the feature point is eliminated, so as to more effectively guarantee the accuracy of the feature points.
- In other embodiments, the feature processing module 370 and the training data generating module 390 may also be disposed in the client 10, with the generated training data uploaded to the server 30.
- In one embodiment, the client 10 is further configured to select a template image and corresponding media data.
- In this embodiment, the user may also select the template image and the corresponding media data, so as to implement personalized augmented reality interaction.
- The template image may be an image captured by the user, or an image obtained in another way; the media data may be a video stream or three-dimensional video model captured by the user, or a video stream or three-dimensional video model that the user obtained from the Internet and then edited.
- For example, the user can change the background music in a downloaded video stream and replace it with his or her own voice.
- the implementation system of the augmented reality interaction described above further includes a user database 50 and a shared database 70.
- The server 30 is further configured to determine, according to an upload operation of the login user, whether the selected template image and corresponding media data are to be shared; if so, the selected template image and corresponding media data are uploaded and stored in the shared database 70, and if not, they are uploaded and stored in the user database 50 corresponding to the login user.
- In this embodiment, the server 30 obtains the upload operation of the login user; the upload operation includes an upload instruction and/or a sharing instruction triggered by the user, so users can choose whether to share according to their needs.
- If the selected template image and corresponding media data are to be shared, they are uploaded and stored in the shared database 70, where other users can also use the template image and media data uploaded by the login user.
- If not shared, the uploaded template image and corresponding media data are stored in the user database 50 corresponding to the login user.
- In addition, the priority of the user database 50 is set higher than the priority of the shared database 70.
- The priorities of the user database 50 corresponding to the login user and of the shared database 70 determine the priority of the template images stored in them. In other words, during identification of a template image matching the frame image, if the server 30 identifies two matching template images, stored respectively in the user database 50 corresponding to the login user and in the shared database 70, then, because the user database 50 has the higher priority, the template image stored in the user database 50 is preferentially adopted and returned to the login user.
- In the above method and system for implementing augmented reality interaction, a frame image is collected and uploaded, a template image matching the uploaded frame image is identified and returned, the marked area is detected according to the returned template image, and the media data is then superimposed on the marked area and the superimposed image is displayed.
- Because the frame image is uploaded to a remote server, which performs the recognition and matching against template images, the relatively complicated recognition and matching process does not need to be completed locally; this greatly enhances the recognition capability of the augmented reality interaction, allowing a matching template image to be identified for any marker and thereby greatly improving flexibility.
- The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), or a random access memory (RAM).
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Processing Or Creating Images (AREA)
- Image Analysis (AREA)
- User Interface Of Digital Computer (AREA)
Claims (21)
- A method for implementing augmented reality interaction, comprising the following steps: collecting a frame image, and uploading the frame image; performing recognition to obtain a template image matching the frame image, and returning the template image; detecting a marked area of the frame image according to the template image; and superimposing media data corresponding to the template image onto the marked area, and displaying the superimposed image.
- The method for implementing augmented reality interaction according to claim 1, wherein the step of performing recognition to obtain a template image matching the frame image and returning the template image comprises: acquiring attribute information of the uploaded frame image; defining a matching range within stored template images according to the attribute information; and searching for a template image within the matching range, determining whether the frame image matches the found template image, and if so, returning the found template image.
- The method for implementing augmented reality interaction according to claim 1, wherein the step of detecting a marked area of the frame image according to the template image comprises: obtaining feature points in the frame image according to training data corresponding to the template image; and acquiring a contour position of the marked area in the frame image through the feature points.
- The method for implementing augmented reality interaction according to claim 3, wherein before the step of detecting a marked area of the frame image according to the template image, the method comprises: determining whether training data and media data corresponding to the template image exist in a local file; if not, downloading the training data and media data, and if so, loading the training data and media data.
- The method for implementing augmented reality interaction according to claim 3, wherein before the step of detecting a marked area of the frame image according to the template image, the method further comprises: detecting a stored template image to obtain feature points, and determining whether the number of feature points is less than a threshold; if not, acquiring a sample image corresponding to the template image, and detecting feature points in the sample image; and processing the feature points in the template image and the sample image to generate training data recording the feature points.
- The method for implementing augmented reality interaction according to claim 5, wherein the step of processing the feature points in the template image and the sample image to generate training data recording the feature points comprises: merging or trimming the feature points in the template image and the sample image to form training data in which the feature points are recorded.
- The method for implementing augmented reality interaction according to claim 6, wherein before the step of merging or trimming the feature points in the template image and the sample image to form training data in which the feature points are recorded, the method comprises: adding random noise and blurring to the template image and the sample image, and performing feature point detection again on the noised and blurred images to obtain corresponding feature points; and determining whether the feature points of the template image and the sample image exist among the feature points corresponding to the noised and blurred images; if so, trimming the feature points of the template image and the sample image, and if not, merging them.
- The method for implementing augmented reality interaction according to claim 7, wherein before the step of trimming the feature points of the template image and the sample image, the method further comprises: further determining whether the number of recurrences of the feature points of the template image and the sample image is greater than a recurrence threshold; if not, eliminating the feature points, and if so, proceeding to the step of trimming the feature points of the template image and the sample image.
- The method for implementing augmented reality interaction according to claim 5, wherein before the step of detecting a stored template image to obtain feature points, the method further comprises: selecting a template image and corresponding media data; and determining, according to an upload operation of a login user, whether the selected template image and corresponding media data are to be shared; if so, uploading and storing the selected template image and corresponding media data to a public storage space, and if not, uploading and storing them to a storage space corresponding to the login user.
- The method for implementing augmented reality interaction according to claim 9, wherein the priority of the storage space corresponding to the login user is higher than the priority of the public storage space.
- The method for implementing augmented reality interaction according to claim 1, wherein the step of displaying the superimposed image comprises: playback of the media data constituting a virtual environment, and the frame images outside the marked area constituting a real environment.
- A system for implementing augmented reality interaction, comprising a client and a server, wherein the client comprises an acquisition module, a detection module, and a presentation processing module; the acquisition module is configured to collect a frame image and upload the frame image; the server is configured to perform recognition to obtain a template image matching the frame image, and return the template image; the detection module is configured to detect a marked area of the frame image according to the template image; and the presentation processing module is configured to superimpose media data corresponding to the template image onto the marked area, and display the superimposed image.
- The system for implementing augmented reality interaction according to claim 12, wherein the server comprises: an attribute acquisition module, configured to acquire attribute information of the uploaded frame image; a range defining module, configured to define a matching range within stored template images according to the attribute information; and a searching module, configured to search for a template image within the matching range, determine whether the frame image matches the found template image, and if so, return the template image to the client.
- The system for implementing augmented reality interaction according to claim 12, wherein the detection module comprises: a feature detection unit, configured to obtain feature points in the frame image according to training data corresponding to the template image; and a contour acquisition unit, configured to acquire a contour position of the marked area in the frame image through the feature points.
- The system for implementing augmented reality interaction according to claim 12, wherein the client further comprises: a data acquisition module, configured to determine whether training data and media data corresponding to the template image exist in a local file; if not, download the training data and media data, and if so, load the training data and media data.
- The system for implementing augmented reality interaction according to claim 14, wherein the server further comprises: a feature processing module, configured to detect a stored template image to obtain feature points, determine whether the number of feature points is less than a threshold, and if not, acquire a sample image corresponding to the template image and detect feature points in the sample image; and a training data generating module, configured to process the feature points in the template image and the sample image to generate training data recording the feature points.
- The augmented reality interaction system according to claim 16, wherein the training data generating module is further configured to merge or trim the feature points in the template image and the sample image to generate training data recording the feature points.
- The augmented reality interaction system according to claim 17, wherein the training data generating module is further configured to add random noise and blurring to the template image and the sample image, perform feature point detection again on the noised and blurred images to obtain corresponding feature points, and determine whether the feature points of the template image and the sample image exist among the feature points corresponding to the noised and blurred images; if so, trim the feature points of the template image and the sample image, and if not, merge them.
- The augmented reality interaction system according to claim 18, wherein the training data generating module is further configured to further determine whether the number of recurrences of the feature points of the template image and the sample image is greater than a recurrence threshold; if not, eliminate the feature points, and if so, trim the feature points of the template image and the sample image.
- The system for implementing augmented reality interaction according to claim 16, wherein the client is further configured to select a template image and corresponding media data; the system further comprises a user database and a shared database; and the server is further configured to determine, according to an upload operation of a login user, whether the selected template image and corresponding media data are to be shared; if so, upload and store the selected template image and corresponding media data to the shared database, and if not, upload and store them to the user database corresponding to the login user.
- The system for implementing augmented reality interaction according to claim 20, wherein the priority of the user database is higher than the priority of the shared database.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/403,115 US9189699B2 (en) | 2012-05-22 | 2013-05-17 | Augmented reality interaction implementation method and system |
JP2015513001A JP5827445B2 (ja) | 2013-05-17 | Method and system for realizing augmented reality interaction |
KR1020147035808A KR101535579B1 (ko) | 2013-05-17 | Augmented reality interaction implementation method and system |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210160524.8A CN103426003B (zh) | 2012-05-22 | 2012-05-22 | Implementation method and system for augmented reality interaction |
CN201210160524.8 | 2012-05-22 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013174231A1 (zh) | 2013-11-28 |
Family
ID=49623107
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2013/075784 WO2013174231A1 (zh) | 2012-05-22 | 2013-05-17 | 增强现实交互的实现方法和系统 |
Country Status (5)
Country | Link |
---|---|
US (1) | US9189699B2 (zh) |
JP (1) | JP5827445B2 (zh) |
KR (1) | KR101535579B1 (zh) |
CN (1) | CN103426003B (zh) |
WO (1) | WO2013174231A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20180004109A (ko) * | 2015-05-11 | 2018-01-10 | 구글 엘엘씨 | Crowd-sourced creation and updating of area description files for mobile device localization |
CN107845122A (zh) * | 2017-09-08 | 2018-03-27 | 百度在线网络技术(北京)有限公司 | Method and apparatus for determining planar information of a building |
Families Citing this family (40)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8493408B2 (en) | 2008-11-19 | 2013-07-23 | Apple Inc. | Techniques for manipulating panoramas |
US10303945B2 (en) | 2012-12-27 | 2019-05-28 | Panasonic Intellectual Property Corporation Of America | Display method and display apparatus |
US8965066B1 (en) * | 2013-09-16 | 2015-02-24 | Eye Verify LLC | Biometric template security and key generation |
WO2015075937A1 (ja) | 2013-11-22 | 2015-05-28 | Information processing program, reception program, and information processing device |
CN105023266B (zh) * | 2014-04-29 | 2018-03-30 | 高德软件有限公司 | Augmented reality implementation method, apparatus and terminal device |
HK1201682A2 (zh) * | 2014-07-11 | 2015-09-04 | Idvision Ltd | System for augmented reality images |
WO2016031190A1 (ja) | 2014-08-27 | 2016-03-03 | Information processing device and recognition support method |
CN107111894B (zh) * | 2014-09-08 | 2022-04-29 | Augmented or virtual reality simulator for professional and educational training |
CN105574006A (zh) * | 2014-10-10 | 2016-05-11 | 阿里巴巴集团控股有限公司 | Method and apparatus for establishing a photographing template database and providing photographing recommendation information |
WO2016075948A1 (ja) * | 2014-11-14 | 2016-05-19 | Playback method, playback device and program |
US9977493B2 (en) * | 2015-06-17 | 2018-05-22 | Microsoft Technology Licensing, Llc | Hybrid display system |
EP3376772B1 (en) | 2015-11-12 | 2023-01-25 | Panasonic Intellectual Property Corporation of America | Display method, program and display device |
EP3393132B1 (en) | 2015-12-17 | 2022-11-02 | Panasonic Intellectual Property Corporation of America | Display method and display device |
CN105912555B (zh) * | 2016-02-04 | 2019-03-05 | 北京通感科技有限公司 | Interactive reproduction method for data information and real-scene space |
GB2551473A (en) * | 2016-04-29 | 2017-12-27 | String Labs Ltd | Augmented media |
CN106127858B (zh) * | 2016-06-24 | 2020-06-23 | 联想(北京)有限公司 | Information processing method and electronic device |
CN106302444A (zh) * | 2016-08-16 | 2017-01-04 | 深圳市巴古科技有限公司 | Intelligent cloud recognition method |
CN110114988B (zh) | 2016-11-10 | 2021-09-07 | 松下电器(美国)知识产权公司 | Transmission method, transmission device and recording medium |
CN106777083A (zh) * | 2016-12-13 | 2017-05-31 | 四川研宝科技有限公司 | Method and apparatus for marking objects in a picture |
US10375130B2 (en) | 2016-12-19 | 2019-08-06 | Ricoh Company, Ltd. | Approach for accessing third-party content collaboration services on interactive whiteboard appliances by an application using a wrapper application program interface |
US10250592B2 (en) | 2016-12-19 | 2019-04-02 | Ricoh Company, Ltd. | Approach for accessing third-party content collaboration services on interactive whiteboard appliances using cross-license authentication |
CN106685941A (zh) * | 2016-12-19 | 2017-05-17 | 宇龙计算机通信科技(深圳)有限公司 | Method, apparatus and server for optimizing AR registration |
CN106683196A (zh) * | 2016-12-30 | 2017-05-17 | 上海悦会信息科技有限公司 | Augmented reality display method, system and intelligent terminal |
CN106859956B (zh) * | 2017-01-13 | 2019-11-26 | 北京安云世纪科技有限公司 | Human acupoint recognition and massage method, apparatus and AR device |
US10395405B2 (en) * | 2017-02-28 | 2019-08-27 | Ricoh Company, Ltd. | Removing identifying information from image data on computing devices using markers |
WO2018177134A1 (zh) * | 2017-03-29 | 2018-10-04 | 腾讯科技(深圳)有限公司 | User-generated content processing method, storage medium and terminal |
CN107168619B (zh) * | 2017-03-29 | 2023-09-19 | 腾讯科技(深圳)有限公司 | User-generated content processing method and apparatus |
CN107194817B (zh) * | 2017-03-29 | 2023-06-23 | 腾讯科技(深圳)有限公司 | Method, apparatus and computer device for displaying user social information |
CN107146275B (zh) * | 2017-03-31 | 2020-10-27 | 北京奇艺世纪科技有限公司 | Method and apparatus for setting a virtual avatar |
US10824866B2 (en) * | 2017-06-13 | 2020-11-03 | The Marketing Store Worldwife, LP | System, method, and apparatus for augmented reality implementation |
CN107358658B (zh) * | 2017-07-20 | 2021-08-20 | 深圳市大象文化科技产业有限公司 | Breast-shaping AR prediction method, apparatus and system |
CN109429002A (zh) * | 2017-08-28 | 2019-03-05 | 中国科学院深圳先进技术研究院 | Method and apparatus for photographing a portrait |
PT3456905T (pt) | 2017-09-19 | 2023-08-02 | Univ Evora | Transportable shelter system capable of capturing, collecting and converting ambient fog and humidity into drinking water |
CN108369731B (zh) * | 2018-02-02 | 2023-07-21 | 达闼机器人股份有限公司 | Template optimization method, apparatus, electronic device and computer program product |
CN108388637A (zh) * | 2018-02-26 | 2018-08-10 | 腾讯科技(深圳)有限公司 | Method, apparatus and related device for providing augmented reality services |
CN109490843B (zh) * | 2018-11-15 | 2020-08-04 | 成都傅立叶电子科技有限公司 | Normalized radar screen monitoring method and system |
KR102111499B1 (ko) * | 2019-09-19 | 2020-05-18 | (주)자이언트스텝 | Method for transferring facial shape changes for facial animation, and computer-readable storage medium |
CN114450967B (zh) * | 2020-02-28 | 2024-07-09 | 谷歌有限责任公司 | System and method for playback of augmented reality content triggered by image recognition |
CN112153422B (zh) * | 2020-09-25 | 2023-03-31 | 连尚(北京)网络科技有限公司 | Video fusion method and device |
WO2024108026A1 (en) * | 2022-11-16 | 2024-05-23 | Aveva Software, Llc | Computerized systems and methods for an industrial metaverse |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101551732A (zh) * | 2009-03-24 | 2009-10-07 | 上海水晶石信息技术有限公司 | Augmented reality method with interactive function and system thereof |
CN102129708A (zh) * | 2010-12-10 | 2011-07-20 | 北京邮电大学 | Fast multi-level virtual-real occlusion processing method in augmented reality environments |
CN102156808A (zh) * | 2011-03-30 | 2011-08-17 | 北京触角科技有限公司 | Augmented reality real-time virtual accessory try-on system and method |
CN102332095A (zh) * | 2011-10-28 | 2012-01-25 | 中国科学院计算技术研究所 | Face motion tracking method and system, and augmented reality method |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6522312B2 (en) * | 1997-09-01 | 2003-02-18 | Canon Kabushiki Kaisha | Apparatus for presenting mixed reality shared among operators |
KR100912264B1 (ko) * | 2008-02-12 | 2009-08-17 | 광주과학기술원 | Method and system for generating user-responsive augmented images |
EP2591460A1 (en) * | 2010-06-22 | 2013-05-15 | Nokia Corp. | Method, apparatus and computer program product for providing object tracking using template switching and feature adaptation |
JP5686673B2 (ja) * | 2010-07-22 | 2015-03-18 | 富士フイルム株式会社 | Image processing apparatus, image processing method and program |
US8681179B2 (en) * | 2011-12-20 | 2014-03-25 | Xerox Corporation | Method and system for coordinating collisions between augmented reality and real reality |
KR20140082610A (ko) * | 2014-05-20 | 2014-07-02 | (주)비투지 | Method and apparatus for playing augmented reality exhibition content using a portable terminal |
-
2012
- 2012-05-22 CN CN201210160524.8A patent/CN103426003B/zh active Active
-
2013
- 2013-05-17 KR KR1020147035808A patent/KR101535579B1/ko active IP Right Grant
- 2013-05-17 WO PCT/CN2013/075784 patent/WO2013174231A1/zh active Application Filing
- 2013-05-17 JP JP2015513001A patent/JP5827445B2/ja active Active
- 2013-05-17 US US14/403,115 patent/US9189699B2/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101551732A (zh) * | 2009-03-24 | 2009-10-07 | 上海水晶石信息技术有限公司 | Augmented reality method with interactive function and system thereof |
CN102129708A (zh) * | 2010-12-10 | 2011-07-20 | 北京邮电大学 | Fast multi-level virtual-real occlusion processing method in augmented reality environments |
CN102156808A (zh) * | 2011-03-30 | 2011-08-17 | 北京触角科技有限公司 | Augmented reality real-time virtual accessory try-on system and method |
CN102332095A (zh) * | 2011-10-28 | 2012-01-25 | 中国科学院计算技术研究所 | Face motion tracking method and system, and augmented reality method |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20180004109A (ko) * | 2015-05-11 | 2018-01-10 | 구글 엘엘씨 | Crowd-sourced creation and updating of area description files for mobile device localization |
JP2018519558A (ja) * | 2015-05-11 | 2018-07-19 | グーグル エルエルシー | Crowd-sourced creation and updating of area description files for mobile device localization |
KR102044491B1 (ko) * | 2015-05-11 | 2019-11-13 | 구글 엘엘씨 | Crowd-sourced creation and updating of area description files for mobile device localization |
CN107845122A (zh) * | 2017-09-08 | 2018-03-27 | 百度在线网络技术(北京)有限公司 | Method and apparatus for determining planar information of a building |
Also Published As
Publication number | Publication date |
---|---|
CN103426003B (zh) | 2016-09-28 |
KR20150011008A (ko) | 2015-01-29 |
JP2015524103A (ja) | 2015-08-20 |
CN103426003A (zh) | 2013-12-04 |
KR101535579B1 (ko) | 2015-07-09 |
US20150139552A1 (en) | 2015-05-21 |
US9189699B2 (en) | 2015-11-17 |
JP5827445B2 (ja) | 2015-12-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2013174231A1 (zh) | Implementation method and system for augmented reality interaction | |
EP3625774A1 (en) | Augmented reality | |
TWI752502B (zh) | Method for implementing a storyboard effect, electronic device and computer-readable storage medium | |
JP6369909B2 (ja) | Facial expression scoring device, dance scoring device, karaoke device, and game device | |
JP5612310B2 (ja) | User interface for face recognition | |
WO2018135881A1 (en) | Vision intelligence management for electronic devices | |
WO2012131653A2 (en) | Devices, systems, methods, and media for detecting, indexing, and comparing video signals from a video display in a background scene using a camera-enabled device | |
JP6016322B2 (ja) | Information processing apparatus, information processing method, and program | |
JP2008257460A (ja) | Information processing apparatus, information processing method, and program | |
US11670099B2 (en) | Validating objects in volumetric video presentations | |
JP2016119059A (ja) | Image processing apparatus and image processing method | |
CN111491187A (zh) | Video recommendation method, apparatus, device and storage medium | |
WO2012118259A1 (ko) | System and method for providing image-based video-related services | |
BR112020003189A2 (pt) | método, sistema, e, mídia legível por computador não transitória | |
JP6268722B2 (ja) | Display device, display method, and program | |
JP2007004427A (ja) | Image display system, image display device, and program | |
JP2010262620A (ja) | Dynamic image processing method for search, and web server | |
KR20140037439A (ko) | Method and apparatus for generating a slide show using the mood of music | |
JP6217696B2 (ja) | Information processing apparatus, information processing method, and program | |
CN111625101B (zh) | Display control method and apparatus | |
WO2018035829A1 (zh) | Advertisement playback device | |
JP2019219988A (ja) | Semantic information assignment device, semantic information assignment method, and program | |
WO2018035832A1 (zh) | Video advertisement playback device | |
KR102594976B1 (ko) | Video content selection device for augmented reality, user terminal, and video content provision method | |
US20230148007A1 (en) | System and method for playing audio corresponding to an image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13793270 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2015513001 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 14403115 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 20147035808 Country of ref document: KR Kind code of ref document: A |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 09/04/2015) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 13793270 Country of ref document: EP Kind code of ref document: A1 |