CN110322569A - Multi-modal AR processing method, apparatus, device and computer-readable storage medium - Google Patents


Info

Publication number
CN110322569A
Authority
CN
China
Prior art keywords
model
processing
task
processing model
video frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910592876.2A
Other languages
Chinese (zh)
Other versions
CN110322569B (en)
Inventor
陈思利
刘赵梁
张永杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910592876.2A
Publication of CN110322569A
Application granted
Publication of CN110322569B
Status: Active


Classifications

    • G — Physics
    • G06 — Computing; calculating or counting
    • G06F — Electric digital data processing
    • G06F3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 — Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 — Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G — Physics
    • G06 — Computing; calculating or counting
    • G06T — Image data processing or generation, in general
    • G06T19/00 — Manipulating 3D models or images for computer graphics
    • G06T19/006 — Mixed reality
    • G — Physics
    • G06 — Computing; calculating or counting
    • G06F — Electric digital data processing
    • G06F2203/00 — Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01 — Indexing scheme relating to G06F3/01
    • G06F2203/012 — Walk-in-place systems for allowing a user to walk in a virtual environment while constraining him to a given position in the physical environment

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The present invention provides a multi-modal AR processing method, apparatus, device and computer-readable storage medium. Video frame images and an AR task attribute are obtained according to user request information; according to the AR task attribute, a target model combination is determined among multiple preset processing model combinations, where the target model combination includes a preset shared tracking processing model together with a correction processing model and a map processing model corresponding to the task attribute; the target model combination then performs AR processing corresponding to the AR task attribute on the video frame images within a three-thread data-sharing architecture. A matching processing model combination can thus be selected according to the application demand, providing multi-modal AR processing adapted to multiple scenarios, while data sharing among the models of a combination reduces repeated computation and improves the efficiency and flexibility of AR processing.

Description

Multi-modal AR processing method, apparatus, device and computer-readable storage medium
Technical field
The present invention relates to signal processing technology, and in particular to a multi-modal AR processing method, apparatus, device and computer-readable storage medium.
Background technique
Augmented reality (AR) is a technology that merges real scenes with virtual information; its purpose is to synthesize, through computer graphics and image processing techniques, the real scene (the physical environment or an image of the user) with a virtual scene (a computer-generated virtual environment or virtual object). With the development of AR technology and growing user demand, more and more AR algorithms have been developed for various AR application scenarios, such as simultaneous localization and mapping (SLAM), 2D tracking and 3D tracking. How to achieve real-time, reliable AR processing in complex and changing application environments has always been one of the goals of AR technology.
Existing AR processing methods usually integrate multiple AR function modules in the same application; the modules run independently of each other and provide different AR functions. For example, if an AR application contains a SLAM module, a 2D tracking module and a 3D tracking module, then to track a planar pattern in the environment captured by a mobile phone, only the 2D tracking module is called; to both track the planar pattern and locate the phone's position in the captured environment, both the SLAM module and the 2D tracking module must be called.
However, because the different AR function modules are called as independent, integrally packaged algorithms, repeated and redundant computation arises. The processing efficiency and flexibility of existing AR processing methods are therefore limited.
Summary of the invention
The embodiments of the present invention provide a multi-modal AR processing method, apparatus, device and computer-readable storage medium that improve the efficiency and flexibility of AR processing.
A first aspect of the embodiments of the present invention provides a multi-modal AR processing method, comprising:
obtaining video frame images and an AR task attribute according to user request information;
determining a target model combination among multiple preset processing model combinations according to the AR task attribute, wherein the target model combination includes a preset shared tracking processing model together with a correction processing model and a map processing model corresponding to the task attribute; and
performing, with the target model combination in a three-thread data-sharing architecture, AR processing corresponding to the AR task attribute on the video frame images.
A second aspect of the embodiments of the present invention provides a multi-modal AR processing apparatus, comprising:
an obtaining module, configured to obtain video frame images and an AR task attribute according to user request information;
a combination module, configured to determine a target model combination among multiple preset processing model combinations according to the AR task attribute, wherein the target model combination includes a preset shared tracking processing model together with a correction processing model and a map processing model corresponding to the task attribute; and
a processing module, configured to perform, with the target model combination in a three-thread data-sharing architecture, AR processing corresponding to the AR task attribute on the video frame images.
A third aspect of the embodiments of the present invention provides a terminal, comprising a memory, a processor and a computer program, wherein the computer program is stored in the memory, and the processor runs the computer program to execute the multi-modal AR processing method of the first aspect and its various possible designs.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium in which a computer program is stored; when executed by a processor, the computer program implements the multi-modal AR processing method of the first aspect and its various possible designs.
In the multi-modal AR processing method, apparatus, device and computer-readable storage medium provided by the present invention, video frame images and an AR task attribute are obtained according to user request information; a target model combination is determined among multiple preset processing model combinations according to the AR task attribute, the target model combination including a preset shared tracking processing model together with a correction processing model and a map processing model corresponding to the task attribute; and the target model combination performs AR processing corresponding to the AR task attribute on the video frame images within a three-thread data-sharing architecture. A matching processing model combination can therefore be selected according to the application demand, providing multi-modal AR processing adapted to multiple scenarios, while data sharing among the models of a combination reduces repeated computation and improves the efficiency and flexibility of AR processing.
Detailed description of the invention
Fig. 1 is a schematic diagram of an application scenario provided by an embodiment of the present invention;
Fig. 2 is a flow diagram of a multi-modal AR processing method provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a multi-modal AR processing apparatus provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of another multi-modal AR processing apparatus provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a multi-modal AR processing device provided by an embodiment of the present invention.
Specific embodiment
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the protection scope of the present invention.
The terms "first", "second", "third" and the like in the description, the claims and the above drawings are used to distinguish similar objects, not to describe a particular order or sequence. It should be understood that data so termed are interchangeable where appropriate, so that the embodiments of the present invention described herein can be implemented in orders other than those illustrated or described herein.
It should be understood that, in the various embodiments of the present invention, the serial numbers of the processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
It should be understood that, in the present invention, "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units not expressly listed or inherent to the process, method, product or device.
It should be understood that, in the present invention, "multiple" means two or more. "And/or" merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the associated objects. "Containing A, B and C" or "containing A, B, C" means that all of A, B and C are included; "containing A, B or C" means that one of A, B and C is included; "containing A, B and/or C" means that any one, any two, or all three of A, B and C are included.
It should be understood that, in the present invention, "B corresponding to A", "B corresponds to A" or "A corresponds to B" indicates that B is associated with A and that B can be determined according to A. Determining B according to A does not mean determining B only according to A; B may also be determined according to A and/or other information. A corresponds to B when the similarity of A and B is greater than or equal to a preset threshold.
The technical solution of the present invention is described in detail below with specific embodiments. The following specific embodiments can be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.
Existing AR processing generally comprises a variety of methods that meet different computing demands, for example visual positioning (Visual Positioning System, VPS), simultaneous localization and mapping (SLAM), two-dimensional (2D) tracking and three-dimensional (3D) tracking. In actual products, each AR processing method is usually implemented as a separate AR function module; multiple AR function modules are integrated in the same application and run independently of each other to provide different AR functions. For example, if an AR application contains a SLAM module, a 2D tracking module and a 3D tracking module, then to track a planar pattern in the environment captured by a mobile phone, only the 2D tracking module is called; to both track the planar pattern and locate the phone's position in the captured environment, the SLAM module and the 2D tracking module must be called simultaneously. Existing AR products thus directly package multiple independent algorithms (such as SLAM, 2D tracking and 3D tracking) into one application and invoke different algorithms according to the application demand. Because the various AR algorithms usually have different implementation principles and methods — for example, SLAM algorithms include ORB-SLAM and LSD-SLAM, while 3D tracking algorithms include PWP3D and various edge-based tracking algorithms — and these principles and methods differ greatly, they can only exist as independent algorithm modules.
However, because each AR function module is called as an independent, integrally packaged algorithm, and because of module encapsulation and the differences between algorithm principles, two independent modules cannot share the underlying data inside their algorithms, which leads to repeated computation and redundant data. For example, in an application scenario that both tracks a planar pattern in the phone's captured environment and locates the phone's position in that environment, the SLAM module and the 2D tracking module must be called simultaneously; yet both modules generate and update local map data of the user's surroundings, so the map data are computed repeatedly. Such repeated computation increases power consumption and the occupancy of computing resources, and may also reduce computation speed and processing efficiency. The processing efficiency and flexibility of existing AR processing methods are therefore limited.
To solve these problems in the prior art, an embodiment of the present invention provides a multi-modal AR processing method. Existing AR algorithms are disassembled and recombined into multiple processing model combinations; these combinations can be individually invoked according to various application demands, and each combination includes the tracking algorithm, correction algorithm and map-update algorithm needed for its AR function. Multi-modal AR processing suitable for multiple scenarios is thereby achieved, improving the processing efficiency and flexibility of AR processing.
Referring to Fig. 1, a schematic diagram of an application scenario provided by an embodiment of the present invention: the user shoots a video of the surrounding environment through an AR application installed on user terminal 1, and according to the video shot in real time, user terminal 1 sends user request information carrying video frame images to server 2. For example, if the AR application is an entertainment application that superimposes AR effects on a specific building, then after receiving the video frame images, server 2 allocates the model combination corresponding to the task attribute, tracks the school building shot by the user in the video, and superimposes in the empty space above the building an AR welcome banner reading "XX school welcomes you".
Referring to Fig. 2, a flow diagram of a multi-modal AR processing method provided by an embodiment of the present invention: the executing subject of the method shown in Fig. 2 can be a software and/or hardware device. The method includes steps S101 to S103, as follows:
S101: obtain video frame images and an AR task attribute according to user request information.
For example, the video frame images and the AR task attribute can be carried in the user request information itself, so that the server reads them directly. As another example, a user terminal may only ever request a fixed AR task attribute (e.g. a single-function AR application), so that when the server gets the video frame images it also gets the AR task attribute corresponding to that terminal.
In other embodiments, the user request information may include the video frame images; the server obtains the video frame images from the user request information and identifies an object to be tracked from them — for example, by performing target detection in the video frame images according to preset information about objects on which AR processing can be performed, and taking a detected target object as the object to be tracked. In general, objects to be tracked can be divided into 2D objects and 3D objects, and the server can determine the AR task attribute according to the shape features of the object to be tracked, where the shape features include 2D object features and/or 3D object features. For example, if the object to be tracked is a 2D object, such as a specific pattern, the AR task attribute is a 2D-class AR task, i.e. 2D tracking is performed. If no target object is detected in the video frame images, tracking can be temporarily skipped, or a local map of the surrounding environment can be generated and updated from the video frame images alone.
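The decision just described — deriving the task attribute from the shape features of the detected object — can be sketched as follows. This is a minimal illustration under assumed names; the patent specifies no API, so the feature labels and return values are hypothetical.

```python
# Hypothetical sketch of deriving the AR task attribute (step S101) from the
# shape features of a detected object. All names are illustrative.

def determine_task_attribute(shape_features):
    """Map detected shape features to an AR task attribute."""
    has_2d = "2d_object" in shape_features
    has_3d = "3d_object" in shape_features
    if has_2d and has_3d:
        return "fusion_ar_task"  # both kinds present: 2D tracking plus 2D+3D map update
    if has_2d:
        return "2d_ar_task"      # e.g. a specific planar pattern -> 2D tracking
    if has_3d:
        return "3d_ar_task"      # a 3D object -> 3D tracking
    return "map_only"            # nothing detected: only build/update the local map
```

When no target is detected, the sketch falls through to a map-only mode, matching the text's note that the server may merely update the local map.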
S102: determine a target model combination among multiple preset processing model combinations according to the AR task attribute, wherein the target model combination includes a preset shared tracking processing model together with a correction processing model and a map processing model corresponding to the task attribute.
For example, the server stores multiple preset processing model combinations; each combination includes the same shared tracking processing model but a different correction processing model and map processing model. Precisely because the correction processing model and the map processing model differ, each processing model combination is suited to different AR processing, i.e. corresponds to a different AR task attribute.
In some application scenarios, the server of this embodiment can provide services for multiple AR applications, and only needs to select the corresponding target model combination according to their differing task attributes. Of course, the server may also serve a single AR application, in which case it only needs to select a fixed target model combination.
Before step S102, the multiple processing model combinations can be preset. The presetting process can be understood as disassembling the existing AR algorithms and recombining them. For example, considering that AR processing mainly tracks and localizes 2D-type objects and/or 3D-type objects, and that the tracking part of the various AR algorithms can be made generic, a shared tracking processing model is preset as the tracking processing model common to every model combination. For different AR applications, however, the applicable pose-matching correction methods and map-update optimization methods differ considerably, so the correction processing models or map processing models preset by the server differ between processing model combinations. Each model in this embodiment can be understood as a method; a thread executes the algorithm steps corresponding to a model by accessing these methods and the necessary data.
For example, according to user input, the server creates and stores a 2D correction processing model and a 2D map processing model for handling 2D-class AR tasks, where the 2D correction processing model corresponds to a 2D correction method type and the 2D map processing model corresponds to a 2D map method type. The 2D correction method type can be label information of the 2D correction processing model; likewise, the 2D map method type can be label information of the 2D map processing model — both serve to indicate the algorithmic function of the model.
For example, according to user input, the server creates and stores a 3D correction processing model and a 3D map processing model for handling 3D-class AR tasks, where the 3D correction processing model corresponds to a 3D correction method type and the 3D map processing model corresponds to a 3D map method type. The 3D correction method type can be label information of the 3D correction processing model; likewise, the 3D map method type can be label information of the 3D map processing model — both serve to indicate the algorithmic function of the model.
For example, according to user input, the server creates and stores a fusion map processing model for handling fusion-class AR tasks, where the fusion map processing model corresponds to a 2D-and-3D map method type. The 2D-and-3D map method type can likewise be label information of the fusion map processing model, indicating its algorithmic function: for example, the algorithm of the fusion map processing model is obtained by merging the 2D map-update algorithm with the 3D map-update algorithm, so that while 3D map-update optimization is being performed, the local map of the shared tracking processing model performing 2D tracking is synchronized.
The creation and storage of the above correction processing models and map processing models can be carried out simultaneously, or sequentially in any order; this embodiment imposes no limitation.
On the basis of the above models, the server obtains multiple processing model combinations by selecting and combining them, where each combination consists of three models. The first model is the preset shared tracking processing model, which every combination has in common. The second model is one of the 2D correction processing model and the 3D correction processing model. The third model is one of the 2D map processing model, the 3D map processing model and the fusion map processing model. Because the models included differ between combinations, the task attribute each combination is suited to also differs. For example, processing model combination A — shared tracking processing model, 3D correction processing model, 3D map processing model — corresponds to the 3D correction method type and the 3D map method type, i.e. it can handle 3D-class AR tasks such as 3D tracking and 3D local map-update optimization. Processing model combination B — shared tracking processing model, 2D correction processing model, fusion map processing model — corresponds to the 2D correction method type and the 2D-and-3D map method type, i.e. it can handle fusion-class AR tasks such as 2D tracking with 2D-and-3D local map-update optimization.
An optional implementation of step S102 is as follows: the server obtains the correction method type and map method type for handling the AR task attribute — for example, for a fusion-class AR task, the 2D correction method type and the 2D-and-3D map method type. The server then takes, from the multiple preset processing model combinations, the combination corresponding to that correction method type and map method type as the target model combination — for example, processing model combination B, which corresponds to the 2D correction method type and the 2D-and-3D map method type.
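Under the same caveat — all names assumed, not taken from the patent — the registry of preset combinations and the type-based lookup of step S102 could be sketched like this, with every combination sharing one tracking model and differing only in its correction and map models:

```python
# Illustrative registry of preset processing model combinations and the
# step-S102 lookup. Model names are placeholder strings standing in for the
# actual algorithm implementations.

SHARED_TRACKING = "shared_tracking_model"  # common to every combination

# Each combination: (shared tracking model, correction model, map model),
# keyed by its (correction method type, map method type).
MODEL_COMBINATIONS = {
    ("2d", "2d"):    (SHARED_TRACKING, "2d_correction_model", "2d_map_model"),
    ("3d", "3d"):    (SHARED_TRACKING, "3d_correction_model", "3d_map_model"),
    ("2d", "2d_3d"): (SHARED_TRACKING, "2d_correction_model", "fusion_map_model"),
}

# Mapping from AR task attribute to the method types that handle it.
TASK_TO_TYPES = {
    "2d_ar_task":     ("2d", "2d"),
    "3d_ar_task":     ("3d", "3d"),
    "fusion_ar_task": ("2d", "2d_3d"),
}

def select_target_combination(task_attribute):
    """Step S102: resolve the AR task attribute to its target model combination."""
    types = TASK_TO_TYPES[task_attribute]
    return MODEL_COMBINATIONS[types]
```

The last entry corresponds to "processing model combination B" in the text: 2D correction paired with the fused 2D-and-3D map model.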
In the above embodiments, there can be various task attributes, including at least one of a 2D-class AR task, a 3D-class AR task and a fusion-class AR task. The three task attributes and their corresponding correction processing models and map processing models are described below.
In some embodiments, the task attribute may include a 2D-class AR task. The 2D correction processing model corresponding to the 2D-class AR task comprises a 2D correction processing model corresponding to a planar correction method type based on ORB features, or one corresponding to a planar correction method type based on SURF features. That is, the algorithm of the 2D correction processing model is a correction algorithm specific to 2D tracking, such as a planar matching algorithm based on ORB features or on SURF features. The 2D map processing model corresponding to the 2D-class AR task comprises a 2D map processing model corresponding to a map method type that calculates planar picture depth. That is, its algorithm is a map-update optimization algorithm specific to 2D — for example, calculating the depth of newly added map points from the pose obtained by the shared tracking processing model, the frame image to be processed and the plane equation.
In other embodiments, the task attribute may include a 3D-class AR task. The 3D correction processing model corresponding to the 3D-class AR task comprises a 3D correction processing model corresponding to a 3D correction method type based on PWP3D, or one corresponding to a 3D correction method type based on edge information. That is, its algorithm is a correction algorithm specific to 3D tracking, such as one based on PWP3D or an edge-based matching algorithm. The 3D map processing model corresponding to the 3D-class AR task comprises a 3D map processing model corresponding to a map method type that determines map-point depth from object pose information. That is, its algorithm is a map-update optimization algorithm specific to 3D — for example, rendering the 3D model from the pose obtained by the shared tracking processing model and the frame image to be processed, obtaining the depth of each point in the 3D model, and thereby adding new map points.
In still other embodiments, the task attribute may include a fusion-class AR task. The fusion map processing model corresponding to the fusion-class AR task comprises a fusion map processing model corresponding to a determined 2D-and-3D fusion map method type — for example, initializing a 2D extension algorithm with a 2D image and determining the user's position in the surrounding environment in real time.
Through the above multiple combinations of correction processing models and map processing models, together with the shared tracking processing model, various types of AR functions can be flexibly combined, adapting to a variety of application demands and reducing repeated computation.
S103: perform, with the target model combination in a three-thread data-sharing architecture, AR processing corresponding to the AR task attribute on the video frame images.
The three-thread architecture is, for example, a TMC three-thread framework, where T, M and C denote three threads of different functions. T (Tracking) is the tracking thread for pose estimation on the real-time sequence of frames; M (Mapping) is the map thread for optimizing and updating the local map; C (Calibration) is the correction thread for absolute matching computation against the global map and rigid objects.
By executing the target model combination in the three-thread architecture, the underlying data can be shared between the models of the target model combination, so that the tracking processing model, correction processing model and map processing model are combined and invoked in a unified way.
It should be understood that the three-thread architecture includes a tracking thread, a correction thread, and a map thread with data sharing. The tracking thread is used to execute the tracking processing model of the target model combination, the correction thread is used to execute the correction processing model of the target model combination, and the map thread is used to execute the map processing model of the target model combination. For example, the tracking thread accesses the method corresponding to the tracking processing model of the target model combination, the correction thread accesses the method corresponding to the correction processing model, and the map thread accesses the method corresponding to the map processing model.
On the basis of the three-thread architecture, after the target model combination has been determined, a specific implementation of step S103 may be as follows:
The server first obtains the video frame images with the tracking thread, and synchronizes the video frame images to the correction thread.
Then, with the correction thread, the server obtains an absolute position map with scale information and camera pose information according to the correction processing model of the target model combination, pre-stored prior knowledge information, and the video frame images, and synchronizes the absolute position map and the camera pose information to the tracking thread.
Next, with the tracking thread, the server performs AR initial positioning on the video frame images according to the tracking processing model of the target model combination, the absolute position map, and the camera pose information, obtains the object pose of the video frame images, and synchronizes the video frame images and their object pose to the map thread.
Then, with the map thread, the server creates or updates a local map according to the map processing model of the target model combination, the video frame images, and the object pose of the video frame images, and synchronizes the local map to the tracking thread.
Finally, with the tracking thread, the server performs AR tracking on the video frame images according to the tracking processing model of the target model combination and the local map.
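The five steps above can be pictured as a toy three-thread flow. This is a hypothetical simplification, not the patent's implementation: string placeholders stand in for the correction, tracking, and map processing models, and queues stand in for the data synchronization between the threads.

```python
import queue
import threading

# One queue per synchronization path in the description above:
# T->C (frames), C->T (absolute pose), T->M (object pose), M->T (local map).
frames_q, pose_q, obj_q, map_q, out_q = (queue.Queue() for _ in range(5))

def correction():
    frame = frames_q.get()
    pose_q.put((frame, "camera-pose"))          # stand-in for the correction model

def mapping():
    frame, obj_pose = obj_q.get()
    map_q.put({"points": [(frame, obj_pose)]})  # stand-in for local map creation

def tracking():
    frames_q.put("frame-0")                     # obtain frame, sync to correction
    frame, cam_pose = pose_q.get()              # receive absolute pose
    obj_q.put((frame, "object-pose"))           # initial positioning result to mapping
    local_map = map_q.get()                     # receive the shared local map
    out_q.put(f"tracked {frame} with {len(local_map['points'])} map point(s)")

threads = [threading.Thread(target=f) for f in (correction, mapping, tracking)]
for t in threads:
    t.start()
for t in threads:
    t.join()
result = out_q.get()
```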
This embodiment provides a multi-modal AR processing method: video frame images and an AR task attribute are obtained according to user request information; according to the AR task attribute, a target model combination is determined among multiple preset processing model combinations, where the target model combination includes a preset shared tracking processing model as well as a correction processing model and a map processing model corresponding to the task attribute; and, with the target model combination, AR processing corresponding to the AR task attribute is performed on the video frame images in a three-thread architecture with data sharing. A suitable processing model combination can thus be selected according to the application demand, providing multi-modal AR processing adapted to multiple scenarios, while data sharing between the models in the combination reduces repeated computation and improves the efficiency and flexibility of AR processing.
Referring to Fig. 3, which is a structural diagram of a multi-modal AR processing apparatus provided by an embodiment of the present invention, the multi-modal AR processing apparatus 30 shown in Fig. 3 includes:
an obtaining module 31, configured to obtain video frame images and an AR task attribute according to user request information;
a combination module 32, configured to determine a target model combination among multiple preset processing model combinations according to the AR task attribute, where the target model combination includes a preset shared tracking processing model as well as a correction processing model and a map processing model corresponding to the task attribute;
a processing module 33, configured to perform, with the target model combination in a three-thread architecture with data sharing, AR processing corresponding to the AR task attribute on the video frame images.
This embodiment provides a multi-modal AR processing apparatus: video frame images and an AR task attribute are obtained according to user request information; according to the AR task attribute, a target model combination is determined among multiple preset processing model combinations, where the target model combination includes a preset shared tracking processing model as well as a correction processing model and a map processing model corresponding to the task attribute; and, with the target model combination, AR processing corresponding to the AR task attribute is performed on the video frame images in a three-thread architecture with data sharing. A suitable processing model combination can thus be selected according to the application demand, providing multi-modal AR processing adapted to multiple scenarios, while data sharing between the models in the combination reduces repeated computation and improves the efficiency and flexibility of AR processing.
Optionally, the combination module 32 is configured to obtain a correction method type and a map method type for handling the AR task attribute, and to take, among the multiple preset processing model combinations, the processing model combination corresponding to the correction method type and the map method type as the target model combination.
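A hypothetical sketch of this lookup (the type names, and the pairing of the fusion task with a 3D correction model, are illustrative assumptions, not taken from the patent):

```python
# Each AR task attribute maps to a (correction method type, map method type)
# pair, which in turn selects one of the preset processing model combinations.
TASK_TO_TYPES = {
    "2d":     ("2d-plane-correction", "2d-plane-depth-map"),
    "3d":     ("3d-correction",       "3d-pose-depth-map"),
    "fusion": ("3d-correction",       "2d-3d-fusion-map"),   # assumed pairing
}

PRESET_COMBINATIONS = {
    ("2d-plane-correction", "2d-plane-depth-map"): "combo-2d",
    ("3d-correction",       "3d-pose-depth-map"):  "combo-3d",
    ("3d-correction",       "2d-3d-fusion-map"):   "combo-fusion",
}

def select_target_combination(task_attribute: str) -> str:
    correction_type, map_type = TASK_TO_TYPES[task_attribute]
    return PRESET_COMBINATIONS[(correction_type, map_type)]
```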
Referring to Fig. 4, which is a structural diagram of another multi-modal AR processing apparatus provided by an embodiment of the present invention, the multi-modal AR processing apparatus 30 shown in Fig. 4 further includes: a pre-building module 34.
Before the combination module 32 determines the target model combination among the multiple preset processing model combinations according to the AR task attribute, the pre-building module 34 is configured to: create and store a 2D correction processing model and a 2D map processing model for handling 2D-class AR tasks, where the 2D correction processing model corresponds to a 2D correction method type and the 2D map processing model corresponds to a 2D map method type; create and store a 3D correction processing model and a 3D map processing model for handling 3D-class AR tasks, where the 3D correction processing model corresponds to a 3D correction method type and the 3D map processing model corresponds to a 3D map method type; create and store a fusion map processing model for handling fusion-class AR tasks, where the fusion map processing model corresponds to a 2D-and-3D map method type; and obtain the multiple processing model combinations, where each processing model combination includes:
the preset shared tracking processing model,
one of the 2D correction processing model and the 3D correction processing model,
and one of the 2D map processing model, the 3D map processing model, and the fusion map processing model.
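Since each combination pairs the shared tracking model with one of two correction models and one of three map models, up to 2 × 3 = 6 preset combinations exist. A minimal illustrative enumeration (model names are hypothetical placeholders):

```python
from itertools import product

corrections = ["2d-correction", "3d-correction"]
maps = ["2d-map", "3d-map", "fusion-map"]

# Every preset combination shares the same tracking model and picks one
# correction model and one map model.
combinations = [("shared-tracking", c, m) for c, m in product(corrections, maps)]
```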
Optionally, the task attribute includes a 2D-class AR task.
The 2D correction processing model corresponding to the 2D-class AR task includes: a 2D correction processing model corresponding to an ORB-feature-based plane correction method type, or a 2D correction processing model corresponding to a SURF-feature-based plane correction method type.
The 2D map processing model corresponding to the 2D-class AR task includes: a 2D map processing model corresponding to a map method type that computes planar picture depth.
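For intuition only: once ORB or SURF features are matched between the live frame and the reference plane image, plane correction reduces to estimating a homography from the point correspondences. Below is a minimal direct linear transform (DLT) sketch; a production system would add coordinate normalization and RANSAC outlier rejection, and this is not the patent's own code.

```python
import numpy as np

def homography_dlt(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Estimate the 3x3 homography H with dst ~ H @ src from >= 4 exact
    2D point correspondences, via SVD of the stacked DLT equations."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)      # null-space vector = H up to scale
    return h / h[2, 2]

# Synthetic correspondences under a known similarity transform (x,y) -> (2x+3, 2y+5).
src = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
dst = 2.0 * src + np.array([3.0, 5.0])
H = homography_dlt(src, dst)
```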
Optionally, the task attribute includes a 3D-class AR task.
The 3D correction processing model corresponding to the 3D-class AR task includes: a 3D correction processing model corresponding to a PWP3D-based 3D correction method type, or a 3D correction processing model corresponding to an edge-information-based 3D correction method type.
The 3D map processing model corresponding to the 3D-class AR task includes: a 3D map processing model corresponding to a map method type that determines map point depth information from the object pose.
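For intuition only: determining a map point's depth from the object pose can be pictured as intersecting a pixel's viewing ray with the posed model surface. The sketch below uses a single plane in camera coordinates as a stand-in for rendering the shared 3D model; the intrinsics and plane values are hypothetical.

```python
import numpy as np

def backproject_depth(pixel, K, plane_normal, plane_d):
    """Map point for a pixel, found by intersecting its viewing ray with a
    model plane n . X = d given in camera coordinates (a stand-in for
    rendering the shared 3D model with the tracked pose)."""
    ray = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    t = plane_d / (plane_normal @ ray)   # ray parameter: X = t * ray
    return t * ray                       # new map point in the camera frame

# Hypothetical intrinsics: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
# Model plane z = 2 in camera coordinates; query the principal-point pixel.
point = backproject_depth((320, 240), K, np.array([0.0, 0.0, 1.0]), 2.0)
```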
Optionally, the task attribute includes a fusion-class AR task.
The fusion map processing model corresponding to the fusion-class AR task includes: a fusion map processing model corresponding to a map method type that determines a 2D-and-3D fused map.
Optionally, the three-thread architecture includes a tracking thread, a correction thread, and a map thread with data sharing; the tracking thread is used to execute the tracking processing model of the target model combination, the correction thread is used to execute the correction processing model of the target model combination, and the map thread is used to execute the map processing model of the target model combination.
Correspondingly, the processing module 33 is configured to: obtain the video frame images with the tracking thread, and synchronize the video frame images to the correction thread; obtain, with the correction thread, an absolute position map with scale information and camera pose information according to the correction processing model of the target model combination, pre-stored prior knowledge information, and the video frame images, and synchronize the absolute position map and the camera pose information to the tracking thread; perform, with the tracking thread, AR initial positioning on the video frame images according to the tracking processing model of the target model combination, the absolute position map, and the camera pose information, obtain the object pose of the video frame images, and synchronize the video frame images and their object pose to the map thread; create or update, with the map thread, a local map according to the map processing model of the target model combination, the video frame images, and the object pose of the video frame images, and synchronize the local map to the tracking thread; and perform, with the tracking thread, AR tracking on the video frame images according to the tracking processing model of the target model combination and the local map.
Optionally, the user request information includes the video frame images.
Correspondingly, the obtaining module 31 is configured to: obtain the video frame images in the user request information; identify the video frame images to obtain an object to be tracked; and determine the AR task attribute according to the shape features of the object to be tracked, where the shape features include 2D object features and/or 3D object features.
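A hypothetical rule sketch of this decision (the mapping from shape features to task attribute is an assumption consistent with the embodiments above, not code from the patent):

```python
def task_attribute(has_2d_features: bool, has_3d_features: bool) -> str:
    """Derive the AR task attribute from the recognized object's shape features."""
    if has_2d_features and has_3d_features:
        return "fusion-class AR task"
    if has_3d_features:
        return "3D-class AR task"
    if has_2d_features:
        return "2D-class AR task"
    raise ValueError("no trackable object features recognized")
```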
Referring to Fig. 5, which is a hardware structural diagram of a device provided by an embodiment of the present invention, the device includes: a processor 51, a memory 52, and a computer program; wherein
the memory 52 is configured to store the computer program; the memory may also be a flash memory. The computer program is, for example, an application program or a functional module implementing the above method.
The processor 51 is configured to execute the computer program stored in the memory, so as to implement the steps executed by the server in the above multi-modal AR processing method. For details, refer to the related description in the foregoing method embodiments.
Optionally, the memory 52 may be independent of, or integrated with, the processor 51.
When the memory 52 is a device independent of the processor 51, the device may further include:
a bus 53 for connecting the memory 52 and the processor 51.
The present invention further provides a readable storage medium storing a computer program, which, when executed by a processor, implements the multi-modal AR processing method provided by the various embodiments above.
The readable storage medium may be a computer storage medium or a communication medium. A communication medium includes any medium that facilitates transfer of a computer program from one place to another. A computer storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer. For example, a readable storage medium is coupled to the processor, so that the processor can read information from, and write information to, the readable storage medium. Of course, the readable storage medium may also be a component of the processor. The processor and the readable storage medium may be located in an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), and the ASIC may be located in user equipment. Of course, the processor and the readable storage medium may also exist as discrete components in a communication device. The readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
The present invention further provides a program product including execution instructions stored in a readable storage medium. At least one processor of a device may read the execution instructions from the readable storage medium, and executing them causes the device to implement the multi-modal AR processing method provided by the various embodiments above.
In the above device embodiments, it should be understood that the processor may be a central processing unit (Central Processing Unit, CPU for short), another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP for short), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in the present invention may be embodied as being executed directly by a hardware processor, or executed by a combination of hardware and software modules in the processor.
Finally, it should be noted that the above embodiments are merely intended to illustrate, rather than limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications may still be made to the technical solutions described in the foregoing embodiments, or equivalent replacements may be made to some or all of the technical features therein, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (11)

1. A multi-modal AR processing method, characterized by comprising:
obtaining video frame images and an AR task attribute according to user request information;
determining a target model combination among multiple preset processing model combinations according to the AR task attribute, wherein the target model combination comprises a preset shared tracking processing model, and a correction processing model and a map processing model corresponding to the task attribute; and
performing, with the target model combination in a three-thread architecture with data sharing, AR processing corresponding to the AR task attribute on the video frame images.
2. The method according to claim 1, wherein the determining a target model combination among multiple preset processing model combinations according to the AR task attribute, wherein the target model combination comprises a preset shared tracking processing model, and a correction processing model and a map processing model corresponding to the task attribute, comprises:
obtaining a correction method type and a map method type for handling the AR task attribute; and
taking, among the multiple preset processing model combinations, the processing model combination corresponding to the correction method type and the map method type as the target model combination.
3. The method according to claim 2, wherein before the determining a target model combination among multiple preset processing model combinations according to the AR task attribute, the method further comprises:
creating and storing a 2D correction processing model and a 2D map processing model for handling 2D-class AR tasks, wherein the 2D correction processing model corresponds to a 2D correction method type and the 2D map processing model corresponds to a 2D map method type;
creating and storing a 3D correction processing model and a 3D map processing model for handling 3D-class AR tasks, wherein the 3D correction processing model corresponds to a 3D correction method type and the 3D map processing model corresponds to a 3D map method type;
creating and storing a fusion map processing model for handling fusion-class AR tasks, wherein the fusion map processing model corresponds to a 2D-and-3D map method type; and
obtaining the multiple processing model combinations, wherein each processing model combination comprises:
the preset shared tracking processing model,
one of the 2D correction processing model and the 3D correction processing model,
and one of the 2D map processing model, the 3D map processing model, and the fusion map processing model.
4. The method according to claim 3, wherein the task attribute comprises a 2D-class AR task;
the 2D correction processing model corresponding to the 2D-class AR task comprises: a 2D correction processing model corresponding to an ORB-feature-based plane correction method type, or a 2D correction processing model corresponding to a SURF-feature-based plane correction method type; and
the 2D map processing model corresponding to the 2D-class AR task comprises: a 2D map processing model corresponding to a map method type that computes planar picture depth.
5. The method according to claim 3, wherein the task attribute comprises a 3D-class AR task;
the 3D correction processing model corresponding to the 3D-class AR task comprises: a 3D correction processing model corresponding to a PWP3D-based 3D correction method type, or a 3D correction processing model corresponding to an edge-information-based 3D correction method type; and
the 3D map processing model corresponding to the 3D-class AR task comprises: a 3D map processing model corresponding to a map method type that determines map point depth information from an object pose.
6. The method according to claim 3, wherein the task attribute comprises a fusion-class AR task; and
the fusion map processing model corresponding to the fusion-class AR task comprises: a fusion map processing model corresponding to a map method type that determines a 2D-and-3D fused map.
7. The method according to any one of claims 1 to 6, wherein the three-thread architecture comprises a tracking thread, a correction thread, and a map thread with data sharing; the tracking thread is used to execute the tracking processing model of the target model combination, the correction thread is used to execute the correction processing model of the target model combination, and the map thread is used to execute the map processing model of the target model combination; and
the performing, with the target model combination in a three-thread architecture with data sharing, AR processing corresponding to the AR task attribute on the video frame images comprises:
obtaining the video frame images with the tracking thread, and synchronizing the video frame images to the correction thread;
obtaining, with the correction thread, an absolute position map with scale information and camera pose information according to the correction processing model of the target model combination, pre-stored prior knowledge information, and the video frame images, and synchronizing the absolute position map and the camera pose information to the tracking thread;
performing, with the tracking thread, AR initial positioning on the video frame images according to the tracking processing model of the target model combination, the absolute position map, and the camera pose information, obtaining an object pose of the video frame images, and synchronizing the video frame images and the object pose of the video frame images to the map thread;
creating or updating, with the map thread, a local map according to the map processing model of the target model combination, the video frame images, and the object pose of the video frame images, and synchronizing the local map to the tracking thread; and
performing, with the tracking thread, AR tracking on the video frame images according to the tracking processing model of the target model combination and the local map.
8. The method according to any one of claims 1 to 6, wherein the user request information comprises the video frame images; and
the obtaining video frame images and an AR task attribute according to user request information comprises:
obtaining the video frame images in the user request information;
identifying the video frame images to obtain an object to be tracked; and
determining the AR task attribute according to shape features of the object to be tracked, wherein the shape features comprise 2D object features and/or 3D object features.
9. A multi-modal AR processing apparatus, characterized by comprising:
an obtaining module, configured to obtain video frame images and an AR task attribute according to user request information;
a combination module, configured to determine a target model combination among multiple preset processing model combinations according to the AR task attribute, wherein the target model combination comprises a preset shared tracking processing model, and a correction processing model and a map processing model corresponding to the task attribute; and
a processing module, configured to perform, with the target model combination in a three-thread architecture with data sharing, AR processing corresponding to the AR task attribute on the video frame images.
10. A device, characterized by comprising: a memory, a processor, and a computer program, wherein the computer program is stored in the memory, and the processor runs the computer program to perform the multi-modal AR processing method according to any one of claims 1 to 8.
11. A readable storage medium, characterized in that a computer program is stored in the readable storage medium, and the computer program, when executed by a processor, implements the multi-modal AR processing method according to any one of claims 1 to 8.
CN201910592876.2A 2019-07-03 2019-07-03 Multi-modal AR processing method, device, equipment and readable storage medium Active CN110322569B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910592876.2A CN110322569B (en) 2019-07-03 2019-07-03 Multi-modal AR processing method, device, equipment and readable storage medium


Publications (2)

Publication Number Publication Date
CN110322569A true CN110322569A (en) 2019-10-11
CN110322569B CN110322569B (en) 2023-03-31

Family

ID=68122396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910592876.2A Active CN110322569B (en) 2019-07-03 2019-07-03 Multi-modal AR processing method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN110322569B (en)

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101639793A (en) * 2009-08-19 2010-02-03 南京邮电大学 Grid load predicting method based on support vector regression machine
US20110311100A1 (en) * 2010-06-22 2011-12-22 Nokia Corporation Method, Apparatus and Computer Program Product for Providing Object Tracking Using Template Switching and Feature Adaptation
CN102915112A (en) * 2011-06-23 2013-02-06 奥美可互动有限责任公司 System and method for close-range movement tracking
CN104050475A (en) * 2014-06-19 2014-09-17 樊晓东 Reality augmenting system and method based on image feature matching
CN106325980A (en) * 2015-06-30 2017-01-11 中国石油化工股份有限公司 Multi-thread concurrent system
CN107667331A (en) * 2015-05-28 2018-02-06 微软技术许可有限责任公司 Shared haptic interaction and user security in the more people's immersive VRs of the communal space
CN107885871A (en) * 2017-11-24 2018-04-06 南京华捷艾米软件科技有限公司 Synchronous superposition method, system, interactive system based on cloud computing
CN108227929A (en) * 2018-01-15 2018-06-29 廖卫东 Augmented reality setting-out system and implementation method based on BIM technology
US20180204061A1 (en) * 2017-01-19 2018-07-19 Samsung Electronics Co., Ltd. Vision intelligence management for electronic devices
US10043076B1 (en) * 2016-08-29 2018-08-07 PerceptIn, Inc. Visual-inertial positional awareness for autonomous and non-autonomous tracking
CN109063774A (en) * 2018-08-03 2018-12-21 百度在线网络技术(北京)有限公司 Picture charge pattern effect evaluation method, device, equipment and readable storage medium storing program for executing
CN109168034A (en) * 2018-08-28 2019-01-08 百度在线网络技术(北京)有限公司 Merchandise news display methods, device, electronic equipment and readable storage medium storing program for executing
CN109189986A (en) * 2018-08-29 2019-01-11 百度在线网络技术(北京)有限公司 Information recommendation method, device, electronic equipment and readable storage medium storing program for executing
CN109584275A (en) * 2018-11-30 2019-04-05 哈尔滨理工大学 A kind of method for tracking target, device, equipment and storage medium
CN109887003A (en) * 2019-01-23 2019-06-14 亮风台(上海)信息科技有限公司 A kind of method and apparatus initialized for carrying out three-dimensional tracking


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Huang Fushan: "Implementation and Optimization of the TLD Object Tracking Algorithm Based on an ARM Processor", China Master's Theses Full-text Database, Information Science and Technology Series *

Also Published As

Publication number Publication date
CN110322569B (en) 2023-03-31

Similar Documents

Publication Publication Date Title
CN109387204B (en) Mobile robot synchronous positioning and composition method facing indoor dynamic environment
US9934612B2 (en) Methods and systems for determining the pose of a camera with respect to at least one object of a real environment
CN107633526A (en) A kind of image trace point acquisition methods and equipment, storage medium
CN109242961A (en) A kind of face modeling method, apparatus, electronic equipment and computer-readable medium
CN103875024B (en) Systems and methods for navigating camera
CN110517319A (en) A kind of method and relevant apparatus that camera posture information is determining
CN110006343A (en) Measurement method, device and the terminal of object geometric parameter
CN109543489A (en) Localization method, device and storage medium based on two dimensional code
WO2023056544A1 (en) Object and camera localization system and localization method for mapping of the real world
CN107329962B (en) Image retrieval database generation method, and method and device for enhancing reality
JP2007042102A (en) Transfer of attribute between geometry-surfaces of arbitrary topology reducing distortion and storing discontinuity
CN110335351B (en) Multi-modal AR processing method, device, system, equipment and readable storage medium
US10733777B2 (en) Annotation generation for an image network
CN110648363A (en) Camera posture determining method and device, storage medium and electronic equipment
CN110275968A (en) Image processing method and device
US11508098B2 (en) Cross-device supervisory computer vision system
EP3959694A1 (en) Perimeter estimation from posed monocular video
CN108961423A (en) Virtual information processing method, device, equipment and storage medium
CN113706373A (en) Model reconstruction method and related device, electronic equipment and storage medium
US6570568B1 (en) System and method for the coordinated simplification of surface and wire-frame descriptions of a geometric model
CN111583381A (en) Rendering method and device of game resource map and electronic equipment
CN114202632A (en) Grid linear structure recovery method and device, electronic equipment and storage medium
CN110706332B (en) Scene reconstruction method based on noise point cloud
CN109040525A (en) Image processing method, device, computer-readable medium and electronic equipment
CN110533777B (en) Three-dimensional face image correction method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant