CN108874360A - Panorama content positioning method and device - Google Patents

Panorama content positioning method and device

Info

Publication number
CN108874360A
Authority
CN
China
Prior art keywords
current page
entity
panorama
operation object
matched
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810679316.6A
Other languages
Chinese (zh)
Other versions
CN108874360B (en)
Inventor
杨茗名
王群
王宇亮
张苗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810679316.6A
Publication of CN108874360A
Application granted
Publication of CN108874360B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F3/04815 Interaction with a metaphor-based environment or interaction object displayed as three-dimensional, e.g. changing the user viewpoint with respect to the environment or object
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/20 Scenes; Scene-specific elements in augmented reality scenes
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

An embodiment of the present invention provides a panorama content positioning method and device. The method includes: performing semantic analysis on input control voice to determine a user demand, where the user demand includes at least one of the operation page, the operation object, and the action type that the user needs to operate on; if the user demand is to operate on the current page of the panorama content, performing image recognition on the current page to find whether the current page contains an entity matching the operation object; and if the current page contains an entity matching the operation object, operating on the matched entity in the current page according to interaction behavior rules and the action type. The embodiment gives the user a more natural and intelligent interactive experience, fills the gap of voice-driven browsing in panoramas, reduces the number of steps the user must take, and meets user demand more accurately.

Description

Panorama content positioning method and device
Technical field
The present invention relates to the technical field of virtual reality, and in particular to a panorama content positioning method and device.
Background art
With the continuous development of VR (Virtual Reality) technology, panorama content can be displayed on more and more devices. In particular, displaying VR panoramas on the web not only enriches the original two-dimensional page content on the web but also gives users a more three-dimensional, immersive experience that is closer to real-life scenes.
On the web, the ways in which users browse VR panorama content, and their limitations, are as follows:
A. Finger sliding or clicking: the user slides a finger over the panorama content to view it, or clicks the link entry of another panorama to open new panorama content.
Limitation: the user must touch the device directly, which is neither convenient nor intelligent. Browsing is also limited to the visible content: if the content the user wants to see is not in the currently visible region, the user has to drag the page repeatedly to reach it and cannot locate it precisely, which increases the number of steps and degrades the user experience.
B. Gyroscope gravity sensing: the gravity sensing function of the device is turned on, and specific panorama content is located by changing the position of the device.
Limitation: the user has to rotate the device through different angles to see a reasonably complete view of the panorama content. In the extreme case, a user who wants to see the part of the panorama behind them has to hold the device and turn around to face backward, which greatly degrades the user experience.
Summary of the invention
The embodiments of the present invention provide a panorama content positioning method and device to solve one or more technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a panorama content positioning method, including:
performing semantic analysis on input control voice to determine a user demand, where the user demand includes at least one of the operation page, the operation object, and the action type that the user needs to operate on;
if the user demand is to operate on the current page of the panorama content, performing image recognition on the current page to find whether the current page contains an entity matching the operation object; and
if the current page contains an entity matching the operation object, operating on the matched entity in the current page according to interaction behavior rules and the action type.
With reference to the first aspect, in a first implementation of the first aspect, the method further includes:
if the user demand is to operate on a scene outside the current page, searching panorama relation data for a matching scene; and
if a matching scene is found, operating on the matched scene according to the interaction behavior rules and the action type.
With reference to the first aspect or the first implementation of the first aspect, in a second implementation of the first aspect, the method further includes:
obtaining an image recognition model by machine learning of the features of different entities according to preset object attribute rules;
where the image recognition model is used to identify each entity in the panorama content and to record the coordinates of each entity in the panorama content.
With reference to the second implementation of the first aspect, in a third implementation of the first aspect, finding whether the current page contains an entity matching the operation object includes:
inputting the two-dimensional image corresponding to the current three-dimensional panorama content into the image recognition model;
searching, by the image recognition model, the attributes of each entity in the current page for the attribute of the operation object; and
if the attribute is present, obtaining the coordinates in the current page of the entity that has that attribute.
With reference to the third implementation of the first aspect, in a fourth implementation of the first aspect, searching, by the image recognition model, the attributes of each entity in the current page for the attribute of the operation object includes:
rebuilding a three-dimensional environment from the two-dimensional image using web-based graphics technology (WebGL); and
searching, by the image recognition model in the three-dimensional environment, the attributes of each entity in the current page for the attribute of the operation object.
With reference to the first aspect or any of the first to fourth implementations of the first aspect, in a fifth implementation of the first aspect, the interaction behavior rules include JSON strings corresponding to the various action types.
In a second aspect, an embodiment of the present invention provides a panorama content positioning device, including:
a speech analysis module, configured to perform semantic analysis on input control voice to determine a user demand, where the user demand includes at least one of the operation page, the operation object, and the action type that the user needs to operate on;
an image recognition module, configured to, if the user demand obtained by the speech analysis module is to operate on the current page of the panorama content, perform image recognition on the current page to find whether the current page contains an entity matching the operation object; and
a page interaction module, configured to, if the current page contains an entity matching the operation object, operate on the matched entity in the current page according to interaction behavior rules and the action type.
With reference to the second aspect, in a first implementation of the second aspect:
the speech analysis module is further configured to, if the user demand is to operate on a scene outside the current page, search panorama relation data for a matching scene; and
the page interaction module is further configured to, if the speech analysis module finds a matching scene, operate on the matched scene according to the interaction behavior rules and the action type.
With reference to the second aspect or the first implementation of the second aspect, in a second implementation of the second aspect, the device further includes:
a machine learning module, configured to obtain an image recognition model by machine learning of the features of different entities according to preset object attribute rules;
where the image recognition model is used to identify each entity in the panorama content and to record the coordinates of each entity in the panorama content.
With reference to the second implementation of the second aspect, in a third implementation of the second aspect, the image recognition module is further configured to:
input the two-dimensional image corresponding to the current three-dimensional panorama content into the image recognition model; search, by the image recognition model, the attributes of each entity in the current page for the attribute of the operation object; and, if the attribute is present, obtain the coordinates in the current page of the entity that has that attribute.
With reference to the third implementation of the second aspect, in a fourth implementation of the second aspect, searching, by the image recognition model, the attributes of each entity in the current page for the attribute of the operation object includes:
rebuilding a three-dimensional environment from the two-dimensional image using web-based graphics technology (WebGL); and
searching, by the image recognition model in the three-dimensional environment, the attributes of each entity in the current page for the attribute of the operation object.
With reference to the second aspect or any of the first to fourth implementations of the second aspect, in a fifth implementation of the second aspect, the interaction behavior rules include JSON strings corresponding to the various action types.
In a third aspect, an embodiment of the present invention provides a panorama content positioning device. The functions of the device may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In one possible design, the structure of the panorama content positioning device includes a processor and a memory. The memory stores a program that supports the panorama content positioning device in executing the above panorama content positioning method, and the processor is configured to execute the program stored in the memory. The panorama content positioning device may further include a communication interface for communication between the panorama content positioning device and other devices or a communication network.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium for storing the computer software instructions used by the panorama content positioning device, including a program for executing the above panorama content positioning method.
One of the above technical solutions has the following advantage or beneficial effect: it provides the user with a more natural and intelligent interactive experience, fills the gap of voice-driven browsing in panoramas, reduces the number of steps the user must take, and meets user demand more accurately.
Another of the above technical solutions has the following advantage or beneficial effect: by training voice and image models with AI (Artificial Intelligence) technology, voice interaction tasks can be processed in batches, and the entities in a 3D scene do not need to be labeled manually.
The above summary is for the purpose of the specification only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features of the present invention will be readily apparent by reference to the drawings and the following detailed description.
Brief description of the drawings
In the accompanying drawings, unless otherwise specified, the same reference numerals denote the same or similar parts or elements throughout the several drawings. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments disclosed according to the present invention and should not be regarded as limiting the scope of the present invention.
Fig. 1 is a flow chart of a panorama content positioning method according to an embodiment of the present invention.
Fig. 2 is a flow chart of a panorama content positioning method according to an embodiment of the present invention.
Fig. 3 is a flow chart of a panorama content positioning method according to an embodiment of the present invention.
Fig. 4 is a block diagram of a panorama content positioning device according to an embodiment of the present invention.
Fig. 5 is an example diagram of a panorama content positioning method according to an embodiment of the present invention.
Fig. 6 is an example diagram of a panorama content positioning method according to an embodiment of the present invention.
Fig. 7 is an example diagram of a panorama content positioning method according to an embodiment of the present invention.
Fig. 8 is a flow chart of a panorama content positioning method according to an embodiment of the present invention.
Fig. 9 is a flow chart of a panorama content positioning method according to an embodiment of the present invention.
Fig. 10 is a structural block diagram of a panorama content positioning device according to an embodiment of the present invention.
Detailed description
Hereinafter, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments may be modified in various ways without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature rather than restrictive.
Fig. 1 is a flow chart of a panorama content positioning method according to an embodiment of the present invention. As shown in Fig. 1, the method includes:
Step S110: performing semantic analysis on input control voice to determine a user demand, where the user demand includes at least one of the operation page, the operation object, and the action type that the user needs to operate on;
Step S120: if the user demand is to operate on the current page of the panorama content, performing image recognition on the current page to find whether the current page contains an entity matching the operation object;
Step S130: if the current page contains an entity matching the operation object, operating on the matched entity in the current page according to interaction behavior rules and the action type.
In this embodiment, the user demand can be determined from the user's voice input by semantic analysis. For example, if the currently displayed panorama content is an image of an office that contains a desk, and the control voice input by the user includes "amplify the desk in front", it can be determined that the user demand is to operate on the current page of the panorama content. As another example, if the currently displayed panorama content is the teaching building of XX university, and the control voice input by the user includes "switch to the school gate of XX university", it can be determined that the user demand is to operate on a scene outside the current page. The user demand may also include, for example, the operation object and the action type, which can be set according to the actual application scenario and are not limited here.
If the user needs to operate on the current page of the panorama content, image recognition is performed on the current page to identify the types, positions, and so on of the various entities that the current page contains. Then, according to the operation object included in the user demand, the current page is searched for an entity matching the operation object. The operation object may be any of the various objects shown in the page, such as animals, plants, articles, and places. After the entity corresponding to the operation object is found, the matched entity can be operated on according to the action type in the user demand and the preset interaction behavior rules. Action types may include amplifying, reducing, switching scenes, checking an object in the content, and so on.
For example, the user inputs the voice "amplify the desk". Semantic analysis determines that the current page of the panorama content needs to be operated on, that the operation object is a desk, and that the action type is amplification. The current page can then be searched for a desk entity, and if a desk is matched, it is amplified according to the interaction behavior rule corresponding to amplification.
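The desk example can be sketched as follows. This is a minimal illustration that assumes the control voice has already been transcribed to text. The keyword table, the `UserDemand` fields, and `parse_control_voice` are hypothetical names, and a simple prefix match stands in for the semantic analysis, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical phrase-to-action table; the patent does not define a vocabulary.
ACTION_KEYWORDS = {
    "amplify": "amplify",
    "reduce": "reduce",
    "switch to": "switch_scene",
    "check": "check",
}


@dataclass
class UserDemand:
    operation_page: str              # "current" or "other" (assumed encoding)
    operation_object: Optional[str]  # e.g. "desk"
    action_type: Optional[str]       # e.g. "amplify"


def parse_control_voice(text: str) -> UserDemand:
    """Derive a user demand from transcribed control voice by prefix matching."""
    lowered = text.lower().strip()
    action = None
    rest = lowered
    for phrase, name in ACTION_KEYWORDS.items():
        if lowered.startswith(phrase):
            action = name
            rest = lowered[len(phrase):].strip()
            break
    # Drop a leading article so that "the desk" becomes "desk".
    for article in ("the ", "a "):
        if rest.startswith(article):
            rest = rest[len(article):]
    # Assumption: only scene switching targets content outside the current page.
    page = "other" if action == "switch_scene" else "current"
    return UserDemand(page, rest or None, action)


demand = parse_control_voice("Amplify the desk")
print(demand)
```

A trained semantic model would replace the prefix match in practice; the sketch only shows the shape of the result that the later steps consume.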
In one possible implementation, as shown in Fig. 2, the method further includes:
Step S140: if the user demand is to operate on a scene outside the current page, searching panorama relation data for a matching scene;
Step S150: if a matching scene is found, operating on the matched scene according to the interaction behavior rules and the action type.
Panorama relation data can be recorded in text form, for example as scenes {gate of XX school, XX school, building}, {teaching building of XX school, XX school, building}, {dining room of XX school, XX school, building}, and so on. The interaction behavior rules store textual descriptions of the panorama relation data. If the two match, an operation instruction is sent to the page interaction module.
For example, if the currently displayed panorama content is the teaching building of XX university, and the control voice input by the user includes "switch to the school gate of XX university", the panorama relation data can be searched for a scene matching the school gate of XX university. If one is found, the panorama content corresponding to that scene is opened.
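The scene lookup can be sketched as follows. The record layout, the `panorama_id` field, and `find_scene` are hypothetical, and a case-insensitive substring match is assumed as the matching rule, since the text does not say how scenes are compared.

```python
from typing import List, Optional

# Hypothetical panorama relation data, modelled on the {scene, owner, category}
# triples in the text; "panorama_id" is an assumed extra field.
PANORAMA_RELATIONS = [
    {"scene": "gate of XX school", "owner": "XX school",
     "category": "building", "panorama_id": "pano-gate"},
    {"scene": "teaching building of XX school", "owner": "XX school",
     "category": "building", "panorama_id": "pano-teaching"},
    {"scene": "dining room of XX school", "owner": "XX school",
     "category": "building", "panorama_id": "pano-dining"},
]


def find_scene(operation_object: str,
               relations: List[dict] = PANORAMA_RELATIONS) -> Optional[dict]:
    """Return the first record whose scene name matches the operation object."""
    query = operation_object.lower().strip()
    if not query:
        return None
    for record in relations:
        name = record["scene"].lower()
        # Assumed rule: either string contains the other.
        if name in query or query in name:
            return record
    return None


hit = find_scene("Gate of XX school")
print(hit["panorama_id"] if hit else "no matching scene")
```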
In one possible implementation, the method further includes:
obtaining an image recognition model by machine learning of the features of different entities according to preset object attribute rules;
where the image recognition model is used to identify each entity in the panorama content and to record the coordinates of each entity in the panorama content.
For example, according to the preset object attribute rules and in combination with AI (Artificial Intelligence) technology, the features of different entities are learned by machine learning, so that the entities in a 3D (three-dimensional) scene, such as sky, ground, rivers, plants, animals, and houses, can be identified and the coordinates of each entity in the panoramic scene can be recorded.
In one possible implementation, as shown in Fig. 3, step S120 includes:
Step S121: inputting the two-dimensional image corresponding to the current three-dimensional panorama content into the image recognition model;
Step S122: searching, by the image recognition model, the attributes of each entity in the current page for the attribute of the operation object;
Step S123: if the attribute is present, obtaining the coordinates in the current page of the entity that has that attribute.
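Steps S122 and S123 can be sketched as follows, assuming the image recognition model has already produced a list of entities with attributes and coordinates. `RecognizedEntity`, its fields, and the sample values are hypothetical; the model itself is not shown.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class RecognizedEntity:
    attributes: FrozenSet[str]       # labels the model assigned to the entity
    coordinate: Tuple[float, float]  # position in the page; units are assumed


def find_entity_coordinate(entities: List[RecognizedEntity],
                           operation_object: str) -> Optional[Tuple[float, float]]:
    """Return the coordinate of the first entity carrying the requested attribute."""
    target = operation_object.lower()
    for entity in entities:
        if target in entity.attributes:
            return entity.coordinate
    return None


# Stand-in for the model output on an office panorama.
page_entities = [
    RecognizedEntity(frozenset({"chair", "furniture"}), (120.0, 300.0)),
    RecognizedEntity(frozenset({"desk", "furniture"}), (410.0, 280.0)),
]
print(find_entity_coordinate(page_entities, "desk"))
```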
In one possible implementation, searching, by the image recognition model, the attributes of each entity in the current page for the attribute of the operation object includes:
rebuilding a three-dimensional environment from the two-dimensional image using web-based graphics technology (WebGL);
searching, by the image recognition model in the three-dimensional environment, the attributes of each entity in the current page for the attribute of the operation object.
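One way to relate a position in the two-dimensional image to the rebuilt three-dimensional environment is sketched below. It assumes the two-dimensional image is an equirectangular projection of the panorama, which the text does not state, and it uses plain Python rather than WebGL; `pixel_to_direction` is a hypothetical helper.

```python
import math
from typing import Tuple


def pixel_to_direction(x: float, y: float,
                       width: int, height: int) -> Tuple[float, float, float]:
    """Map an equirectangular pixel to a unit direction vector (x right, y up, z forward)."""
    longitude = (x / width - 0.5) * 2.0 * math.pi  # -pi at the left edge, +pi at the right
    latitude = (0.5 - y / height) * math.pi        # +pi/2 at the top, -pi/2 at the bottom
    return (
        math.cos(latitude) * math.sin(longitude),
        math.sin(latitude),
        math.cos(latitude) * math.cos(longitude),
    )


# The image centre looks straight ahead along +z.
print(pixel_to_direction(1024, 512, 2048, 1024))
```

With such a mapping, an entity coordinate recorded in the image could be turned into a viewing direction so that the page can rotate the camera toward the entity.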
In one possible implementation, the interaction behavior rules may include JSON (JavaScript Object Notation) strings corresponding to the various action types.
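A possible shape for such rules is sketched below. The field names (`action`, `target`, `scale`, `target_id`) and `build_command` are hypothetical, since the text says only that each action type has a corresponding JSON string.

```python
import json
from typing import Optional

# Hypothetical interaction behavior rules: one JSON string per action type.
INTERACTION_RULES = {
    "amplify": '{"action": "amplify", "target": "entity", "scale": 2.0}',
    "reduce": '{"action": "reduce", "target": "entity", "scale": 0.5}',
    "switch_scene": '{"action": "switch_scene", "target": "scene"}',
    "check": '{"action": "check", "target": "entity"}',
}


def build_command(action_type: str, target_id: str) -> Optional[dict]:
    """Parse the rule for an action type and bind it to a concrete target."""
    rule_text = INTERACTION_RULES.get(action_type)
    if rule_text is None:
        return None  # the action type is outside the edited range of rules
    command = json.loads(rule_text)
    command["target_id"] = target_id
    return command


print(build_command("amplify", "desk-1"))
```

Keeping the rules as data lets the range of supported voice interactions be edited without changing the code that executes them.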
Fig. 4 is a block diagram of a panorama content positioning device according to an embodiment of the present invention. The device includes:
a speech analysis module 41, configured to perform semantic analysis on input control voice to determine a user demand, where the user demand includes at least one of the operation page, the operation object, and the action type that the user needs to operate on;
an image recognition module 43, configured to, if the user demand obtained by the speech analysis module is to operate on the current page of the panorama content, perform image recognition on the current page to find whether the current page contains an entity matching the operation object;
a page interaction module 45, configured to, if the current page contains an entity matching the operation object, operate on the matched entity in the current page according to interaction behavior rules and the action type.
In one possible implementation:
the speech analysis module is further configured to, if the user demand is to operate on a scene outside the current page, search panorama relation data for a matching scene;
the page interaction module is further configured to, if the speech analysis module finds a matching scene, operate on the matched scene according to the interaction behavior rules and the action type.
In one possible implementation, the device further includes:
a machine learning module, configured to obtain an image recognition model by machine learning of the features of different entities according to preset object attribute rules;
where the image recognition model is used to identify each entity in the panorama content and to record the coordinates of each entity in the panorama content.
In one possible implementation, the image recognition module is further configured to:
input the two-dimensional image corresponding to the current three-dimensional panorama content into the image recognition model; search, by the image recognition model, the attributes of each entity in the current page for the attribute of the operation object; and, if the attribute is present, obtain the coordinates in the current page of the entity that has that attribute.
In one possible implementation, searching, by the image recognition model, the attributes of each entity in the current page for the attribute of the operation object includes:
rebuilding a three-dimensional environment from the two-dimensional image using web-based graphics technology (WebGL);
searching, by the image recognition model in the three-dimensional environment, the attributes of each entity in the current page for the attribute of the operation object.
In one possible implementation, the interaction behavior rules include JSON strings corresponding to the various action types.
For the functions of the modules in the devices of the embodiments of the present invention, refer to the corresponding description of the above method; they are not repeated here.
In one application example, an application scenario of the web panorama content positioning method based on voice interaction is as follows. While browsing a panorama page, the user clicks the VR mode icon and is prompted to grant the device's voice permission, as shown in Fig. 5. After the user grants the voice permission, a voice prompt is displayed to guide the user through voice interaction, as shown in Fig. 6. While the user is speaking, the page displays the content of the voice input in real time, as shown in Fig. 7. After the input is complete, the corresponding operation is executed, for example jumping to the next panorama page or viewing a part of the panorama content that is not visible on the page.
Taking the above application scenario as an example, as shown in Fig. 8 and Fig. 9, the principle by which multiple modules implement the panorama content positioning method of the embodiment of the present invention includes:
1. The two-dimensional panorama image data is input to the image recognition module, and the panorama relation data and interaction behavior rules are input to the speech analysis module.
2. The image recognition module uses WebGL (Web Graphics Library) technology to rebuild the 3D environment. According to the preset object attribute rules, and in combination with AI image recognition technology, it learns the features of different entities by machine learning. The image recognition module can identify the entities in the 3D scene, such as sky, ground, rivers, plants, animals, and houses, and record the coordinates of each entity in the panoramic scene.
3. The speech analysis module parses the user's voice input and performs semantic analysis. According to the preset interaction behavior rules, it determines the user's demand, which falls into two main classes: operations within the current panorama content, and operations outside the current panorama content.
If the demand is of the first class, the image recognition module performs matching in the current page of the panorama content. If the entity that the user needs to operate on is matched, an operation instruction for that entity is returned to the page interaction module.
If the demand is of the second class, the speech analysis module searches the panorama relation data for the other panorama content that the user needs to operate on. If there is a hit, an operation instruction for that other panorama content is returned to the page interaction module.
4. The page interaction module operates on the current page according to the instruction returned by the speech analysis module and the interaction behavior rules.
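The routing between the two classes of demand in steps 3 and 4 can be sketched as follows. The function `route_demand`, the instruction fields, and the two matcher callables are hypothetical stand-ins for the image recognition module and the panorama relation lookup.

```python
from typing import Callable, Optional


def route_demand(operation_page: str,
                 operation_object: str,
                 action_type: str,
                 match_entity: Callable[[str], Optional[str]],
                 match_scene: Callable[[str], Optional[str]]) -> Optional[dict]:
    """Build the operation instruction that would go to the page interaction module."""
    if operation_page == "current":
        # First class: match an entity in the current page via image recognition.
        target = match_entity(operation_object)
        kind = "entity"
    else:
        # Second class: look up other panorama content in the relation data.
        target = match_scene(operation_object)
        kind = "scene"
    if target is None:
        return None  # nothing matched, so no instruction is issued
    return {"action": action_type, "kind": kind, "target": target}


# Stub matchers standing in for the real modules.
entities = {"desk": "entity-7"}
scenes = {"school gate": "pano-gate"}
print(route_demand("current", "desk", "amplify", entities.get, scenes.get))
print(route_demand("other", "school gate", "switch_scene", entities.get, scenes.get))
```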
Through the above modules, and by drawing on machine learning and AI, voice interaction behavior can be processed in batches, providing the user with a more intelligent, convenient, and accurate panorama content positioning method.
In the embodiments of the present invention, the interaction behavior rules can be edited in advance to define the range of voice interaction, for example by editing the JSON strings corresponding to action types such as amplifying, reducing, switching scenes, and checking. In addition, the voice and image models are continuously trained with AI technology to identify the user demand and the entity information in the 3D scene, which improves accuracy.
Fig. 10 shows a structural block diagram of a panorama content positioning device according to an embodiment of the present invention. As shown in Fig. 10, the device includes a memory 910 and a processor 920, and the memory 910 stores a computer program that can run on the processor 920. When executing the computer program, the processor 920 implements the panorama content positioning method of the above embodiments. There may be one or more memories 910 and one or more processors 920.
The device further includes:
a communication interface 930, configured to communicate with external devices for data exchange.
The memory 910 may include high-speed RAM and may also include non-volatile memory, for example at least one magnetic disk storage.
If the memory 910, the processor 920, and the communication interface 930 are implemented independently, they can be connected to one another by a bus and communicate with one another over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in Fig. 10, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 910, the processor 920, and the communication interface 930 are integrated on one chip, they can communicate with one another through an internal interface.
An embodiment of the present invention provides a computer-readable storage medium that stores a computer program which, when executed by a processor, implements any of the methods in the above embodiments.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality of" means two or more, unless specifically and clearly defined otherwise.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a specific logical function or process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order, depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, such an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following technologies known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, each unit may exist physically on its own, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various changes or substitutions within the technical scope disclosed by the present invention, and these shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (14)

1. A panorama content positioning method, characterized by comprising:
performing semantic analysis on an input control voice to determine a user requirement, the user requirement including at least one of an operation page, an operation object, and an operation type that the user needs to operate;
if the user requirement is to operate on the current page of the panorama content, performing image recognition on the current page to find whether the current page contains an entity matching the operation object;
if the current page contains an entity matching the operation object, operating the matched entity in the current page according to interaction behavior rules and the operation type.
2. The method according to claim 1, characterized by further comprising:
if the user requirement is to operate on a scene other than the current page, finding whether there is a matching scene according to panorama relation data;
if a matching scene is found, operating the matched scene according to the interaction behavior rules and the operation type.
3. The method according to claim 1 or 2, characterized by further comprising:
learning the features of different entities by machine learning according to preset object attribute rules, to obtain an image recognition model;
wherein the image recognition model is used to identify each entity included in the panorama content and to record the coordinates of each entity in the panorama content.
4. The method according to claim 3, characterized in that finding whether the current page contains an entity matching the operation object comprises:
inputting the two-dimensional image corresponding to the current three-dimensional panorama content into the image recognition model;
searching, by the image recognition model, whether an attribute of the operation object exists among the attributes of the entities of the current page;
if so, obtaining the coordinates, in the current page, of the entity corresponding to the existing attribute.
5. The method according to claim 4, characterized in that searching, by the image recognition model, whether an attribute of the operation object exists among the attributes of the entities of the current page comprises:
reconstructing a three-dimensional environment from the two-dimensional image using a web-based graphics language technology;
searching, by the image recognition model in the three-dimensional environment, whether an attribute of the operation object exists among the attributes of the entities of the current page.
6. The method according to any one of claims 1 to 5, characterized in that the interaction behavior rules include JSON strings corresponding to various operation types.
7. A panorama content positioning device, characterized by comprising:
a speech analysis module, configured to perform semantic analysis on an input control voice to determine a user requirement, the user requirement including at least one of an operation page, an operation object, and an operation type that the user needs to operate;
an image recognition module, configured to, if the user requirement obtained by the speech analysis module is to operate on the current page of the panorama content, perform image recognition on the current page to find whether the current page contains an entity matching the operation object;
a page interaction module, configured to, if the current page contains an entity matching the operation object, operate the matched entity in the current page according to interaction behavior rules and the operation type.
8. The device according to claim 7, characterized in that:
the speech analysis module is further configured to, if the user requirement is to operate on a scene other than the current page, find whether there is a matching scene according to panorama relation data;
the page interaction module is further configured to, if the speech analysis module finds a matching scene, operate the matched scene according to the interaction behavior rules and the operation type.
9. The device according to claim 7 or 8, characterized by further comprising:
a machine learning module, configured to learn the features of different entities by machine learning according to preset object attribute rules, to obtain an image recognition model;
wherein the image recognition model is used to identify each entity included in the panorama content and to record the coordinates of each entity in the panorama content.
10. The device according to claim 9, characterized in that the image recognition module is further configured to:
input the two-dimensional image corresponding to the current three-dimensional panorama content into the image recognition model; search, by the image recognition model, whether an attribute of the operation object exists among the attributes of the entities of the current page; and, if so, obtain the coordinates, in the current page, of the entity corresponding to the existing attribute.
11. The device according to claim 10, characterized in that searching, by the image recognition model, whether an attribute of the operation object exists among the attributes of the entities of the current page comprises:
reconstructing a three-dimensional environment from the two-dimensional image using a web-based graphics language technology;
searching, by the image recognition model in the three-dimensional environment, whether an attribute of the operation object exists among the attributes of the entities of the current page.
12. The device according to any one of claims 7 to 11, characterized in that the interaction behavior rules include JSON strings corresponding to various operation types.
13. A panorama content positioning device, characterized in that the device comprises:
one or more processors;
a storage device, configured to store one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 6.
14. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 6.
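The control flow recited in claims 1, 2, and 4 can be sketched roughly as follows. This is a hedged illustration under stated assumptions, not the patented implementation: the `Requirement` and `Entity` types, the pre-recognized entity list that stands in for the output of the image recognition model, and the dictionary that stands in for the panorama relation data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Requirement:
    """Result of semantic analysis; all field names are assumptions."""
    page: str              # "current", or the name of another scene
    operation_object: str  # e.g. "sofa"
    operation_type: str    # e.g. "zoom_in"


@dataclass
class Entity:
    """An entity recognized on the current page, with its recorded coordinates."""
    attributes: List[str]
    coordinates: Tuple[float, float]


def find_entity(entities: List[Entity], operation_object: str) -> Optional[Entity]:
    """Return the first entity whose attributes include the operation object."""
    for entity in entities:
        if operation_object in entity.attributes:
            return entity
    return None


def locate(req: Requirement,
           page_entities: List[Entity],
           scene_relations: Dict[str, str]) -> Optional[Dict[str, object]]:
    """Resolve a requirement to an entity on the current page or to another scene."""
    if req.page == "current":
        entity = find_entity(page_entities, req.operation_object)
        if entity is None:
            return None  # no matching entity on the current page
        return {"type": req.operation_type, "at": entity.coordinates}
    # The requirement targets a scene other than the current page.
    scene_id = scene_relations.get(req.page)
    if scene_id is None:
        return None  # no matching scene in the relation data
    return {"type": req.operation_type, "scene": scene_id}
```

In this sketch the returned dictionary would then be combined with the interaction behavior rule for the operation type; how that combination is performed is left open here because the claims do not specify it.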
CN201810679316.6A 2018-06-27 2018-06-27 Panoramic content positioning method and device Active CN108874360B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810679316.6A CN108874360B (en) 2018-06-27 2018-06-27 Panoramic content positioning method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810679316.6A CN108874360B (en) 2018-06-27 2018-06-27 Panoramic content positioning method and device

Publications (2)

Publication Number Publication Date
CN108874360A true CN108874360A (en) 2018-11-23
CN108874360B CN108874360B (en) 2023-04-07

Family

ID=64295221

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810679316.6A Active CN108874360B (en) 2018-06-27 2018-06-27 Panoramic content positioning method and device

Country Status (1)

Country Link
CN (1) CN108874360B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103136545A (en) * 2011-11-22 2013-06-05 中国科学院电子学研究所 High resolution remote sensing image analysis tree automatic extraction method based on space consistency
US20150269420A1 (en) * 2014-03-19 2015-09-24 Qualcomm Incorporated Method and Apparatus for Establishing Connection Between Electronic Devices
CN105590627A (en) * 2014-11-12 2016-05-18 三星电子株式会社 Image display apparatus, method for driving same, and computer readable recording medium
CN105979035A (en) * 2016-06-28 2016-09-28 广东欧珀移动通信有限公司 AR image processing method and device as well as intelligent terminal
CN106033435A (en) * 2015-03-13 2016-10-19 北京贝虎机器人技术有限公司 Article identification method and apparatus, and indoor map generation method and apparatus
US20160306606A1 (en) * 2015-04-14 2016-10-20 Hon Hai Precision Industry Co., Ltd. Audio control system and control method thereof
CN107608652A (en) * 2017-08-28 2018-01-19 三星电子(中国)研发中心 A kind of method and apparatus of Voice command graphical interfaces
CN107632814A (en) * 2017-09-25 2018-01-26 珠海格力电器股份有限公司 Player method, device and system, storage medium, the processor of audio-frequency information
CN107977183A (en) * 2017-11-16 2018-05-01 百度在线网络技术(北京)有限公司 voice interactive method, device and equipment

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020107813A1 (en) * 2018-11-30 2020-06-04 北京市商汤科技开发有限公司 Method and apparatus for positioning descriptive statement in image, electronic device and storage medium
TWI728564B (en) * 2018-11-30 2021-05-21 大陸商北京市商湯科技開發有限公司 Method, device and electronic equipment for image description statement positioning and storage medium thereof
US11455788B2 (en) 2018-11-30 2022-09-27 Beijing Sensetime Technology Development Co., Ltd. Method and apparatus for positioning description statement in image, electronic device, and storage medium

Also Published As

Publication number Publication date
CN108874360B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
Noh Imagining library 4.0: Creating a model for future libraries
CN110020411B (en) Image-text content generation method and equipment
AU2020311360A1 (en) Enhancing tangible content on physical activity surface
JP6381002B2 (en) Search recommendation method and apparatus
CN104573099B (en) The searching method and device of topic
CN109635077A (en) Calculation method, device, electronic equipment and the storage medium of text similarity
CN109478185A (en) Map notes
CN108491421A (en) A kind of method, apparatus, equipment and computer storage media generating question and answer
CN106557554B (en) The display methods and device of search result based on artificial intelligence
CN108829371A (en) interface control method, device, storage medium and electronic equipment
CN108388650A (en) Need-based search processing method, device and smart machine
CN109710845A (en) Information recommended method, device, computer equipment and readable storage medium storing program for executing
JP2006065754A (en) Information processor, information processing method, and program
Wellner From cellphones to machine learning. A shift in the role of the user in algorithmic writing
CN109582882A (en) Search result shows method, apparatus and electronic equipment
JP6832322B2 (en) Search device, search method, search program and recording medium
CN102930048A (en) Data abundance automatically found by semanteme and using reference and visual data
CN106294481A (en) A kind of air navigation aid based on collection of illustrative plates and device
CN108614872A (en) Course content methods of exhibiting and device
CN111125550B (en) Point-of-interest classification method, device, equipment and storage medium
CN108874360A (en) Panorama content positioning method and device
CN109657043A (en) Automatically generate the method, apparatus, equipment and storage medium of article
CN110362688A (en) Examination question mask method, device, equipment and computer readable storage medium
CN111723177B (en) Modeling method and device of information extraction model and electronic equipment
CN111241236B (en) Task-oriented question-answering method, system, electronic device and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant