CN113178019B - Indication information identification method, system and storage medium based on video content - Google Patents

Indication information identification method, system and storage medium based on video content

Publication number
CN113178019B
Authority
CN
China
Prior art keywords
video
axis
freedom
constraint
degree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110642375.8A
Other languages
Chinese (zh)
Other versions
CN113178019A (en)
Inventor
徐异凌
管云峰
柳宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN202110642375.8A priority Critical patent/CN113178019B/en
Publication of CN113178019A publication Critical patent/CN113178019A/en
Application granted granted Critical
Publication of CN113178019B publication Critical patent/CN113178019B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/006 Mixed reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00 Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01 Indexing scheme relating to G06F3/01
    • G06F2203/012 Walk-in-place systems for allowing a user to walk in a virtual environment while constraining him to a given position in the physical environment

Abstract

The invention provides an indication information identification method based on video content, comprising the following steps: a video type judging step: judging whether a video is a three-degree-of-freedom-plus (3DoF+) video; and a constraint establishing step: establishing constraints on the 3DoF+ video. The invention can constrain 3DoF+ video and identify the 3DoF+ content of video media; the identification information indicates the specific 3DoF+ constraint information of that part of the video. In specific applications, such as virtual navigation and virtual stadiums, the identification information provided by the invention can further be used for processing and presentation by client applications or services.

Description

Indication information identification method, system and storage medium based on video content
Technical Field
The invention relates to the technical field of virtual reality, in particular to an indication information identification method based on video content, and more particularly to an indication information identification method, system and storage medium, based on six degrees of freedom, for specific application scenarios during video presentation and consumption.
Background
With the rapid development of virtual reality (VR) technology, demand for VR systems is increasing. Technically, the field is progressing from three degrees of freedom (3DoF) to three degrees of freedom plus (3DoF+) and then to six degrees of freedom (6DoF). 6DoF tracking makes interaction in the virtual world possible: by allowing a user to move within the VR space, it creates an immersive experience.
3DoF supports three rotations of the user's head: yaw, roll and pitch, as shown in fig. 1. On top of 3DoF, 3DoF+ supports small-range translational motion of the user's head along the X, Y and Z axes, that is, in the six directions up, down, left, right, front and back, as shown in fig. 2. 6DoF supports the same three rotations (yaw, roll and pitch) as 3DoF and also tracks the user's translation along the X, Y and Z axes.
There are currently three main definitions of 6DoF: window 6DoF, omnidirectional 6DoF and full 6DoF, as shown in figs. 3, 4 and 5. For window 6DoF, the user's head can rotate within a restricted range of yaw and pitch and an unrestricted range of roll, and the user can translate within a restricted range in the forward X direction and without restriction in the other five directions. For omnidirectional 6DoF, the user's head can rotate without restriction in yaw, roll and pitch, and the user can translate within a limited range in the six directions (up, down, left, right, front and back) along the X, Y and Z axes. For full 6DoF, the user's head can rotate without restriction in yaw, roll and pitch, and the user can translate without restriction in the six directions along the X, Y and Z axes.
To meet the requirements of different application scenarios, video content needs to be additionally labeled with the 3DoF+ or 6DoF information it belongs to, so that this specific information can be further indicated.
Disclosure of Invention
In view of the defects in the prior art, the object of the invention is to provide an indication information identification method and system based on video content.
The indication information identification method based on video content provided by the invention comprises the following steps:
a video type judging step: judging whether the video is a three-degree-of-freedom-plus (3DoF+) video; and a constraint establishing step: establishing constraints on the 3DoF+ video;
further, establishing constraints on the 3DoF+ video means establishing constraints on the head movement of the viewer of the 3DoF+ video;
further, the constraints are established by setting a maximum value and a minimum value on each of the x-axis, the y-axis and the z-axis;
further, the fields that set the maximum and minimum values on the x-axis, the y-axis and the z-axis are HXmax, HXmin, HYmax, HYmin, HZmax and HZmin; these fields represent depth information of a virtual scene, and the virtual scene is a scene presented in virtual reality (VR).
The invention also provides an indication information identification system based on video content, which comprises:
a video type judging module: judging whether the video is a 3DoF+ video; and a constraint establishing module: establishing constraints on the 3DoF+ video;
further, establishing constraints on the 3DoF+ video means establishing constraints on the head movement of the viewer of the 3DoF+ video;
further, the constraints are established by setting a maximum value and a minimum value on each of the x-axis, the y-axis and the z-axis;
further, the fields that set the maximum and minimum values on the x-axis, the y-axis and the z-axis are HXmax, HXmin, HYmax, HYmin, HZmax and HZmin; these fields represent depth information of a virtual scene, and the virtual scene is a scene presented in virtual reality (VR).
Compared with the prior art, the invention has the following beneficial effects:
1. The invention can classify and constrain videos such as 3DoF+ and 6DoF videos, and identify the 3DoF+, 6DoF and similar content of video media; the identification information indicates the specific 3DoF+ or 6DoF type information and constraint information of that part of the video.
2. In specific applications, such as virtual navigation and virtual stadiums, the identification information provided by the invention can further be used for processing and presentation by client applications or services.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a schematic view of 3DoF degrees of freedom;
FIG. 2 is a schematic diagram of 3DoF+ degrees of freedom;
FIG. 3 is a schematic diagram of window 6DoF degrees of freedom;
FIG. 4 is a schematic view of an omnidirectional 6DoF degree of freedom;
FIG. 5 is a schematic view of a full 6DoF degree of freedom;
FIG. 6 is a logic flow diagram of a method for indicating information identification based on video content;
FIG. 7 is an organization of VR information in a preferred embodiment;
FIG. 8 is a logic flow diagram of a method for identifying indication information based on video content in response to video source and various video constraints.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will help those skilled in the art to further understand the invention, but do not limit the invention in any way. It should be noted that those skilled in the art can make variations and modifications without departing from the concept of the invention, all of which fall within the scope of protection of the invention.
In the description of the present invention, it is to be understood that the terms "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", etc. indicate orientations or positional relationships based on those shown in the drawings. They are used only for convenience and simplicity of description and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they are therefore not to be construed as limiting the present invention.
As shown in fig. 6, the indication information identification method based on video content provided by the present invention includes the following steps: a video storage step: storing the video on a server side; an analysis and typing step: analyzing the video and setting a video parent type according to the analysis result; and an attribute setting step: setting video attributes according to the video parent type.
In the analysis and typing step, the video is analyzed according to the following factors: whether the video content changes with translation; and whether the video content changes with rotation. The video parent type is one of the following: non-panoramic video; three-degree-of-freedom video; three-degree-of-freedom-plus video; six-degree-of-freedom video. In the attribute setting step: the attribute of a non-panoramic video is set to common; the attribute of a three-degree-of-freedom video is set to 3DoF; the attribute of a three-degree-of-freedom-plus video is set to 3DoF+; the attribute of a six-degree-of-freedom video is set to 6DoF.
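The analysis and attribute setting steps can be illustrated with a minimal Python sketch. The description names only two analysis factors (change with rotation, change with translation), which by themselves do not separate 3DoF+ from 6DoF; the sketch therefore assumes a hypothetical three-level translation indicator (`TranslationSupport`). That enum, the function name `classify_parent_type` and the string return values are illustrative assumptions, not part of the patent.

```python
from enum import Enum


class TranslationSupport(Enum):
    """Hypothetical indicator of how the content responds to translation."""
    NONE = "none"            # content does not change with translation
    HEAD_ONLY = "head_only"  # content changes with small head translations
    BODY = "body"            # content changes with body (foot) translation


def classify_parent_type(changes_with_rotation: bool,
                         translation: TranslationSupport) -> str:
    """Return the attribute string for the video parent type.

    Assumption: translation support is checked first, and rotation is only
    used to separate non-panoramic ("common") video from 3DoF video.
    """
    if translation is TranslationSupport.BODY:
        return "6DoF"
    if translation is TranslationSupport.HEAD_ONLY:
        return "3DoF+"
    return "3DoF" if changes_with_rotation else "common"


if __name__ == "__main__":
    print(classify_parent_type(True, TranslationSupport.HEAD_ONLY))
```

A server-side analyzer could call this once per stored video and write the returned attribute into the video's metadata.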
The indication information identification method based on video content further comprises a six-degree-of-freedom subtype setting step: setting a video subtype of a six-degree-of-freedom video, the subtype being one of the following: window six-degree-of-freedom video; omnidirectional six-degree-of-freedom video; full six-degree-of-freedom video. In the label setting step: label 0x01 is set for a window six-degree-of-freedom video; label 0x02 is set for an omnidirectional six-degree-of-freedom video; label 0x00 is set for a full six-degree-of-freedom video. Preferably, the method further comprises a constraint establishing step: for any of the video types, establishing any one or more of the following constraints: a viewer head rotation constraint, a viewer head translation constraint, and a viewer body translation constraint. In practical applications, the viewer body translation constraint corresponds to a constraint on the viewer's foot movement.
Correspondingly, the invention also provides an indication information identification system based on video content, which comprises the following modules: a video storage module: storing the video on a server side; an analysis and typing module: analyzing the video and setting a video parent type according to the analysis result; and an attribute setting module: setting video attributes according to the video parent type.
In the analysis and typing module, the video is analyzed according to the following factors: whether the video content changes with translation; and whether the video content changes with rotation. The video parent type is one of the following: non-panoramic video; three-degree-of-freedom video; three-degree-of-freedom-plus video; six-degree-of-freedom video. In the attribute setting module: the attribute of a non-panoramic video is set to common; the attribute of a three-degree-of-freedom video is set to 3DoF; the attribute of a three-degree-of-freedom-plus video is set to 3DoF+; the attribute of a six-degree-of-freedom video is set to 6DoF.
The indication information identification system based on video content further comprises a six-degree-of-freedom subtype setting module: setting a video subtype of a six-degree-of-freedom video, the subtype being one of the following: window six-degree-of-freedom video; omnidirectional six-degree-of-freedom video; full six-degree-of-freedom video. In the label setting module: label 0x01 is set for a window six-degree-of-freedom video; label 0x02 is set for an omnidirectional six-degree-of-freedom video; label 0x00 is set for a full six-degree-of-freedom video. Preferably, the system further comprises a constraint establishing module: for any of the video types, establishing any one or more of the following constraints: a viewer head rotation constraint, a viewer head translation constraint, and a viewer body translation constraint. In practical applications, the viewer body translation constraint corresponds to a constraint on the viewer's foot movement.
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above-described method for identifying indication information based on video content.
Preferred embodiment:
The invention is applied in a specific protocol. Taking MMT transmission signaling in the OMAF standard as an example, the following fields can be added as required:
6DoF_type: indicates the label type of the 6DoF to which the video content of the area belongs; its values and meanings are shown in the following table.
Value        Description
0x00         Full 6DoF
0x01         Window 6DoF
0x02         Omnidirectional 6DoF
0x03~0xFF    Reserved
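As an illustration of how a client might interpret the 6DoF_type values in the table, the following Python sketch maps the three defined values to an enum and treats 0x03 to 0xFF as reserved. The names `SixDoFType` and `parse_six_dof_type` are hypothetical, and treating the field as a single byte is an assumption inferred from the 0x00 to 0xFF value range.

```python
from enum import IntEnum
from typing import Optional


class SixDoFType(IntEnum):
    """Defined 6DoF_type label values from the table above."""
    FULL = 0x00
    WINDOW = 0x01
    OMNIDIRECTIONAL = 0x02


def parse_six_dof_type(value: int) -> Optional[SixDoFType]:
    """Map a raw 6DoF_type value to a subtype.

    Returns None for reserved values (0x03 to 0xFF) and raises ValueError
    for values outside the assumed one-byte range.
    """
    if not 0x00 <= value <= 0xFF:
        raise ValueError(f"6DoF_type out of range: {value:#x}")
    try:
        return SixDoFType(value)
    except ValueError:
        return None  # reserved value, no defined meaning yet


if __name__ == "__main__":
    print(parse_six_dof_type(0x01))
```

Returning `None` for reserved values, rather than raising, lets a client skip labels it does not understand while still rejecting clearly malformed input.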
The constraints added to different video types may be as follows:
3DoF+ {
    HXmax
    HXmin
    HYmax
    HYmin
    HZmax
    HZmin
}
Window 6DoF {
    Xmax
    Yawmax
    Yawmin
    Pitchmax
    Pitchmin
}
Omnidirectional 6DoF {
    Xmax
    Xmin
    Ymax
    Ymin
    Zmax
    Zmin
}
HXmax, HXmin, HYmax, HYmin, HZmax and HZmin indicate, respectively, the maximum and minimum values of the viewer's head movement on the x-axis, the maximum and minimum values of head movement on the y-axis, and the maximum and minimum values of head movement on the z-axis;
Xmax, Xmin, Ymax, Ymin, Zmax and Zmin indicate, respectively, the maximum and minimum values of the viewer's foot movement on the x-axis, the maximum and minimum values of foot movement on the y-axis, and the maximum and minimum values of foot movement on the z-axis;
Yawmax, Yawmin, Pitchmax and Pitchmin indicate, respectively, the maximum and minimum values of the viewer's head rotation in yaw and the maximum and minimum values of head rotation in pitch.
Based on the above information, fig. 7 gives the organization structure of the VR Asset Information descriptor in MMT transmission signaling in the OMAF standard for this information.
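The 3DoF+ head translation constraint defined by HXmax, HXmin, HYmax, HYmin, HZmax and HZmin can be sketched in Python as a small data structure with a clamping helper. The class name `HeadTranslationConstraint`, the use of floating-point values and the clamping behaviour are assumptions made for illustration; the patent does not specify field widths, units or how a client enforces the limits.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HeadTranslationConstraint:
    """Per-axis limits on viewer head movement for 3DoF+ content."""
    hx_min: float
    hx_max: float
    hy_min: float
    hy_max: float
    hz_min: float
    hz_max: float

    def __post_init__(self) -> None:
        # Each minimum must not exceed the matching maximum.
        pairs = ((self.hx_min, self.hx_max),
                 (self.hy_min, self.hy_max),
                 (self.hz_min, self.hz_max))
        for low, high in pairs:
            if low > high:
                raise ValueError("minimum exceeds maximum")

    def clamp(self, x: float, y: float, z: float) -> Tuple[float, float, float]:
        """Limit a head position to the allowed range on each axis."""
        return (min(max(x, self.hx_min), self.hx_max),
                min(max(y, self.hy_min), self.hy_max),
                min(max(z, self.hz_min), self.hz_max))


if __name__ == "__main__":
    limits = HeadTranslationConstraint(-0.1, 0.1, -0.2, 0.2, -0.3, 0.3)
    print(limits.clamp(0.5, 0.0, -1.0))
```

The window 6DoF and omnidirectional 6DoF constraint groups could be modelled the same way, with yaw and pitch limits or foot movement limits in place of the head movement fields.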
Application example: virtual stadium
In panoramic video applications, although a 360-degree viewing range is covered, the user's control over switching the viewing direction is limited: the user is usually restricted to consuming prepared 360-degree panoramic content, and the view does not change as the user moves. The indication information provided by the invention can identify the labels associated with 6DoF video content, so that, in combination with user preferences, an immersive experience is provided through a 3D display screen and a surround sound system: as the user moves, the video is positioned to the corresponding viewing angle and that part of the video is presented to the user.
Specifically, multiple sensors are positioned to capture multiple views of a live event: cameras can be fixed around the perimeter of the stadium or mounted on tracks that allow them to move to capture multiple views of the scene, and overhead footage can even be taken from a helicopter. When the user consumes the panoramic video, the media content is presented by locating the content labeled with a 6DoF_type of 0x00 and applying the corresponding 6DoF restrictions on the three rotations (yaw, roll and pitch) and on translation along the X, Y and Z axes.
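The lookup described above, locating content labeled with a 6DoF_type of 0x00, can be sketched in Python as a simple filter. The `VideoAsset` record, its field names and the function `select_full_6dof` are hypothetical; the patent does not define how a client stores or indexes labeled content.

```python
from dataclasses import dataclass
from typing import List, Optional

FULL_6DOF = 0x00  # 6DoF_type value for full 6DoF


@dataclass
class VideoAsset:
    """Hypothetical client-side record for one labeled piece of content."""
    asset_id: str
    attribute: str                # "common", "3DoF", "3DoF+" or "6DoF"
    six_dof_type: Optional[int]   # set only when attribute is "6DoF"


def select_full_6dof(assets: List[VideoAsset]) -> List[VideoAsset]:
    """Keep only the assets labeled as full 6DoF content."""
    return [a for a in assets
            if a.attribute == "6DoF" and a.six_dof_type == FULL_6DOF]


if __name__ == "__main__":
    catalogue = [
        VideoAsset("stadium-overview", "6DoF", 0x00),
        VideoAsset("stadium-window", "6DoF", 0x01),
        VideoAsset("replay", "3DoF", None),
    ]
    print([a.asset_id for a in select_full_6dof(catalogue)])
```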
In summary, many video media applications pay increasing attention to user immersion and to interaction between the user and the environment and among users. For example, in virtual navigation, setting a window 6DoF with a limited forward X-axis range for the user gives a better audio-visual experience than traditional VR video; in a virtual stadium, unrestricted 6DoF technology lets the user experience the same game from another, empty stadium, providing a degree of immersion that traditional three degrees of freedom cannot achieve.
It is known to those skilled in the art that, in addition to implementing the system, apparatus and modules provided by the invention purely as computer-readable program code, the same functions can be realized by logically programming the method steps so that the system, apparatus and modules are implemented in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system, apparatus and modules provided by the invention can be regarded as hardware components, and the modules they contain for realizing various functions can be regarded as structures within the hardware components; modules for realizing various functions can also be regarded both as software programs implementing the method and as structures within the hardware components.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims (2)

1. An indication information identification method based on video content, characterized by comprising the following steps:
a video type judging step: judging whether the video is a three-degree-of-freedom-plus video;
a constraint establishing step: establishing constraints on the three-degree-of-freedom-plus video;
wherein establishing constraints on the three-degree-of-freedom-plus video means establishing constraints on the head movement of the viewer of the three-degree-of-freedom-plus video;
the constraints are established by setting a maximum value and a minimum value on each of the x-axis, the y-axis and the z-axis;
the fields that set the maximum and minimum values on the x-axis, the y-axis and the z-axis are HXmax, HXmin, HYmax, HYmin, HZmax and HZmin, and these fields represent depth information of a virtual scene; the virtual scene is a scene presented in virtual reality (VR).
2. An indication information identification system based on video content, comprising:
a video type judging module: judging whether the video is a three-degree-of-freedom-plus video;
a constraint establishing module: establishing constraints on the three-degree-of-freedom-plus video;
wherein establishing constraints on the three-degree-of-freedom-plus video means establishing constraints on the head movement of the viewer of the three-degree-of-freedom-plus video;
the constraints are established by setting a maximum value and a minimum value on each of the x-axis, the y-axis and the z-axis;
the fields that set the maximum and minimum values on the x-axis, the y-axis and the z-axis are HXmax, HXmin, HYmax, HYmin, HZmax and HZmin, and these fields represent depth information of a virtual scene; the virtual scene is a scene presented in virtual reality (VR).
CN202110642375.8A 2018-07-09 2018-07-20 Indication information identification method, system and storage medium based on video content Active CN113178019B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110642375.8A CN113178019B (en) 2018-07-09 2018-07-20 Indication information identification method, system and storage medium based on video content

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201810745689 2018-07-09
CN202110642375.8A CN113178019B (en) 2018-07-09 2018-07-20 Indication information identification method, system and storage medium based on video content
CN201810806586.9A CN110706355B (en) 2018-07-09 2018-07-20 Indication information identification method, system and storage medium based on video content

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201810806586.9A Division CN110706355B (en) 2018-07-09 2018-07-20 Indication information identification method, system and storage medium based on video content

Publications (2)

Publication Number Publication Date
CN113178019A CN113178019A (en) 2021-07-27
CN113178019B true CN113178019B (en) 2023-01-03

Family

ID=69192666

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201810806586.9A Active CN110706355B (en) 2018-07-09 2018-07-20 Indication information identification method, system and storage medium based on video content
CN202110642375.8A Active CN113178019B (en) 2018-07-09 2018-07-20 Indication information identification method, system and storage medium based on video content

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201810806586.9A Active CN110706355B (en) 2018-07-09 2018-07-20 Indication information identification method, system and storage medium based on video content

Country Status (1)

Country Link
CN (2) CN110706355B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113542907B (en) * 2020-04-16 2022-09-23 上海交通大学 Multimedia data transceiving method, system, processor and player
CN116248642A (en) * 2020-10-14 2023-06-09 腾讯科技(深圳)有限公司 Media file encapsulation method, media file decapsulation method and related equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201713885D0 (en) * 2017-08-30 2017-10-11 Nokia Technologies Oy Moving between spatially limited video content and omnidirectional video content
CN107368192A (en) * 2017-07-18 2017-11-21 歌尔科技有限公司 The outdoor scene observation procedure and VR glasses of VR glasses
CN107918482A (en) * 2016-10-08 2018-04-17 天津锋时互动科技有限公司深圳分公司 The method and system of overstimulation is avoided in immersion VR systems

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101521655B1 (en) * 2007-10-13 2015-05-20 삼성전자주식회사 Apparatus and method for providing stereoscopic three-dimension image/video contents on terminal based on Lightweight Application Scene Representation
CN101616334A (en) * 2008-08-21 2009-12-30 青岛海信电器股份有限公司 The display packing of vision signal and device
CN101763877B (en) * 2009-12-23 2012-11-14 无锡中星微电子有限公司 Method and device for rapidly verifying chip by multimedia player
KR102545195B1 (en) * 2016-09-12 2023-06-19 삼성전자주식회사 Method and apparatus for delivering and playbacking content in virtual reality system
CN112770178A (en) * 2016-12-14 2021-05-07 上海交通大学 Panoramic video transmission method, panoramic video receiving method, panoramic video transmission system and panoramic video receiving system
CN108154533A (en) * 2017-12-08 2018-06-12 北京奇艺世纪科技有限公司 A kind of position and attitude determines method, apparatus and electronic equipment

Also Published As

Publication number Publication date
CN110706355A (en) 2020-01-17
CN110706355B (en) 2021-05-11
CN113178019A (en) 2021-07-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant