WO2022111005A1 - Virtual reality device and VR scene image recognition method - Google Patents

Virtual reality device and VR scene image recognition method

Info

Publication number
WO2022111005A1
WO2022111005A1 · PCT/CN2021/119318 · CN2021119318W
Authority
WO
WIPO (PCT)
Prior art keywords
image
recognition
recognized
result
source
Prior art date
Application number
PCT/CN2021/119318
Other languages
English (en)
French (fr)
Inventor
孟亚洲 (Meng Yazhou)
Original Assignee
海信视像科技股份有限公司 (Hisense Visual Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 海信视像科技股份有限公司 (Hisense Visual Technology Co., Ltd.)
Publication of WO2022111005A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00: Manipulating 3D models or images for computer graphics
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/20: Scenes; Scene-specific elements in augmented reality scenes

Definitions

  • the present application relates to the technical field of virtual reality devices, and in particular, to a virtual reality device and a method for recognizing a VR scene image.
  • Virtual Reality (VR) technology is a display technology that simulates a virtual environment through a computer, thereby giving people a sense of immersion in the environment.
  • a virtual reality device is a device that uses virtual reality display technology to present virtual images to users to achieve a sense of immersion.
  • a virtual reality device includes two display screens for presenting virtual picture content, corresponding to the left and right eyes of the user respectively. When the contents displayed on the two display screens come from images of the same object from different viewing angles, a three-dimensional viewing experience can be brought to the user.
  • image recognition can be performed on the content displayed by the virtual reality device, for example, through image analysis, locating portraits, special objects, etc. in the image.
  • the virtual reality device can take a screenshot of the displayed content, and perform an image recognition program on the obtained screenshot image.
  • because the virtual reality device compensates for the distortion effect of its optical components, the content displayed on the screen is deformed and deviates significantly from the actual pattern; moreover, the degree of deformation differs across film source types, so the image recognition result cannot be displayed correctly.
  • the virtual reality device includes a display and a controller, wherein the display is configured to display a user interface, and the controller is configured to perform the following program steps:
  • the recognition result is displayed in the user interface according to the source type of the image to be recognized.
  • the first aspect of the present application also provides a method for recognizing a VR scene image, which is applied to a virtual reality device, and the method includes:
  • the recognition result is displayed in the user interface according to the source type of the image to be recognized.
  • the present application also provides a virtual reality device, including: a display, a communicator, and a controller.
  • the display is configured to display a user interface;
  • the communicator is configured to connect to a server;
  • the controller is configured to perform the following program steps:
  • the recognition result is displayed in the user interface according to the source type of the image to be recognized.
  • the VR scene image recognition method provided by the second aspect of the present application is applied to the virtual reality device, and the method includes:
  • the recognition result is displayed in the user interface according to the source type of the image to be recognized.
  • FIG. 1 is a schematic structural diagram of a display system including a virtual reality device in an embodiment of the application
  • FIG. 2 is a schematic diagram of a VR scene global interface in an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a recommended content area of a global interface in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an application shortcut operation entry area of a global interface in an embodiment of the present application
  • FIG. 5 is a schematic diagram of a floating object of a global interface in an embodiment of the present application.
  • FIG. 6a is a schematic diagram of a VR screen in an embodiment of the present application.
  • FIG. 6b is a schematic diagram of a person identification result in an embodiment of the present application.
  • FIG. 6c is a schematic diagram of a building identification result in an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of a VR scene image recognition method according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of an initial state of a VR scene in an embodiment of the present application.
  • FIG. 9 is a schematic diagram of the effect of displaying a picture in an embodiment of the present application.
  • FIG. 10 is a schematic diagram of the effect of displaying a recognition result in an embodiment of the application.
  • FIG. 11 is a schematic flowchart of generating a recognition result according to a film source type in an embodiment of the present application.
  • FIG. 12 is a schematic diagram of an initial display state of a 3D film source in an embodiment of the present application.
  • FIG. 13 is a schematic diagram showing a 3D source identification result in an embodiment of the present application.
  • FIG. 14 is a schematic diagram of an initial display state of a 360-degree panorama source in an embodiment of the present application.
  • FIG. 15 is a schematic diagram showing a 360 panorama image source identification result in an embodiment of the application.
  • FIG. 16 is a schematic diagram of the coordinates of the recognition result in an embodiment of the application.
  • FIG. 17 is a schematic diagram of the coordinate mapping state of the recognition result in the embodiment of the present application.
  • FIG. 18 is a schematic flowchart of another VR scene image recognition method according to an embodiment of the present application.
  • the term "module" refers to any known or later-developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code capable of performing the function associated with that element.
  • the virtual reality device 500 generally refers to a display device that can be worn on the user's face to provide an immersive experience, including but not limited to VR glasses, augmented reality (AR) devices, VR game devices, mobile computing devices, and other wearable computers.
  • the virtual reality device 500 can run independently, or be connected to other smart display devices as an external device, where the display device can be a smart TV, a computer, a tablet computer, a server, or the like.
  • the virtual reality device 500 can display a media image to provide a close-up image for the user's eyes, so as to bring an immersive experience.
  • the virtual reality device 500 may include a number of components for display and face wear.
  • the virtual reality device 500 may include components such as a casing, temples, an optical system, a display component, a posture detection circuit, and an interface circuit.
  • the optical system, the display component, the posture detection circuit, and the interface circuit can be arranged in the casing to present a specific display screen; temples are connected to both sides of the casing so that the device can be worn on the user's face.
  • when in use, the posture detection circuit has built-in posture detection elements such as a gravitational acceleration sensor and a gyroscope. When the user's head moves or rotates, the user's posture can be detected, and the detected posture data can be transmitted to a processing element such as the controller, enabling the processing element to adjust the specific screen content in the display component according to the detected posture data.
  • for different types of virtual reality devices, the manner in which the specific screen content is presented also differs.
  • for VR glasses, the built-in controller generally does not directly participate in the control process of the displayed content, but sends the posture data to an external device, such as a computer, for processing; the external device determines the specific screen content to be displayed and sends it back to the VR glasses, which then display the final screen.
  • the virtual reality device 500 shown can be connected to the display device 200, and a network-based display system is constructed between the virtual reality device 500, the display device 200 and the server 400 in real time.
  • the display device 200 may acquire media asset data from the server 400 and play it, and transmit the specific screen content to the virtual reality device 500 for display.
  • the display device 200 may be a liquid crystal display, an OLED display, or a projection display device.
  • the specific display device type, size and resolution are not limited. Those skilled in the art can understand that the display device 200 can make some changes in performance and configuration as required.
  • the display device 200 may provide a broadcast TV receiving function, and may additionally provide a smart network TV function with computer support, including but not limited to network TV, smart TV, and Internet Protocol TV (IPTV).
  • the display device 200 and the virtual reality device 500 also perform data communication with the server 400 through various communication methods.
  • the display device 200 and the virtual reality device 500 may be communicatively connected through a local area network (LAN), a wireless local area network (WLAN), or other networks.
  • the server 400 may provide various contents and interactions to the display device 200 .
  • the display device 200 interacts by sending and receiving information and electronic program guide (EPG) data, receiving software program updates, or accessing a remotely stored digital media library.
  • the server 400 may be one cluster or multiple clusters, and may include one or more types of servers. Other network service content, such as video-on-demand and advertising services, is provided through the server 400.
  • the user can operate the display device 200 through the mobile terminal 100A and the remote controller 100B.
  • the mobile terminal 100A and the remote controller 100B may communicate with the display device 200 in a direct wireless connection manner, or may communicate in an indirect connection manner. That is, in some embodiments, the mobile terminal 100A and the remote control 100B may communicate with the display device 200 through a direct connection such as Bluetooth, infrared, or the like.
  • the mobile terminal 100A and the remote controller 100B can directly send the control command data to the display device 200 via Bluetooth or infrared.
  • the mobile terminal 100A and the remote control 100B may also access the same wireless network as the display device 200 through a wireless router, so as to establish an indirect connection and communicate with the display device 200 through the wireless network.
  • the mobile terminal 100A and the remote controller 100B may first send the control command data to the wireless router, and then forward the control command data to the display device 200 through the wireless router.
  • the user can also use the mobile terminal 100A and the remote controller 100B to directly interact with the virtual reality device 500.
  • the mobile terminal 100A and the remote controller 100B can be used as handles in the virtual reality scene to realize functions such as somatosensory interaction.
  • the display component of the virtual reality device 500 includes a display screen and a driving circuit related to the display screen.
  • the display component may include two display screens, corresponding to the user's left eye and right eye respectively.
  • when displaying a 3D film source, the screen contents displayed on the left and right screens differ slightly, respectively presenting the images captured by the left and right cameras of the 3D source during shooting. Because the user's left and right eyes observe different screen content, a display picture with a strong three-dimensional effect can be observed when wearing the device.
  • the optical system in the virtual reality device 500 is an optical module composed of multiple lenses.
  • the optical system is set between the user's eyes and the display screen, and can increase the optical path through the refraction of light by the lenses and the polarization effect of polarizers on the lenses, so that the content displayed by the display component can be clearly presented in the user's field of vision.
  • the optical system also supports focusing, that is, adjusting the position of one or more of the multiple lenses through the focusing component changes the mutual distance between the lenses and thus the optical path, thereby adjusting picture sharpness.
  • the interface circuit of the virtual reality device 500 can be used to transmit interactive data.
  • the virtual reality device 500 can also be connected to other display devices or peripherals through the interface circuit to exchange data with the connected devices, so as to achieve more complex functions.
  • the virtual reality device 500 may be connected to a display device through an interface circuit, so as to output the displayed picture to the display device in real time for display.
  • the virtual reality device 500 may also be connected to a handle through an interface circuit, and the handle may be operated by the user by hand, so as to perform related operations in the VR user interface.
  • the VR user interface can be presented as a variety of different types of UI layouts according to user operations.
  • the user interface may include a global interface. The global UI after the AR/VR terminal is started is shown in FIG. 2, and the global UI can be displayed on the display screen of the AR/VR terminal or on the display of the display device.
  • the global UI may include a recommended content area 1, a business classification extension area 2, an application shortcut operation entry area 3, and a floating object area 4.
  • recommended content area 1 is used to configure TAB columns of different categories. The columns can be configured with media resources, topics, and the like; the media resources can include services with media content such as 2D film and television, educational courses, travel, 3D, 360-degree panorama, live broadcast, 4K film and television, program applications, and games. A column can use different template styles and can support simultaneous recommendation and arrangement of media resources and themes, as shown in Figure 3.
  • the business classification extension area 2 supports the configuration of extended classifications of different classifications. If there is a new business type, you can configure an independent TAB to display the corresponding page content.
  • the expansion classification in the business classification expansion area 2 can also be sorted and adjusted and offline business operations can be performed.
  • the business classification extension area 2 can include content such as film and television, education, travel, applications, and "mine" (personal).
  • the service classification extension area 2 is configured to display a large service classification TAB, and supports configuration of more classifications, and its icons support configuration, as shown in FIG. 3 .
  • the application shortcut operation entry area 3 can designate pre-installed applications to be displayed first for operation recommendation, and supports configuring special icon styles to replace default icons, and multiple pre-installed applications can be designated.
  • the application shortcut operation entry area 3 further includes a leftward movement control and a rightward movement control for moving the option target, for selecting different icons, as shown in FIG. 4 .
  • the floating object area 4 can be configured above the left oblique side or the right oblique side of the fixed area, can be configured as a replaceable image, or can be configured with a jump link. For example, after receiving a confirmation operation, the floating object jumps to an application or displays a specified function page, as shown in FIG. 5. In some embodiments, floating objects may also not be configured with jump links and simply be used for image display.
  • the global UI further includes a status bar at the top for displaying time, network connection status, battery status, and more shortcut operation entries.
  • for example, the search icon will display the text "Search" together with the original icon.
  • clicking the search icon will jump to the global search page; clicking the favorites icon will jump to the favorites TAB; clicking the history icon will jump to the history page; and clicking the message icon will jump to the message page.
  • the interaction can be performed through peripheral devices. For example, the handle of the AR/VR terminal can operate the user interface of the AR/VR terminal, including: the back button; the home button, whose long press can realize the reset function; the volume up/down buttons; and the touch area, which can realize the functions of clicking, sliding, pressing, and dragging the focus.
  • the user can enter different scene interfaces through the global interface.
  • the user can enter the browsing interface through the "Browse Interface" entry in the global interface, or start the browsing interface by selecting any media asset in the global interface.
  • the virtual reality device 500 can create a 3D scene through the Unity 3D engine, and render specific screen content in the 3D scene.
  • the virtual reality device 500 can display the operation UI content in the browsing interface.
  • a list UI may also be displayed in front of the display panel in the Unity 3D scene, and the list UI may display icons of media assets currently stored locally by the virtual reality device 500, or display icons of network media assets that can be played in the virtual reality device 500.
  • the user can select any icon in the list UI, and the selected media asset can be displayed in real time in the display panel.
  • the virtual reality device 500 may also perform image recognition on the displayed screen content, identify a specific image from the displayed screen, and mark it. For example, targets such as people, buildings, and key markers can be identified in the displayed picture and their locations marked. While displaying the picture, the virtual reality device 500 also displays the mark of the target, for example, framing an identified person with an identification frame.
  • the media assets that can be displayed in the Unity 3D scene can be in various forms such as pictures and videos, and, due to the display characteristics of the VR scene, the media assets displayed in the Unity 3D scene at least include 2D pictures or videos, 3D pictures or videos and 360 panoramic pictures or videos.
  • the 2D picture or video is a traditional picture or video file. When displayed, the same image can be displayed on the two display screens of the virtual reality device 500.
  • the 2D picture or video is collectively referred to as a 2D film source.
  • 3D pictures or videos, that is, 3D film sources, are made by at least two cameras shooting the same object from different angles, and can display different images on the two display screens of the virtual reality device 500;
  • 360 panorama pictures or videos, that is, 360 panorama film sources, are 360-degree panoramic images obtained by panoramic cameras or special shooting methods.
  • the pictures can be displayed by creating a display sphere in the Unity 3D scene.
  • for a 2D or 3D film source, a recognition frame that directly marks the result can be displayed on the display panel; for a 360 panorama film source, which must be displayed on a spherical surface where a recognition frame cannot be drawn directly, a pointer can be used to mark the position of the recognition result.
  • identification results can also be marked in other ways, such as geometric shapes such as indicator lines, circles, ellipses, triangles, and diamonds, or display effects such as highlight display and color transformation.
  • some prompt text can also be used to explain the recognition result. For example, as shown in Figure 6b, when a person image is recognized, information such as the gender and age of the recognized person can be displayed near the recognition frame; as shown in Figure 6c, when a building target is recognized, information such as the name of the identified building can be displayed near the recognition frame, improving the user's actual viewing experience.
  • the VR scene image recognition method provided in some embodiments of the present application can be applied to the virtual reality device 500 .
  • the method includes the following:
  • the user inputs a control instruction for starting image recognition to the virtual reality device 500, so that the virtual reality device 500 recognizes the image after receiving the control instruction, and displays the image recognition result.
  • the display of the image recognition result can be used as an auxiliary display function of the virtual reality device 500 when displaying the media asset screen. Therefore, users can choose whether to enable the function of displaying the recognition results in real time according to their needs. For example, the user can enable the "AI" function in the setting interface, then the virtual reality device 500 will perform image recognition in real time while displaying the media asset screen content, and display the image recognition result in the media asset screen content.
  • a control instruction for starting image recognition, that is, the control command, can be input by the user after controlling the focus cursor in the user interface to move to any picture icon by means of a remote control or a somatosensory handle, and then clicking the confirmation key or the play key.
  • if the user has not turned on the auxiliary display function, then when the user selects the switch button in the browsing interface and clicks the confirmation key to turn on the auxiliary display function, the user has input a control instruction for starting image recognition.
  • the control instruction can also be input in other ways, for example, the user can use a voice system, an external smart terminal and other devices.
  • the virtual reality device 500 may start to perform image recognition according to the control instruction. Since both the way image recognition is performed and the way its results are displayed differ for different film source types, the film source type of the image to be recognized can be detected before image recognition is performed, wherein the film source types include at least 2D film sources, 3D film sources, and 360 panoramic film sources.
  • after receiving the control instruction, the controller can extract information such as the classification, format, extension, and file description of the displayed media resource to determine the film source type of the currently displayed media resource. For example, for a network resource presented in the user interface, the film source type of the media resource can be indicated in the file description when the media resource is shared.
  • the source type of the currently displayed media resource can also be determined in combination with the specific picture content.
  • for example, if the extension of the image file of the displayed media resource is ".jpg", the left and right halves of the picture can be compared: if they differ, it can be determined that the film source type of the image currently to be recognized is a 2D film source; if the similarity of the two halves is relatively high, it can be determined that the film source type of the image currently to be recognized is a 3D film source.
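The half-similarity comparison described above can be sketched as follows. This is a minimal illustrative heuristic, not the patent's actual implementation; the function name, the grayscale row-list input format, and the 0.8 threshold are all assumptions:

```python
def guess_film_source_type(pixels, threshold=0.8):
    """Guess 2D vs side-by-side 3D by correlating the left and right halves
    of a frame. `pixels` is a list of rows of grayscale values (assumption)."""
    half = len(pixels[0]) // 2
    left = [v for row in pixels for v in row[:half]]
    right = [v for row in pixels for v in row[half:2 * half]]
    ml = sum(left) / len(left)
    mr = sum(right) / len(right)
    # Normalised cross-correlation between the two halves.
    num = sum((a - ml) * (b - mr) for a, b in zip(left, right))
    den = (sum((a - ml) ** 2 for a in left)
           * sum((b - mr) ** 2 for b in right)) ** 0.5
    similarity = num / den if den else 0.0
    # Near-identical halves suggest a side-by-side 3D film source.
    return "3D" if similarity >= threshold else "2D"
```

A frame whose two halves are nearly identical yields a correlation close to 1 and is classified as 3D; in practice this content check would be combined with the extension and file-description checks, and the threshold would need tuning.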
  • the controller may perform image recognition on the image to be recognized according to the specific recognition method of the type of image, so as to generate a recognition result of the image to be recognized.
  • the specific image recognition manner is not limited in this embodiment.
  • a recognition model may be used for image recognition, that is, the image to be recognized may be input into the recognition model, and the recognition model may output the recognition result.
  • Different identification methods can also be selected according to specific user requirements and application scenarios, thereby obtaining different identification results.
  • different types of recognition models can be used. After detecting the film source type of the image to be recognized, the image can be input into the recognition model according to the input method corresponding to that film source type, and the model can process the image through a preset image recognition algorithm to obtain the recognition result.
  • for example, a scene recognition model can be built into the application: the user wearing the virtual reality device 500 can browse different scenes while the image recognition algorithm identifies specific targets in the scene, so that the name, description, and other related information of a scenic spot can be marked at its location.
  • the virtual reality device 500 may display the recognition result in the user interface.
  • the identification results of different film source types can be displayed in different ways. For example, as shown in Figure 10, for an image to be recognized from a 2D or 3D film source, the recognized image can be displayed on the display panel in the Unity 3D scene, with a recognition frame displayed on it that frames the recognized target. For a 360 panorama film source, marker points can be located on the display sphere in the Unity 3D scene and marked with guide lines.
  • the VR scene image recognition method provided by the above embodiments can detect the film source type of the image to be recognized after obtaining the image recognition control instruction input by the user, generate the recognition result according to an image recognition algorithm, and display the recognition result in the user interface according to the film source type.
  • the method can adopt different coordinate mapping methods according to different film sources, so as to correctly display the recognition result in the user interface, and solve the problem that the traditional virtual reality device 500 cannot accurately display the recognition result.
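For the 360 panorama case, one possible coordinate mapping is the standard equirectangular-to-sphere conversion, which takes a recognition result's pixel position in the panorama and yields the direction on the display sphere where a marker point could be placed. The axis convention and function name below are illustrative assumptions, not the patent's specified mapping:

```python
import math

def panorama_pixel_to_direction(x, y, width, height):
    """Map pixel (x, y) in an equirectangular 360 panorama to a unit
    direction on the display sphere. Assumed convention: x spans longitude
    -180..180 degrees, y spans latitude 90..-90 degrees."""
    lon = (x / width - 0.5) * 2.0 * math.pi   # longitude in radians
    lat = (0.5 - y / height) * math.pi        # latitude in radians
    dx = math.cos(lat) * math.sin(lon)
    dy = math.sin(lat)
    dz = math.cos(lat) * math.cos(lon)
    return (dx, dy, dz)
```

A recognition result centered in the panorama maps to the sphere's forward direction; a marker point and guide line could then be anchored along that direction.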
  • the step of generating the recognition result of the to-be-recognized image further includes:
  • if the film source type of the image to be recognized is the first type, extracting the original image of the film source as the image to be recognized;
  • if the film source type of the image to be recognized is the second type, extracting the half-side image corresponding to the left monitor or the right monitor in the film source image as the image to be recognized;
  • performing image recognition on the half-side image of the film source image to generate a recognition result.
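The extraction steps above amount to a simple preprocessing branch, sketched here under stated assumptions: the type labels and the row-list frame format are hypothetical stand-ins for the patent's internal representation:

```python
def extract_image_to_recognize(frame, source_type):
    """Select the image actually sent to recognition, per film source type.
    `frame` is a list of pixel rows; type labels are illustrative."""
    if source_type in ("2D", "360"):            # first type: single image
        return frame                             # use the original image as-is
    if source_type == "3D_left_right":           # second type: side-by-side 3D
        half = len(frame[0]) // 2
        return [row[:half] for row in frame]     # keep the left-monitor half
    if source_type == "3D_top_bottom":           # second type: over-under 3D
        return frame[: len(frame) // 2]          # keep the upper half
    raise ValueError(f"unknown film source type: {source_type}")
```

Either half could be used for the second type; the left/upper half is chosen here arbitrarily.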
  • the to-be-recognized image may be preprocessed according to the source type of the to-be-recognized image.
  • the film source types may include a first-type film source and a second-type film source.
  • the first type refers to film source types whose content picture includes only a single image, including but not limited to 2D film sources and 360 panoramic film sources; the second type refers to film source types whose content includes two or more images, including but not limited to 3D film sources.
  • when the film source type of the image to be recognized is the first type, such as a 2D film source or a 360 panoramic film source, the original image of the image to be recognized can be directly input into the recognition model for processing to generate a recognition result.
  • when the film source type of the image to be recognized is the second type, such as a 3D film source, the image to be recognized can be cut and separated, and the half-side image corresponding to the left monitor or the right monitor in the film source image can be extracted and input into the recognition model for recognition to generate a recognition result.
  • the original 2D picture to be displayed can be obtained and displayed on a designated panel in the Unity3D scene.
  • the Android layer can identify the original image by inputting the original image into the recognition model through a recognition request.
  • the Android layer is a system layer used to transfer data and instructions between various software layers.
  • the layers parallel to the Android layer in the virtual reality device may further include an application layer and a framework layer, and the application layer is configured to present specific algorithms and directly present screen contents.
  • the recognition model can be integrated in the application layer; through data interaction between the framework layer and the system layer, the image is obtained from the system layer and recognized, and the recognition result is generated and fed back to the system layer.
  • the left and right images can be displayed on the designated panels in the Unity 3D scene, and the Android layer can input the left half of the original image into the recognition model through a recognition request for image recognition.
  • the image content of some 3D film sources is arranged side by side, that is, a frame includes left and right halves: the left half is the content displayed on the left display and the right half the content displayed on the right display. The left or right half of the source image can then be extracted as the image to be recognized.
  • the image content of other 3D film sources is arranged top-and-bottom, that is, a frame includes upper and lower halves: the upper half is the content displayed on the left display and the lower half the content displayed on the right display. The upper or lower half of the source image is then extracted as the image to be recognized.
  • the image content of some 3D film sources is arranged in a hybrid (interleaved) way: a frame is not divided into fixed regions; instead, the content for the left display and the content for the right display are mixed together, for example in alternating pixel columns, where one column belongs to the left display, the next column to the right display, and the alternating columns together make up one frame.
  • for such hybrid 3D source images, the content for the left and right displays can be separated by pixel recombination before image recognition, yielding a left image and a right image, one of which is used as the image to be recognized.
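The pre-processing described above can be sketched as a small helper. This is a hypothetical illustration — the function name and layout labels are assumptions, not from the source — that extracts one eye's view from a side-by-side, top-bottom, or column-interleaved 3D frame represented as a 2D list of pixel rows:

```python
def extract_view(frame, layout, view="left"):
    """Extract one display's image from a 3D source frame.

    frame:  2D list of pixel rows (any pixel type).
    layout: "side_by_side", "top_bottom", or "column_interleaved".
    view:   "left" or "right".
    """
    h = len(frame)
    w = len(frame[0])
    if layout == "side_by_side":
        half = w // 2
        # Left half of every row for the left display, right half otherwise.
        return [row[:half] for row in frame] if view == "left" else [row[half:] for row in frame]
    if layout == "top_bottom":
        half = h // 2
        # Upper rows for the left display, lower rows for the right display.
        return frame[:half] if view == "left" else frame[half:]
    if layout == "column_interleaved":
        # Pixel recombination: even columns -> left display, odd -> right.
        start = 0 if view == "left" else 1
        return [row[start::2] for row in frame]
    raise ValueError("unknown layout: %s" % layout)
```

For an interleaved frame, `row[0::2]` keeps the even columns and `row[1::2]` the odd columns, which is exactly the pixel-recombination step described above.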
  • different recognition models can be called according to the film source type; that is, the step of generating the recognition result of the to-be-recognized image further includes: calling a recognition model according to the film source type of the image to be recognized; inputting the image to be recognized into the called recognition model; and obtaining the recognition result output by the recognition model.
  • the recognition model can be pre-built according to different types of film sources, and the specific model building method is not limited in this application, and can be obtained by model training or by building an image analyzer.
  • the constructed recognition model may be stored in the memory of the virtual reality device 500 or the display device that performs image recognition processing for the controller to call.
  • the controller may call the recognition model according to the source type of the image to be recognized, input the image to be recognized cut in the above embodiment into the called recognition model, and perform recognition processing on the image to be recognized through the recognition model.
  • after the recognition model processes the picture, it can output the recognition result; that is, the controller obtains the result output by the recognition model. Since different recognition models are constructed for different film source types, the model can be matched to the film source type of the current image to be recognized, yielding a more accurate recognition result.
  • different recognition models can also be called according to different application scenarios to obtain different recognition results.
  • the controller can also judge the current application scenario to determine the recognition model group to be called; the group can include at least three recognition models serving the current scenario, used respectively for image recognition of images to be recognized from 2D film sources, 3D film sources and 360 panorama film sources. An appropriate recognition model is then selected from the group according to the film source type of the image to be recognized.
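The two-stage selection — application scenario first, then film source type — could be sketched as follows. The scenario names and model identifiers below are placeholders, not part of the source:

```python
MODEL_GROUPS = {
    # scenario -> film-source type -> model identifier (all names hypothetical)
    "virtual_tour": {"2d": "tour_2d", "3d": "tour_3d", "panorama_360": "tour_360"},
    "cinema": {"2d": "cinema_2d", "3d": "cinema_3d", "panorama_360": "cinema_360"},
}

def select_model(scenario, source_type):
    """Pick the model group for the current scenario, then the model
    matching the film-source type of the image to be recognized."""
    group = MODEL_GROUPS.get(scenario)
    if group is None:
        raise ValueError("unknown scenario: %s" % scenario)
    model = group.get(source_type)
    if model is None:
        raise ValueError("no model for source type: %s" % source_type)
    return model
```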
  • for different recognition models, the output recognition results are also different. For example, for a recognition model obtained by training, the output recognition result is the classification probability of each region of the image for each specific class.
  • the recognition result may include a result mark and the position of the result mark relative to the to-be-recognized image. For to-be-recognized images of the 2D and 3D film source types, the result mark is a recognition frame, and its position includes the coordinates of the upper-left and lower-right corners of the frame; as shown in Figure 14 and Figure 15, for a to-be-recognized image of the 360 panorama source type, the result mark is a recognition indicator point, and its position is the coordinate of that point.
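A hypothetical container for such a result mark (the function and field names are assumptions, not from the source) might distinguish the two forms like this:

```python
def make_result_mark(source_type, *coords):
    """Build a result mark in one of the two forms described above.

    2D/3D sources      -> recognition frame with top-left and bottom-right corners.
    360 panorama source -> a single recognition indicator point.
    """
    if source_type in ("2d", "3d"):
        (x1, y1), (x2, y2) = coords
        return {"mark": "frame", "top_left": (x1, y1), "bottom_right": (x2, y2)}
    if source_type == "panorama_360":
        (px, py), = coords
        return {"mark": "point", "position": (px, py)}
    raise ValueError("unknown source type: %s" % source_type)
```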
  • the recognition frame needs to be displayed on a plane, while the recognition indicator point can be displayed on a curved surface. Therefore, in some embodiments of the present application, the step of displaying the recognition result in the user interface according to the film source type further includes: setting a result display area in the user interface according to the film source type of the image to be recognized; extracting the coordinate parameters of the result display area in the user interface; and performing coordinate mapping according to the coordinate parameters to display the recognition result in the result display area.
  • the result display area can be set in the Unity 3D scene according to the recognition result.
  • the specific form of the display area can be set according to the user interface and the virtual reality function; for example, for a virtual theater, the display area is the screen in the virtual theater.
  • the image to be recognized can also be displayed in the result display area.
  • when the to-be-recognized image is a frame of a video, the to-be-recognized image displayed in the result display area also changes dynamically.
  • the required form of the result display area also differs by film source type: if the film source type of the image to be recognized is a 2D film source or a 3D film source, a display panel is created in the user interface and the to-be-recognized image is tiled flat on the panel; if the film source type is a 360 panorama source, a display sphere is created in the user interface and the to-be-recognized image is wrapped around the sphere.
  • since the size and position of the result display area are set according to the specific VR scene, the to-be-recognized image is scaled to fit the area when displayed, and the recognition result must be adjusted to match. That is, after setting the result display area, the controller can extract the coordinate parameters of the area in the Unity 3D scene and perform a coordinate mapping transformation according to those parameters to display the recognition result in the result display area.
  • the coordinate parameters include spatial position and region shape data. Specifically, the step of performing coordinate mapping according to the coordinate parameters further includes: if the film source type of the to-be-recognized image is a 2D film source of the first type or a 3D film source of the second type, extracting the recognition mark position from the recognition result; obtaining the spatial position of the result display area; and, from the recognition mark position and the spatial position, calculating the upper-left and upper-right corner coordinates of the recognition mark in the user interface.
  • the controller can also determine the type of data to extract according to the film source type of the image to be recognized: if the current film source type is 2D or 3D, the recognition mark is a recognition frame, i.e. the recognition result is marked by a box. The recognition mark position can then be extracted from the recognition result, and the spatial position of the result display area in the Unity 3D scene obtained, where the spatial position includes the upper-left and upper-right corner coordinates of the result display area.
  • after obtaining the spatial position, the upper-left and upper-right corner coordinates of the recognition mark in the user interface are calculated, so that the recognition frame can be rendered from the calculated coordinates and displayed in the result display area.
  • as shown in Figure 16, the recognition result information contains type: building, position: (x: 0.2215, y: 0.3325, w: 0.5825, h: 0495), where x is the x-axis coordinate of the upper-left corner of the recognition frame divided by the width W of the original image, y is the y-axis coordinate of the upper-right corner of the recognition frame divided by the height H of the original image, w is the width of the recognition frame divided by W, and h is the height of the recognition frame divided by H.
  • the coordinates of the upper left corner of the recognition box are:
  • RLx = LTPx + (RBPx - LTPx) * x;
  • RLy = LTPy + (RBPy - LTPy) * y;
  • RLz = LTPz + (RBPz - LTPz) * x;
  • the coordinates of the lower right corner of the recognition box are:
  • RRx = LTPx + (RBPx - LTPx) * (x + w);
  • RRy = LTPy + (RBPy - LTPy) * (y + h);
  • RRz = LTPz + (RBPz - LTPz) * (x + w);
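The interpolation above can be sketched directly. Function and parameter names are assumptions; note that, mirroring the source formulas, the z component is interpolated using the box's x fraction:

```python
def map_box_to_panel(panel_tl, panel_br, box):
    """Map a normalized recognition box (x, y, w, h) onto a display panel.

    panel_tl: (LTPx, LTPy, LTPz), the panel's upper-left corner in the scene.
    panel_br: (RBPx, RBPy, RBPz), the panel's lower-right corner in the scene.
    Returns the box's upper-left and lower-right scene coordinates.
    """
    ltpx, ltpy, ltpz = panel_tl
    rbpx, rbpy, rbpz = panel_br
    x, y, w, h = box
    top_left = (
        ltpx + (rbpx - ltpx) * x,
        ltpy + (rbpy - ltpy) * y,
        ltpz + (rbpz - ltpz) * x,   # z interpolated with x, as in the formulas
    )
    bottom_right = (
        ltpx + (rbpx - ltpx) * (x + w),
        ltpy + (rbpy - ltpy) * (y + h),
        ltpz + (rbpz - ltpz) * (x + w),
    )
    return top_left, bottom_right
```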
  • the image recognition result of the 2D film source or the 3D film source can be displayed in the result display area, so that the recognition result can be displayed correctly in the VR scene.
  • if the film source type of the to-be-recognized image is a 360 panorama film source of the first type, the recognition mark position is extracted from the recognition result, converted into longitude and latitude, the region shape data of the result display area is obtained, and the position coordinates of the recognition mark in the user interface are calculated.
  • since a 360 panorama source is shown on a display sphere, the recognition result should take a form that can be marked on the spherical surface; to this end, the recognition frame in the two-dimensional image must be converted into a marker point that can be displayed on the sphere.
  • that is, when displaying the recognition result, the recognition mark position can first be extracted from the recognition result and converted into longitude and latitude on the display sphere; the radius of the display sphere corresponding to the result display area is then obtained, and the position coordinates of the recognition mark in the user interface are calculated from the longitude/latitude and the region shape data.
  • for example, if the recognition frame coordinates in the recognition result are (x, y, w, h) and the converted marker point coordinates are (RLx, RLy, RLz), the recognition frame is mapped onto the display sphere using its upper-left corner as the reference; that is, from the recognition frame coordinates, the longitude and latitude are calculated as:
  • Wd (longitude) = (x + 90) * π / 180;
  • Jd (latitude) = y * π / 180;
  • and the marker point coordinates (RLx, RLy, RLz) are:
  • RLx = -r * cos(Jd) * cos(Wd);
  • RLy = -r * sin(Jd);
  • RLz = r * cos(Jd) * sin(Wd);
  • where r is the radius of the display sphere, which can be set according to the actual distance of the scene. In this way, the recognition result can be displayed by a marker point instead of a recognition frame, adapting to the display form of the display sphere, so that image recognition results of the 360 panorama source type can also be displayed in the VR scene.
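Under the same assumptions, the frame-to-marker-point conversion for a display sphere of radius r follows the longitude/latitude formulas above (function name is hypothetical):

```python
import math

def map_box_to_sphere(box, r):
    """Convert the upper-left corner of a recognition box (x, y, w, h)
    into a marker point on a display sphere of radius r."""
    x, y, _, _ = box
    wd = (x + 90.0) * math.pi / 180.0   # Wd, longitude
    jd = y * math.pi / 180.0            # Jd, latitude
    return (
        -r * math.cos(jd) * math.cos(wd),  # RLx
        -r * math.sin(jd),                 # RLy
        r * math.cos(jd) * math.sin(wd),   # RLz
    )
```

At (x, y) = (0, 0) the longitude is π/2 and the latitude 0, so the marker point lands on the sphere's +z axis.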
  • the virtual reality device 500 also provided in some embodiments of the present application includes: a display and a controller, wherein the display is configured to display a user interface; the controller is configured to execute the following program steps:
  • obtaining a control instruction input by the user for starting image recognition;
  • in response to the control instruction, detecting the film source type of the image to be recognized;
  • generating a recognition result of the image to be recognized;
  • displaying the recognition result in the user interface according to the film source type of the image to be recognized.
  • after acquiring the image recognition control instruction input by the user, the virtual reality device 500 can detect the film source type of the image to be recognized, generate the recognition result according to an image recognition algorithm, and display the recognition result in the user interface according to the film source type.
  • the virtual reality device 500 can adopt different coordinate mapping methods according to different film sources, so as to correctly display the recognition result in the user interface, and solve the problem that the traditional virtual reality device 500 cannot accurately display the recognition result.
  • in the above embodiments, image recognition is completed by the virtual reality device 500. Since the computing power and storage capacity of the virtual reality device 500 are limited, the image recognition process can also be handed over to other devices. That is, some embodiments of the present application further provide a VR scene image recognition method applied to a virtual reality device 500 that includes a display, a communicator and a controller, where the display is configured to present a user interface and the communicator is configured to connect to a server; as shown in Figure 18, the method includes the following steps:
  • obtaining a control instruction input by the user for starting image recognition;
  • in response to the control instruction, detecting the film source type of the image to be recognized, where the film source type includes a 2D film source, a 3D film source and a 360 panorama film source;
  • sending an image recognition request to the server through the communicator;
  • receiving the recognition result fed back by the server;
  • displaying the recognition result in the user interface according to the film source type of the image to be recognized.
  • the difference between this embodiment and the preceding ones is that, after detecting the film source type of the image to be recognized, this embodiment can send an image recognition request to the server through the communicator; upon receiving the request, the server can feed the image recognition result back to the virtual reality device 500.
  • the image recognition request sent by the virtual reality device 500 should include the image to be recognized.
  • the virtual reality device 500 may send different image recognition requests according to the film source type of the image to be recognized: for a 2D film source or a 360 panorama film source, the request carries the original source image of the image to be recognized; for a 3D film source, the request may carry the left half of the original source image.
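A minimal sketch of building such a request by film source type; the payload structure and type labels are assumptions, not a documented API:

```python
def build_recognition_request(source_type, frame):
    """2D and 360 panorama sources send the original frame; 3D sources
    send only the left half of each pixel row."""
    if source_type in ("2d", "panorama_360"):
        payload = frame
    elif source_type == "3d":
        half = len(frame[0]) // 2
        payload = [row[:half] for row in frame]
    else:
        raise ValueError("unsupported source type: %s" % source_type)
    return {"type": source_type, "image": payload}
```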
  • the image to be recognized is sent to the server for image recognition, which can reduce the data processing volume of the virtual reality device 500 , and eliminate the need for the virtual reality device 500 to maintain multiple recognition models, thereby reducing the configuration requirements for the virtual reality device 500 .
  • a virtual reality device 500 further provided in some embodiments of the present application includes: a display, a communicator and a controller, wherein the display is configured to display a user interface; the communicator is configured to connect to a server ; the controller is configured to perform the following program steps:
  • obtaining a control instruction input by the user for starting image recognition;
  • in response to the control instruction, detecting the film source type of the image to be recognized, where the film source type includes a 2D film source, a 3D film source and a 360 panorama film source;
  • sending an image recognition request to the server through the communicator;
  • receiving the recognition result fed back by the server;
  • displaying the recognition result in the user interface according to the film source type of the image to be recognized.
  • the virtual reality device 500 provided by the above embodiment can establish a communication connection with the server, so that after the virtual reality device 500 obtains the control instruction input by the user and detects the film source type of the image to be recognized, it sends an image recognition request to the server; the server returns the image recognition result according to the request, and the virtual reality device 500 displays the recognition result in the user interface according to the film source type of the image to be recognized.
  • the virtual reality device 500 can hand over the image recognition process to the server to relieve the processing burden of the virtual reality device 500, and can correctly display the recognition result in the user interface, solving the problem that the traditional virtual reality device cannot accurately display the recognition result.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • User Interface Of Digital Computer (AREA)
  • Controls And Circuits For Display Device (AREA)

Abstract

A virtual reality device and a VR scene image recognition method. After obtaining an image recognition control instruction input by a user, the method can detect the film source type of the image to be recognized, generate a recognition result according to an image recognition algorithm, and display the recognition result in the user interface according to the film source type.

Description

虚拟现实设备及VR场景图像识别方法
本申请要求在2020年11月30日提交中国专利局、申请号为202011379185.3、名称为“虚拟现实设备及VR场景图像识别方法”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及虚拟现实设备技术领域,尤其涉及一种虚拟现实设备及VR场景图像识别方法。
背景技术
虚拟现实(Virtual Reality,VR)技术是通过计算机模拟虚拟环境,从而给人以环境沉浸感的显示技术。虚拟现实设备是一种应用虚拟显示技术为用户呈现虚拟画面以实现沉浸感的设备。通常,虚拟现实设备包括两个用于呈现虚拟画面内容的显示屏幕,分别对应用户的左右眼。当两个显示屏幕所显示的内容分别来自于同一个物体不同视角的图像时,可以为用户带来立体的观影感受。
在部分应用场景下,可以对虚拟现实设备所显示的内容进行图像识别,例如,通过图像分析,定位图像中的人像、特殊目标等。为了进行图像识别,虚拟现实设备可以对所显示内容进行截图,并对获得的截图图像执行图像识别程序。然而由于虚拟现实设备为适应光学组件的畸变效应,屏幕显示的内容存在变形,与实际图案偏差较大,并且对于不同类型的片源,显示内容的变形程度不同,使得图像识别结果无法正确显示。
发明内容
第一方面本申请提供的虚拟现实设备,包括:显示器和控制器。其中,显示器被配置为显示用户界面;控制器被配置为执行以下程序步骤:
获取用户输入的用于启动图像识别的控制指令;
响应于所述控制指令,检测待识别图像的片源类型;
生成所述待识别图像的识别结果;
按照所述待识别图像的片源类型在所述用户界面中显示所述识别结果。
基于上述虚拟现实设备,本申请第一方面还提供的VR场景图像识别方法,应用于虚拟现实设备,所述方法包括:
获取用户输入的用于启动图像识别的控制指令;
响应于所述控制指令,检测待识别图像的片源类型;
生成所述待识别图像的识别结果;
按照所述待识别图像的片源类型在所述用户界面中显示所述识别结果。
第二方面,本申请还提供的虚拟现实设备,包括:显示器、通信器以及控制器。 其中,所述显示器被配置为显示用户界面;所述通信器被配置为连接服务器;所述控制器被配置为执行以下程序步骤:
获取用户输入的用于启动图像识别的控制指令;
响应于所述控制指令,检测待识别图像的片源类型;
通过所述通信器向所述服务器发送图像识别请求;
接收所述服务器反馈的识别结果;
按照所述待识别图像的片源类型在所述用户界面中显示所述识别结果。
基于上述虚拟现实设备,本申请第二方面还提供的VR场景图像识别方法,应用于虚拟现实设备,所述方法包括:
获取用户输入的用于启动图像识别的控制指令;
响应于所述控制指令,检测待识别图像的片源类型;
通过所述通信器向服务器发送图像识别请求;
接收所述服务器反馈的识别结果;
按照所述待识别图像的片源类型在所述用户界面中显示所述识别结果。
附图说明
为了更清楚地说明本申请的技术方案,下面将对实施例中所需要使用的附图作简单地介绍,显而易见地,对于本领域普通技术人员而言,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为本申请实施例中包括虚拟现实设备的显示系统结构示意图;
图2为本申请实施例中VR场景全局界面示意图;
图3为本申请实施例中全局界面的推荐内容区域示意图;
图4为本申请实施例中全局界面的应用快捷操作入口区域示意图;
图5为本申请实施例中全局界面的悬浮物示意图;
图6a为本申请实施例中VR画面示意图;
图6b为本申请实施例中人物识别结果示意图;
图6c为本申请实施例中建筑识别结果示意图;
图7为本申请实施例中一种VR场景图像识别方法的流程示意图;
图8为本申请实施例中VR场景初始状态示意图;
图9为本申请实施例中显示图片效果示意图;
图10为本申请实施例中显示识别结果的效果示意图;
图11为本申请实施例中根据片源类型生成识别结果的流程示意图;
图12为本申请实施例中3D片源初始显示状态示意图;
图13为本申请实施例中显示3D片源识别结果示意图;
图14为本申请实施例中360全景片源初始显示状态示意图;
图15为本申请实施例中显示360全景片源识别结果示意图;
图16为本申请实施例中识别结果坐标示意图;
图17为本申请实施例中识别结果坐标映射状态示意图;
图18为本申请实施例中另一种VR场景图像识别方法的流程示意图。
具体实施方式
为使本申请示例性实施例的目的、技术方案和优点更加清楚,下面将结合本申请示例性实施例中的附图,对本申请示例性实施例中的技术方案进行清楚、完整地描述,显然,所描述的示例性实施例仅是本申请一部分实施例,而不是全部的实施例。
基于本申请中示出的示例性实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。此外,虽然本申请中公开内容按照示范性一个或几个实例来介绍,但应理解,可以就这些公开内容的各个方面也可以单独构成一个完整技术方案。
应当理解,本申请中说明书和权利要求书及上述附图中的术语“第一”、“第二”、“第三”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,例如能够根据本申请实施例图示或描述中给出那些以外的顺序实施。
此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖但不排他的包含,例如,包含了一系列组件的产品或设备不必限于清楚地列出的那些组件,而是可包括没有清楚地列出的或对于这些产品或设备固有的其它组件。
本申请中使用的术语“模块”,是指任何已知或后来开发的硬件、软件、固件、人工智能、模糊逻辑或硬件或/和软件代码的组合,能够执行与该元件相关的功能。
本说明书通篇提及的“多个实施例”、“一些实施例”、“一个实施例”或“实施例”等,意味着结合该实施例描述的具体特征、结构或特性包括在至少一个实施例中。因此,本说明书通篇出现的短语“在多个实施例中”、“在一些实施例中”、“在至少另一个实施例中”或“在实施例中”等并不一定都指相同的实施例。此外,在一个或多个实施例中,具体特征、结构或特性可以任何合适的方式进行组合。因此,在无限制的情形下,结合一个实施例示出或描述的具体特征、结构或特性可全部或部分地与一个或多个其他实施例的特征、结构或特性进行组合。这种修改和变型旨在包括在本申请的范围之内。
本申请实施例中,所述虚拟现实设备500泛指能够佩戴于用户面部,为用户提供沉浸感体验的显示设备,包括但不限于VR眼镜、增强现实设备(Augmented Reality,AR)、VR游戏设备、移动计算设备以及其它可穿戴式计算机等。所述虚拟现实设备500可以独立运行,或者作为外接设备接入其他智能显示设备,其中,所述显示设备可以是智能电视、计算机、平板电脑、服务器等。
虚拟现实设备500可以在佩戴于用户面部后,显示媒资画面,为用户双眼提供近距离影像,以带来沉浸感体验。为了呈现媒资画面,虚拟现实设备500可以包括多个用于显示画面和面部佩戴的部件。以VR眼镜为例,虚拟现实设备500可以包括外壳、镜腿、光学系统、显示组件、姿态检测电路、接口电路等部件。实际应用中,光学系统、显示组件、姿态检测电路以及接口电路可以设置于外壳内,以用于呈现具体的显示画面;外壳两侧连接镜腿,以佩戴于用户面部。
在使用时,姿态检测电路中内置有重力加速度传感、陀螺仪等姿态检测元件,当用户头部移动或转动时,可以检测到用户的姿态,并将检测到的姿态数据传递给控制 器等处理元件,使处理元件可以根据检测到的姿态数据调整显示组件中的具体画面内容。
需要说明的是,根据虚拟现实设备500的类型不同,其所呈现具体画面内容的方式也不同。例如,如图1所示,对于部分轻薄VR眼镜,其内置的控制器一般不直接参与显示内容的控制过程,而是将姿态数据发送给外接设备,如计算机中,由外接设备进行处理,并在外接设备中确定要显示的具体画面内容,再回传给VR眼镜中,以在VR眼镜中显示最终的画面。
在一些实施例中,所示虚拟现实设备500可以接入显示设备200,并与服务器400之间构建一个基于网络的显示系统,在虚拟现实设备500、显示设备200以及服务器400之间可以实时进行数据交互,例如显示设备200可以从服务器400获取媒资数据并进行播放,以及将具体的画面内容传输给虚拟现实设备500中进行显示。
其中,显示设备200可以是液晶显示器、OLED显示器、投影显示设备。具体显示设备类型,尺寸大小和分辨率等不作限定,本领技术人员可以理解的是,显示设备200可以根据需要做性能和配置上一些改变。显示设备200可以提供广播接收电视功能,还可以附加提供计算机支持功能的智能网络电视功能,包括但不限于,网络电视、智能电视、互联网协议电视(IPTV)等。
显示设备200以及虚拟现实设备500还与服务器400通过多种通信方式进行数据通信。可允许显示设备200和虚拟现实设备500通过局域网(LAN)、无线局域网(WLAN)和其他网络进行通信连接。服务器400可以向显示设备200提供各种内容和互动。示例的,显示设备200通过发送和接收信息,以及电子节目指南(EPG)互动,接收软件程序更新,或访问远程储存的数字媒体库。服务器400可以是一个集群,也可以是多个集群,可以包括一类或多类服务器。通过服务器400提供视频点播和广告服务等其他网络服务内容。
在进行数据交互的过程中,用户可通过移动终端100A和遥控器100B操作显示设备200。移动终端100A和遥控器100B可以与显示设备200之间采用直接的无线连接方式进行通信,也可以采用非直接连接的方式进行通信。即在一些实施例中,移动终端100A和遥控器100B可以通过蓝牙、红外等直接连接方式与显示设备200进行通信。当发送控制指令时,移动终端100A和遥控器100B可以直接将控制指令数据通过蓝牙或红外发送到显示设备200。
在另一些实施例中,移动终端100A和遥控器100B还可以通过无线路由器与显示设备200接入同一个无线网络,以通过无线网络与显示设备200建立非直接连接通信。当发送控制指令时,移动终端100A和遥控器100B可以将控制指令数据先发送给无线路由器,再通过无线路由器将控制指令数据转发给显示设备200。
在一些实施例中,用户还可以使用移动终端100A和遥控器100B还可以直接与虚拟现实设备500进行交互,例如,可以将移动终端100A和遥控器100B作为虚拟现实场景中的手柄进行使用,以实现体感交互等功能。
在一些实施例中,虚拟现实设备500的显示组件包括显示屏幕以及与显示屏幕有关的驱动电路。为了呈现具体画面,以及带来立体效果,显示组件中可以包括两个显示屏幕,分别对应于用户的左眼和右眼。在呈现3D效果时,左右两个屏幕中显示的 画面内容会稍有不同,可以分别显示3D片源在拍摄过程中的左相机和右相机。由于用户左右眼观察到的画面内容,因此在佩戴时,可以观察到立体感较强的显示画面。
虚拟现实设备500中的光学系统,是由多个透镜组成的光学模组。光学系统设置在用户的双眼与显示屏幕之间,可以通过透镜对光信号的折射以及透镜上偏振片的偏振效应,增加光程,使显示组件呈现的内容可以清晰的呈现在用户的视野范围内。同时,为了适应不同用户的视力情况,光学系统还支持调焦,即通过调焦组件调整多个透镜中的一个或多个的位置,改变多个透镜之间的相互距离,从而改变光程,调整画面清晰度。
虚拟现实设备500的接口电路可以用于传递交互数据,除上述传递姿态数据和显示内容数据外,在实际应用中,虚拟现实设备500还可以通过接口电路连接其他显示设备或外设,以通过和连接设备之间进行数据交互,实现更为复杂的功能。例如,虚拟现实设备500可以通过接口电路连接显示设备,从而将所显示的画面实时输出至显示设备进行显示。又例如,虚拟现实设备500还可以通过接口电路连接手柄,手柄可以由用户手持操作,从而在VR用户界面中执行相关操作。
其中,所述VR用户界面可以根据用户操作呈现为多种不同类型的UI布局。例如,用户界面可以包括全局界面,AR/VR终端启动后的全局UI如图2所示,所述全局UI可显示于AR/VR终端的显示屏幕中,也可显示于所述显示设备的显示器中。全局UI可以包括推荐内容区域1、业务分类扩展区域2、应用快捷操作入口区域3以及悬浮物区域4。
推荐内容区域1用于配置不同分类TAB栏目;在所述栏目中可以选择配置媒资、专题等;所述媒资可包括2D影视、教育课程、旅游、3D、360度全景、直播、4K影视、程序应用、游戏、旅游等具有媒资内容的业务,并且所述栏目可以选择不同的模板样式、可支持媒资和专题同时推荐编排,如图3所示。
业务分类扩展区域2支持配置不同分类的扩展分类。如果有新的业务类型时,支持配置独立TAB,展示对应的页面内容。业务分类扩展区域2中的扩展分类,也可以对其进行排序调整及下线业务操作。在一些实施例中,业务分类扩展区域2可包括的内容:影视、教育、旅游、应用、我的。在一些实施例中,业务分类扩展区域2被配置为可展示大业务类别TAB,且支持配置更多的分类,其图标支持配置,如图3所示。
应用快捷操作入口区域3可指定预装应用靠前显示以进行运营推荐,支持配置特殊图标样式替换默认图标,所述预装应用可指定为多个。在一些实施例中,应用快捷操作入口区域3还包括用于移动选项目标的左向移动控件、右向移动控件,用于选择不同的图标,如图4所示。
悬浮物区域4可以配置为在固定区域左斜侧上方、或右斜侧上方,可配置为可替换形象、或配置为跳转链接。例如,悬浮物接收到确认操作后跳转至某个应用、或显示指定的功能页,如图5所示。在一些实施例中,悬浮物也可不配置跳转链接,单纯的用于形象展示。
在一些实施例中,全局UI还包括位于顶端的状态栏,用于显示时间、网络连接状态、电量状态、及更多快捷操作入口。使用AR/VR终端的手柄,即手持控制器选中图标后,图标将显示包括左右展开的文字提示,选中的图标按照位置自身进行左右拉 伸展开显示。
例如,选中搜索图标后,搜索图标将显示包含文字“搜索”及原图标,进一步点击图标或文字后,搜索图标将跳转至搜索页面;又例如,点击收藏图标跳转至收藏TAB、点击历史图标默认定位显示历史页面、点击搜索图标跳转至全局搜索页面、点击消息图标跳转至消息页面。
在一些实施例中,可以通过外设执行交互,例如AR/VR终端的手柄可对AR/VR终端的用户界面进行操作,包括返回按钮;主页键,且其长按可实现重置功能;音量加减按钮;触摸区域,所述触摸区域可实现焦点的点击、滑动、按住拖拽功能。
用户可以通过全局界面进入不同的场景界面,例如,如图6a所示,用户可以在全局界面中的“浏览界面”入口进入浏览界面,或者通过在全局界面中选择任一媒资启动浏览界面。在浏览界面中,虚拟现实设备500可以通过Unity 3D引擎创建3D场景,并在3D场景中渲染具体的画面内容。
在浏览界面中,用户可以观看具体媒资内容,为了获得更好的观影体验,浏览界面中还可以设置不同的虚拟场景控件,以配合媒资内容呈现具体场景或实时交互。例如,在浏览界面中,可以在Unity 3D场景中设置面板,用来呈现图片内容,在配合其他家居虚拟控件,以实现影院银幕的效果。
虚拟现实设备500可以在浏览界面中展示操作UI内容。例如,在Unity 3D场景中的显示面板前方还可以显示有列表UI,在列表UI中可以显示当前虚拟现实设备500本地存储的媒资图标,或者显示可在虚拟现实设备500中进行播放的网络媒资图标。用户可以在列表UI中选择任一图标,则在显示面板中可以实时显示被选中的媒资。
在显示媒资具体画面的同时,虚拟现实设备500还可以对所显示的画面内容进行图像识别,从所显示的画面中识别特定的影像,并进行标记。例如,可以在所显示的图片中识别人物、建筑、关键标记等目标,并标记目标位置。虚拟现实设备500在显示图片的同时,还显示目标的标记,如通过识别框将识别出的人物进行框选。
能够在Unity 3D场景中显示的媒资可以是图片、视频等多种形式,并且,由于VR场景的显示特点,在Unity 3D场景中显示的媒资至少包括2D图片或视频、3D图片或视频以及360全景图片或视频。
其中,2D图片或视频是一种传统图片或视频文件,当进行显示时,可以在虚拟现实设备500的两个显示屏幕中显示相同的图像,本申请中将2D图片或视频统称为2D片源;3D图片或视频,即3D片源是一种由至少两个相机在不同角度对同一个物体进行拍摄制作而成,可以在虚拟现实设备500的两个显示屏幕中显示不同的图像;360全景图片或视频,即360全景片源,是通过全景相机或者特殊的拍摄手段获得的360度全景影像,可以通过在Unity 3D场景中创建显示球面的方式,将图片进行展示。
由于显示的片源类型不同,因此在显示识别结果时,会因显示片源的类型不同,呈现为不同的显示效果。例如,对于2D图片或视频,可以直接识别结果的识别框显示在显示面板上,而对于360全景片源,由于其需要在球面上进行显示,而球面上无法直接显示识别框,因此可以使用识别指示点,对识别结果位置进行标记。
需要说明的是,对于识别结果,还可以采用其他方式进行标记,例如可以为指示线、圆形、椭圆形、三角形、菱形等几何形状,也可以是高亮显示、颜色变换等显示 效果。另外,在显示识别结果的同时,还可以配合一些提示文字,对识别结果进行解释说明。例如,如图6b所示,在识别出人物图像时,可以在识别框附近显示所识别人像的性别、年龄等信息;如图6c所示,在识别出建筑目标时,可以在识别框附近显示所识别建筑的名称等信息,以提高用户的实际观影感受。
然而,对于不同类型的片源,由于其在显示过程中左右两个屏幕上显示的图像不同,或者在Unity 3D场景中的表现形式不同,会造成显示结果与原图片之间存在这变形或差异,使得识别结果在显示画面上显示错位,降低用户体验。
为了准确显示图像识别结果,如图7所示,本申请的部分实施例中提供的VR场景图像识别方法,该识别方法可以应用于虚拟现实设备500。所述方法包括以下内容:
用户向虚拟现实设备500输入用于启动图像识别的控制指令,以使虚拟现实设备500在接收到控制指令后对图像进行识别,并对图像识别结果进行显示。图像识别结果显示可以作为虚拟现实设备500在显示媒资画面时的辅助显示功能。因此用户可以根据需要选择是否开启实时显示识别结果的功能。例如,用户可以在设置界面中开启“AI”功能,则在虚拟现实设备500显示媒资画面内容的同时,实时进行图像识别,并将图像识别结果显示在媒资画面内容中。
如图8、图9所示,在用户开启辅助显示功能的状态下,用户在打开任一媒资并进入浏览界面时,即代表用户输入了用于启动图像识别的控制指令,即所述控制指令可以由用户通过遥控器或体感手柄等方式,控制用户界面中的焦点光标移动至任一图片图标上以后,点击确认键或者播放键时输入。而在用户未开启辅助显示功能的状态时,用户在浏览界面中选中开关按钮,并点击确认键以开启辅助显示功能时,即代表用户输入了用于启动图像识别的控制指令。控制指令还可以通过其他方式完成输入,例如用户可以通过语音系统、外接的智能终端等设备。
在获取用户输入的控制指令以后,虚拟现实设备500可以根据控制指令开始进行图像识别。由于在虚拟现实设备500所显示的片源类型不同时,会按照不同的方式进行图像识别,并按照不同的方式显示图像识别结果,因此在进行图像识别前,可以对待识别图像的片源类型进行检测,其中所述片源类型至少包括2D片源、3D片源以及360全景片源。
为了实现对片源类型的检测,控制器可以在接收到控制指令后,提取所显示的媒资分类、格式、扩展名、文件说明等信息进行提取,从而确定当前显示媒资的片源类型。例如,对于用户界面中呈现的网络资源,其在分享媒资的同时,可以在文件说明中指示该媒资的片源类型。
还可以结合具体的图片内容进行判断当前所显示媒资的片源类型。例如,被显示媒资的图片文件扩展名为“.jpg”,同时通过分析图片的左右两侧相似度,则在两侧图片相似度较小时,可以确定当前待识别图像的片源类型为2D片源;在两侧图片相似度较大是,可以确定当前待识别图像的片源类型为3D片源。
在检测到所显示媒资的片源类型后,控制器可以按照该类型图片的具体识别方式,对待识别图像进行图像识别,以生成待识别图像的识别结果。具体的图像识别方式本实施例不进行限定。例如,图像识别可以采用识别模型,即可以将待识别图像输入到识别模型中,由识别模型输出识别结果。
还可以根据具体的用户需求和应用场景选择不同的识别方式,从而获得不同的识别结果。在处理不同媒资文件时,可以采用不同种类的识别模型,在检测到待识别图像的片源类型后,可以按照该片源类型所对应的输入方式,将待识别图像输入到识别模型,识别模型可以通过预设图像识别算法,对待识别图像进行计算,以获得识别结果。
例如,在使用虚拟现实设备500模拟旅行时,可以在应用程序中内置景物识别模型,用户佩戴虚拟现实设备500可以浏览不同的景物,同时通过图像识别算法对景物中的具体目标进行识别,从而景点的位置标记景点的名称、释义等相关信息。
在生成识别结果后,虚拟现实设备500可以在用户界面中对识别结果进行显示。不同片源类型的识别结果,可以采用不同方式进行显示。例如,如图10所示,对于2D片源或3D片源的待识别图像,可以在Unity 3D场景中的显示面板中显示被识别的图像,同时在被识别图像上显示识别框,将被识别的目标进行框选。而对于360全景片源,可以在Unity 3D场景中的显示球面上定位识别标记点,并通过指引线对标记点进行标记显示。
由以上技术方案可知,上述实施例提供的VR场景图像识别方法可以在获取用户输入图像识别控制指令后,检测待识别图像的片源类型,并根据图像识别算法生成识别结果,并按照片源类型在用户界面中显示识别结果。所述方法可以根据不同的片源采用不同的坐标映射方式,从而在用户界面中正确显示识别结果,解决传统虚拟现实设备500不能准确显示识别结果的问题。
由于不同片源的媒资,其在画面表现形式上存在着不同,因此对其进行图像识别的方式也不同。例如,2D片源的图片,其画面表现为单个图片形式,则可以直接通过识别模型对整个图片进行识别,而对于3D片源的图片,其画面表现为并列的两个角度所拍摄的图片,两个图片内容略有不同,与拍摄时相机的相对位置有关。而对于3D片源的图片,在进行图像识别时,如果仍然以整个原图片输入到识别模型中时,会受到两侧图片的相互干扰识别出错误的结果。因此,如图11所示,在本申请的部分实施例中,为了获得图像识别结果,生成所述待识别图像的识别结果的步骤还包括:
如果所述待识别图像的片源类型为第一类型,提取片源原图作为所述待识别图像;
对所述片源原图执行图像识别,以生成识别结果;
如果所述待识别图像的片源类型为第二类型,提取片源图像中左显示器或右显示器对应的半侧图像作为所述待识别图像;
对所述片源图像的半侧图像执行图像识别,以生成识别结果。
在对待识别图像进行识别前,可以根据待识别图像的片源类型对待识别图像进行预处理。本实施例中,片源类型可以包括第一类型片源和第二类型片源。其中,第一类型是指内容画面中仅包括单个图像的片源类型,包括但不限于2D片源和360全景片源;第二类型是指画面内容中包括两个或两个以上图像的片源类型,包括但不限于3D片源。当检测到待识别图像的片源类型为2D片源或360全景片源等第一类型时,可以直接将待识别图像的原图输入到识别模型中进行处理,以生成识别结果。如图12、图13所示,当检测到待识别图像的片源类型为3D片源等第二类型时,则可以对待识别图像进行裁切分离,提取片源图像中左显示器或右显示器对应的半侧图像,并输入 到识别模型中进行识别,以生成识别结果。
例如,在2D图片的播放模式下,可以获取待显示的2D图片原图,并显示在Unity3D场景中的指定面板上,同时Android层通过识别请求将原图输入识别模型,对原图进行识别。其中,Android层是一种系统层,用于在各软件层级之间传递数据和指令。虚拟现实设备中与Android层并列的层级还可以包括应用层和框架层,应用层被配置为呈现具体算法,以及直接呈现画面内容。识别模型可集成在应用层中,通过框架层与系统层之间进行数据交互,即从系统层获取图像并进行识别,同时产生识别结果反馈给系统层。在3D图片的播放模式下,可以在获取待显示图片的原图后,将左右两边图像分别显示在Unity 3D场景中的指定面板上,同时Android层通过识别请求将原图片的左半边图像输入识别模型,进行图像识别。
需要说明的是,对于不同片源类型的图片或视频,根据其图像内容结构可以存在不同的待识别图像预处理方式。例如,部分3D片源的图像内容排列方式为左右型,即一帧图像中包括左右两半部分,左半部分图像为在左侧显示器显示的内容,右半部分为右侧显示器显示的内容。则可以提取片源图像的左半侧或右半侧,作为待识别图像。而部分3D片源的图像内容排列方式为上下型,即一帧图像中包括上下两部分,上半部分为在左侧显示器显示的内容,下半部分为右侧显示器显示的内容,则可以提取片源图像的上半侧或下半侧,作为待识别图像。
此外,部分3D片源的图像内容排列方式为混合型,即一帧图像中不固定划分区域,而是将左侧显示器显示的内容和右侧显示器显示的内容混合排列,如相邻两列像素中,一列像素为左侧显示器显示的内容,一列像素为右侧显示器显示的内容,多列像素交替排列组成一帧图像。对于混合型排列的3D片源图像,可以在送入图像识别前,通过像素重组,将左右两侧显示器所显示的内容进行分离,获得左侧图像和右侧图像,并将其中一个作为待识别图像。
可见,在本实施例中,通过对不同片源类型的待识别图像进行不同预处理,可以实现输入至识别模型中的图像能够保留具体图像内容的同时,缓解左图像和右图像内容的干扰,从而能够生成正确的识别结果。
由于不同片源类型的待识别图像所对应的图像表现形式不同,相应的在进行图像识别时具体的识别算法也存在差异。例如,对于360全景片源,由于拍摄或合成过程中的视角衔接,从而将整个360度一周范围内的画面内容都显示在同一张图片中。而在合成时会在图片的底部产生变形,因此通过2D图片的图像识别算法会因变形区域的干扰影响识别结果,因此,在一些实施例中,可以根据不同的片源类型,调用不同的识别模型,即生成所述待识别图像的识别结果的步骤还包括:
按照所述待识别图像的片源类型调用识别模型;
将所述待识别图像输入调用的识别模型;
获取所述识别模型输出的识别结果。
其中,识别模型可以根据不同的片源类型分别预先构建,具体的模型构建方法本申请不做限定,可以是通过模型训练的方式获得,也可以是通过建立图像分析器的方式获得。构建的识别模型,可以存储在虚拟现实设备500或者进行图像识别处理的显示设备的存储器中,以供控制器调用。
控制器可以按照待识别图像的片源类型调用识别模型,并将上述实施例中裁切的待识别图像输入到被调用的识别模型中,通过识别模型对待识别图像进行识别处理。识别模型处理图片以后,可以输出识别结果,即控制器获取待识别模型输出的识别结果。由于不同识别模型是针对不同片源类型所构建的,因此识别模型可以适应当前待识别图像的片源类型,获得更加准确的识别结果。
此外,还可以根据不同的应用场景调用不同的识别模型,以获得不同的识别结果。例如,当获取用户输入的控制指令后,控制器还可以对当前应用场景进行判断,从而确定需要调用的识别模型组,识别模型组可以包括能够满足当前场景功能的至少三个识别模型,分别用于对2D片源、3D片源以及360全景片源的待识别图像进行图像识别。再根据待识别图像的片源类型,从识别模型组中确定合适的识别模型。
对于不同的识别模型,输出的识别结果也不同。例如,对于模型训练获得的识别模型,输入的识别结果为图像上各区域对具体分类的分类概率。
在一些实施例中,识别结果可以包括结果标记以及所述结果标记相对于所述待识别图像的位置;对于2D片源类型和3D片源类型的待识别图像,所述结果标记为识别框,所述结果标记的位置包括所述识别框的左上角坐标和右下角坐标;如图14、图15所示,对于360全景片源类型的待识别图像,所述结果标记为识别指示点,所述结果标记的位置为所述识别指示点的坐标。
由于不同片源类型的识别结果表现形式不同,导致其在最终显示时也不同。例如,识别框需要显示在平面上,而识别指示点则可以显示在曲面上,因此在本申请的部分实施例中,为了显示识别结果,按照所述待识别图像的片源类型在所述用户界面中显示所述识别结果的步骤还包括:
根据所述待识别图像的片源类型在所述用户界面中设置结果展示区;
提取所述结果展示区在所述用户界面中的坐标参数;
根据所述坐标参数执行坐标映射,以将所述识别结果显示在所述结果展示区内。
在生成识别结果后,可以根据识别结果在Unity 3D场景中设置结果展示区,展示区的具体形式可以根据用户界面以及虚拟现实功能进行设置,例如,对于虚拟影院,展示区的为虚拟影院中的银幕。在设置结果展示区以后,还可以在结果展示区中显示待识别图像。显然,当待识别图像为视频中的图像时,结果展示区中显示的待识别图像也是动态变化的。
不同的片源类型所需要的结果展示区形式也不同,例如,如果所述待识别图像的片源类型为2D片源或3D片源,在所述用户界面中创建显示面板,即通过显示面板对待识别图像进行平铺展示;如果所述待识别图像的片源类型为360全景片源,在所述用户界面中创建显示球面,即通过显示球面对待识别图像进行环绕展示。
由于结果展示区的具体大小和位置都根据具体从VR场景进行设置,因此在显示待识别图像时,待识别图像会根据结果展示区的大小和位置进行缩放调整。相应的识别结果在显示时,也需要做相适应的调整。即在设置结果展示区后,控制器可以提取结果展示区在unity 3D场景中的坐标参数,并根据坐标参数执行坐标映射变换,以将识别结果显示在结果展示区内。
其中,所述坐标参数包括空间位置和区域形状数据,具体的,根据所述坐标参数 执行坐标映射的步骤还包括:
如果所述待识别图像的片源类型为第一类型中的2D片源或第二类型中的3D片源,在所述识别结果中提取识别标记位置;
获取所述结果展示区的空间位置;
根据所述识别标记位置和所述空间位置,计算所述识别标记在所述用户界面中的左上角坐标和右上角坐标。
在生成图像识别结果后,控制器还可以根据待识别图像的片源类型,确定提取的数据类型,如果当前待识别图像的片源类型为2D片源或3D片源,识别标记为识别框,即识别结果可以通过识别框进行标记。则可以对识别结果中的识别标记位置进行提取,并获取结果展示区在unity 3D场景中的空间位置,其中,所述空间位置包括所述结果展示区的左上角坐标和右上角坐标。
在获取空间位置后,再根据识别标记位置和空间位置,计算识别标记在用户界面中的左上角坐标和右上角坐标,从而根据计算获得的识别标记的左上角坐标和右上角坐标渲染出识别框,并显示在结果展示区内。
例如,如图16所示,识别结果信息中包含类型:建筑,位置:(x:0.2215,y:0.3325,w:0.5825,h:0495),其中,x为识别框左上角点的x轴坐标/原图的宽W,y为识别框右上角的y轴坐标/原图的高H,w为识别框的宽/原图的宽W,h为识别框的高/原图的高H。
如图17所示，面板左上角在场景中的坐标为(LTPx,LTPy,LTPz)、右下角在场景中的坐标为(RBPx,RBPy,RBPz)，识别结果中的识别框坐标为(x,y,w,h)，场景中展示识别框的左上角坐标为(RLx,RLy,RLz)、右下角坐标为(RRx,RRy,RRz)，则坐标映射方法是计算识别框在unity 3D场景中的坐标。
即,识别框的左上角坐标为:
RLx=LTPx+(RBPx-LTPx)*x;
RLy=LTPy+(RBPy-LTPy)*y;
RLz=LTPz+(RBPz-LTPz)*x;
识别框的右下角坐标为:
RRx=LTPx+(RBPx-LTPx)*(x+w);
RRy=LTPy+(RBPy-LTPy)*(y+h);
RRz=LTPz+(RBPz-LTPz)*(x+w);
可见,通过上述坐标映射的计算方式,可以在结果展示区中展示2D片源或3D片源的图像识别结果,使识别结果能够在VR场景中正确显示。
如果所述待识别图像的片源类型为第一类型中的360全景片源,在所述识别结果中提取识别标记位置;
将所述识别标记位置转化为经纬度;
获取所述结果展示区的区域形状数据;
根据所述经纬度和所述区域形状数据,计算所述识别标记在所述用户界面中的位置坐标。
由于360全景片源需要在显示球面进行展示,因此为了能够获得更好的展示效果, 当待识别图像的片源类型为360全景片源时,识别结果应能够满足在球面进行标记的形式。为此,需要将二维图像中的识别框转化为可以在球面进行展示的标记点。
即在显示识别结果时,可以先在识别结果中提取识别标记位置,并将识别标记位置转化为在显示球面上的经纬度信息,再获取结果展示区对应的显示球面半径,并根据经纬度和区域形状数据,计算识别标记在用户界面中的位置坐标。
例如，识别结果中的识别框坐标为(x,y,w,h)，转换获得的标记点坐标为(RLx,RLy,RLz)，则以识别框左上角坐标为基准，将识别框映射到显示球面上，即可以根据识别框坐标和标记点坐标，计算经纬度信息为：
Wd(经度)=(x+90)*π/180;
Jd(纬度)=y*π/180;
则标记点坐标(RLx,RLy,RLz)为：
RLx=-r*cos(jd)*cos(wd);
RLy=-r*sin(jd);
RLz=r*cos(jd)*sin(wd);
其中,r为显示球面半径,可以根据场景实际距离进行设置。可见,在上述实施例中,可以通过标记点代替识别框对识别结果进行展示,从而适应显示球面的显示形式,使360全景片源类型的图像识别结果也能够展示在VR场景中。
需要说明的是,在上述实施例中,所述片源类型以2D片源、3D片源和360全景片源为例进行说明,本领域技术人员在结合上述实施例中的片源类型,在不付出创造性劳动的前提下,所能够联想到的其他片源类型的图像识别方法也属于本申请的保护范围。
基于上述VR场景图像识别方法,本申请的部分实施例中还提供的虚拟现实设备500,包括:显示器和控制器,其中,显示器被配置为显示用户界面;控制器被配置为执行以下程序步骤:
获取用户输入的用于启动图像识别的控制指令;
响应于所述控制指令,检测待识别图像的片源类型;
生成所述待识别图像的识别结果;
按照所述待识别图像的片源类型在所述用户界面中显示所述识别结果。
由以上技术方案可知,上述实施例提供的虚拟现实设备500可以在获取用户输入图像识别控制指令后,检测待识别图像的片源类型,并根据图像识别算法生成识别结果,并按照片源类型在用户界面中显示识别结果。所述虚拟现实设备500可以根据不同的片源采用不同的坐标映射方式,从而在用户界面中正确显示识别结果,解决传统虚拟现实设备500不能准确显示识别结果的问题。
在上述实施例中,图像识别由虚拟现实设备500完成,由于虚拟现实设备500的计算能力和存储能力有限,因此还可以将图像识别过程交由其他设备处理,即在本申请的部分实施例中,还提供的VR场景图像识别方法,应用于虚拟现实设备500,所述虚拟现实设备500包括显示器、通信器以及控制器,其中,显示器被配置为显示用户界面;通信器被配置为连接服务器;如图18所示,所述方法包括以下步骤:
获取用户输入的用于启动图像识别的控制指令;
响应于所述控制指令,检测待识别图像的片源类型,所述片源类型包括2D片源、3D片源以及360全景片源;
通过所述通信器向所述服务器发送图像识别请求;
接收所述服务器反馈的识别结果;
按照所述待识别图像的片源类型在所述用户界面中显示所述识别结果。
本实施例与上述实施例的区别在于,本实施例在检测待识别图像的片源类型后,可以通过通信器向服务器发送图像识别请求,服务器在接收到图像识别请求后,可以反馈图像识别结果给虚拟现实设备500。
为了使服务器能够针对待识别图像反馈图像识别结果,虚拟现实设备500发送的图像识别请求中应带有待识别图像。在一些实施例中,虚拟现实设备500可以根据待识别图像的片源类型发送不同的图像识别请求,例如对于2D片源或360全景片源,发送的图像识别请求中附带待识别图像的片源原图;而对于3D片源,发送的图像识别请求中可以附带片源原图的左半侧图像。
本实施例将待识别图像发送至服务器进行图像识别,可以减轻虚拟现实设备500的数据处理量,并且使虚拟现实设备500无需维护多个识别模型,降低对虚拟现实设备500的配置要求。
基于上述VR场景图像识别方法,本申请的部分实施例中还提供的虚拟现实设备500,包括:显示器、通信器和控制器,其中,显示器被配置为显示用户界面;通信器被配置为连接服务器;控制器被配置为执行以下程序步骤:
获取用户输入的用于启动图像识别的控制指令;
响应于所述控制指令,检测待识别图像的片源类型,所述片源类型包括2D片源、3D片源以及360全景片源;
通过所述通信器向所述服务器发送图像识别请求;
接收所述服务器反馈的识别结果;
按照所述待识别图像的片源类型在所述用户界面中显示所述识别结果。
由以上技术方案可知,上述实施例提供的虚拟现实设备500,可以在虚拟现实设备500与服务器之间建立通信连接,从而在虚拟现实设备500获取用户输入的控制指令,并检测待识别图像的片源类型后,向服务器发送图像识别请求,以使服务器可以根据图像识别请求返回图像识别结果,虚拟现实设备500再按照待识别图像的片源类型在用户界面中显示识别结果。所述虚拟现实设备500可以将图像识别过程交由服务器完成,缓解虚拟现实设备500的处理负担,并能够在用户界面中正确显示识别结果,解决传统虚拟现实设备不能准确显示识别结果的问题。
本申请提供的实施例之间的相似部分相互参见即可,以上提供的具体实施方式只是本申请总的构思下的几个示例,并不构成本申请保护范围的限定。对于本领域的技术人员而言,在不付出创造性劳动的前提下依据本申请方案所扩展出的任何其他实施方式都属于本申请的保护范围。

Claims (10)

  1. A virtual reality device, comprising:
    a display, configured to display a user interface; and
    a controller, configured to:
    acquire a control instruction input by a user for starting image recognition;
    in response to the control instruction, detect a source type of an image to be recognized;
    generate a recognition result of the image to be recognized; and
    display the recognition result in the user interface according to the source type of the image to be recognized.
  2. The virtual reality device according to claim 1, wherein in the step of generating the recognition result of the image to be recognized, the controller is further configured to:
    if the source type of the image to be recognized is a first type, extract an original source image as the image to be recognized;
    perform image recognition on the original source image to generate a recognition result;
    if the source type of the image to be recognized is a second type, extract a half-side image of the source image corresponding to a left display or a right display as the image to be recognized; and
    perform image recognition on the half-side image of the source image to generate a recognition result.
  3. The virtual reality device according to claim 1, wherein in the step of generating the recognition result of the image to be recognized, the controller is further configured to:
    call a recognition model according to the source type of the image to be recognized;
    input the image to be recognized into the recognition model; and
    acquire a recognition result output by the recognition model.
  4. The virtual reality device according to any one of claims 1-3, wherein the recognition result includes a result marker and a position of the result marker relative to the image to be recognized;
    for different source types, the result marker is one or a combination of a recognition box, a recognition indicator point, a highlight marker, and a color-change marker; the position of the result marker is a specified point in the result marker region, including the coordinates of a shape vertex, a shape midpoint, or an indicator point.
  5. The virtual reality device according to claim 1, wherein in the step of displaying the recognition result in the user interface according to the source type of the image to be recognized, the controller is further configured to:
    set a result display area in the user interface according to the source type of the image to be recognized;
    extract coordinate parameters of the result display area in the user interface, the coordinate parameters including spatial position and region shape data; and
    perform coordinate mapping according to the coordinate parameters, so as to display the recognition result within the result display area.
  6. The virtual reality device according to claim 5, wherein in the step of setting the result display area in the user interface according to the source type of the image to be recognized, the controller is further configured to:
    if the source type of the image to be recognized is a 2D source of the first type or a 3D source of the second type, create a display panel in the user interface;
    if the source type of the image to be recognized is a 360-degree panoramic source of the first type, create a display sphere in the user interface.
  7. The virtual reality device according to claim 5, wherein in the step of performing coordinate mapping according to the coordinate parameters, the controller is further configured to:
    if the source type of the image to be recognized is a 2D source of the first type or a 3D source of the second type, extract a recognition marker position from the recognition result;
    acquire the spatial position of the result display area, the spatial position including the upper-left corner coordinates and the upper-right corner coordinates of the result display area; and
    calculate the upper-left corner coordinates and the upper-right corner coordinates of the recognition marker in the user interface according to the recognition marker position and the spatial position.
  8. The virtual reality device according to claim 5, wherein in the step of performing coordinate mapping according to the coordinate parameters, the controller is further configured to:
    if the source type of the image to be recognized is a 360-degree panoramic source of the first type, extract a recognition marker position from the recognition result;
    convert the recognition marker position into longitude and latitude;
    acquire the region shape data of the result display area, the region shape data including a display sphere radius; and
    calculate the position coordinates of the recognition marker in the user interface according to the longitude and latitude and the region shape data.
  9. A virtual reality device, comprising:
    a display, configured to display a user interface;
    a communicator, configured to connect to a server; and
    a controller, configured to:
    acquire a control instruction input by a user for starting image recognition;
    in response to the control instruction, detect a source type of an image to be recognized;
    send an image recognition request to the server through the communicator;
    receive a recognition result returned by the server; and
    display the recognition result in the user interface according to the source type of the image to be recognized.
  10. A VR scene image recognition method, applied to a virtual reality device, the method comprising:
    acquiring a control instruction input by a user for starting image recognition;
    in response to the control instruction, detecting a source type of an image to be recognized;
    generating a recognition result of the image to be recognized; and
    displaying the recognition result in a user interface according to the source type of the image to be recognized.
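As an illustration of the coordinate mapping recited in claims 7 and 8, the following Python sketch assumes that a recognition marker position is reported as normalized (u, v) coordinates in [0, 1] relative to the image to be recognized. The function names, the panel parameterization, and the sphere axis convention are assumptions for illustration, not part of the claims.

```python
import math

def map_to_panel(u, v, panel_left, panel_top, panel_w, panel_h):
    """Claim 7 sketch: map a normalized marker point onto a flat
    display panel placed in the user interface."""
    return (panel_left + u * panel_w, panel_top + v * panel_h)

def map_to_sphere(u, v, radius):
    """Claim 8 sketch: convert a normalized marker point to longitude
    and latitude, then to a 3D point on the display sphere."""
    lon = (u - 0.5) * 2.0 * math.pi    # longitude in [-pi, pi]
    lat = (0.5 - v) * math.pi          # latitude in [-pi/2, pi/2]
    x = radius * math.cos(lat) * math.sin(lon)
    y = radius * math.sin(lat)
    z = radius * math.cos(lat) * math.cos(lon)
    return (x, y, z)
```

A marker rectangle would be mapped by applying `map_to_panel` to each of its corner points; on the sphere, every mapped point lies at distance `radius` from the viewer's origin.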
PCT/CN2021/119318 2020-11-30 2021-09-18 Virtual reality device and VR scene image recognition method WO2022111005A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011379185.3 2020-11-30
CN202011379185.3A CN114299407A (zh) Virtual reality device and VR scene image recognition method

Publications (1)

Publication Number Publication Date
WO2022111005A1 (zh) 2022-06-02

Family

ID=80964382

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119318 WO2022111005A1 (zh) Virtual reality device and VR scene image recognition method

Country Status (2)

Country Link
CN (1) CN114299407A (zh)
WO (1) WO2022111005A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104656893A (zh) * 2015-02-06 2015-05-27 Northwestern Polytechnical University Remote interactive control system and method for cyber-physical space
CN106851240A (zh) * 2016-12-26 2017-06-13 NetEase (Hangzhou) Network Co., Ltd. Image data processing method and device
CN107018336A (zh) * 2017-04-11 2017-08-04 Tencent Technology (Shenzhen) Co., Ltd. Image processing method and device, and video processing method and device
CN110012284A (zh) * 2017-12-30 2019-07-12 Shenzhen Dlodlo New Technology Co., Ltd. Video playback method and device based on head-mounted device

Also Published As

Publication number Publication date
CN114299407A (zh) 2022-04-08


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21896509; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21896509; Country of ref document: EP; Kind code of ref document: A1)