WO2022166173A1 - Video resource processing method, apparatus, computer device, storage medium, and program - Google Patents

Video resource processing method, apparatus, computer device, storage medium, and program

Info

Publication number
WO2022166173A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
area
video resource
playback
resource
Prior art date
Application number
PCT/CN2021/114547
Other languages
English (en)
French (fr)
Inventor
李宇飞
张建博
Original Assignee
深圳市慧鲤科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市慧鲤科技有限公司
Publication of WO2022166173A1 publication Critical patent/WO2022166173A1/zh

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47202End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting content on demand, e.g. video on demand
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/485End-user interface for client configuration

Definitions

  • The present disclosure relates to the field of augmented reality technologies, and in particular, to a video resource processing method, apparatus, computer device, storage medium, and program.
  • In the related art, when playing a video, the video is generally played in a loop on a fixed display device (such as an electronic screen). During playback, the video may stutter due to network problems, and the playback effect is poor.
  • the embodiments of the present disclosure provide a video resource processing method, apparatus, computer device, storage medium, and program.
  • An embodiment of the present disclosure provides a video resource processing method, the method being executed by an electronic device and including: determining first pose information of an AR device based on a target scene image captured by the AR device in real time; determining a relative pose relationship between the AR device and a preset video playback area according to the first pose information and second pose information of the video playback area in a three-dimensional scene model corresponding to the target scene; and, in response to the relative pose relationship satisfying a preset condition, loading the video resource corresponding to the video playback area.
  • In this way, the video resource can be preloaded when the relative pose relationship between the AR device and the video playback area satisfies the preset condition.
  • The video playback area used for playing the video resource does not need to be carried on a physical playback device in the target scene, nor does it need to actually occupy location space in the target scene, so that real-world space and device resources can be saved.
  • In some embodiments, the relative pose relationship includes a relative distance.
  • The loading of the video resource corresponding to the video playback area in response to the relative pose relationship satisfying a preset condition includes:
  • loading the video resource corresponding to the video playback area in response to the relative distance between the AR device and the video playback area satisfying a preset condition.
  • In some embodiments, loading the video resource corresponding to the video playback area includes: loading the video resource corresponding to the video playback area when the relative distance between the AR device and the video playback area is less than a set distance.
  • In this way, loading of the video resource corresponding to the video playback area starts when the relative distance between the AR device and the video playback area is less than the set distance; preloading the video resource before the playback conditions are met improves the smoothness of video resource playback after the playback conditions are met.
  • The limitation on distance also reduces wasted loading of video resources to a certain extent (for example, cases where the playback conditions are never met after loading), and reduces unclear playback caused by the distance between the AR device and the video playback area being too far after loading is completed and the video playback conditions are met.
  • the method further includes:
  • The AR device determines whether the AR scene picture includes the video element corresponding to the video resource, and plays the video resource only when it does, so that the video resource appears in the picture of the AR device during playback, which reduces invalid playback of the video and improves resource utilization.
  • In some embodiments, playing the loaded video resource in the video playback area when the displayed AR scene picture includes the video element corresponding to the video resource includes:
  • when the displayed AR scene picture includes the video element, obtaining, in the relative pose relationship, the angle between the shooting direction of the AR device and the direction toward the video playback area;
  • when the included angle is within a set angle range, playing the loaded video resource in the video playback area.
  • In this way, the video resource is played only when the AR device is at a suitable viewing angle, which further improves the playback effect and resource utilization of the video resource and improves the user's viewing experience.
  • the method further includes:
  • The determining of the first pose information of the AR device based on the target scene image includes: determining the first pose information of the AR device based on the target scene image and a pre-constructed three-dimensional scene model corresponding to the target scene.
  • In this way, based on the three-dimensional scene model of the target scene where the AR device is located, scene images of the AR device under various pose information of the target scene can be obtained, and the first pose information of the AR device can be determined by matching the target scene image acquired by the AR device in real time with the three-dimensional scene model.
  • In some embodiments, the video playback area includes at least one of the following: a video playback area located on at least one target display object in the target scene, and a video playback area corresponding to a virtual playback device located in the target scene. In this way, video resources can be preloaded for various types of video playback areas.
  • the loading of the video resource corresponding to the video playback area includes:
  • loading, according to the area identification information corresponding to the video playback area, the video resource bound to the area identification information.
  • In this way, the video resource identifier corresponding to the area identification information can be looked up based on the area identification information corresponding to the video playback area, and the video resource corresponding to the found video resource identifier is then loaded.
  • In some embodiments, loading the video resource bound to the area identification information according to the area identification information corresponding to the video playback area includes: determining multiple video resources bound to the area identification information, and selecting, according to the playback time periods corresponding to the multiple video resources, a video resource corresponding to the current time to load.
  • In this way, in the same video playback area, different video resources can be set to be played in different time periods, thereby enriching the displayed video resources.
  • Embodiments of the present disclosure also provide a video resource processing apparatus, including:
  • a first determining module configured to determine first pose information of an AR device based on a target scene image captured by the AR device in real time;
  • a second determining module configured to determine a relative pose relationship between the AR device and a preset video playback area according to the first pose information and second pose information of the video playback area in a three-dimensional scene model corresponding to the target scene;
  • a loading module configured to load a video resource corresponding to the video playing area in response to the relative pose relationship satisfying a preset condition.
  • the relative pose relationship includes a relative distance
  • the loading module, when loading the video resource corresponding to the video playback area in response to the relative pose relationship satisfying a preset condition, is configured to: load the video resource corresponding to the video playback area in response to the relative distance between the AR device and the video playback area satisfying a preset condition.
  • the loading module, when loading the video resource corresponding to the video playback area in response to the relative distance between the AR device and the video playback area satisfying a preset condition, is configured to: load the video resource corresponding to the video playback area when the relative distance between the AR device and the video playback area is less than a set distance.
  • In some embodiments, the apparatus further includes a playback module configured to: after the video resource corresponding to the video playback area is loaded, determine the video element corresponding to the video resource; and play the loaded video resource in the video playback area when the displayed AR scene picture includes the video element.
  • the playback module, when playing the loaded video resource in the video playback area in the case that the displayed AR scene picture includes the video element, is configured to: determine the area occupied by the video element contained in the AR scene picture; and play the loaded video resource in the video playback area when the proportion of that area in the total area of the video playback area is greater than or equal to a set ratio.
  • the playback module, when playing the loaded video resource in the video playback area in the case that the displayed AR scene picture includes the video element, is further configured to: obtain, in the relative pose relationship, the angle between the shooting direction of the AR device and the direction toward the video playback area; and play the loaded video resource in the video playback area when the included angle is within a set angle range.
  • In some embodiments, the video resource processing apparatus further includes a playback control module configured to: after the loaded video resource is played in the video playback area, stop playing the loaded video resource in the video playback area when the proportion of the area occupied by the video element contained in the AR scene picture in the total area of the video playback area is less than the set ratio.
  • the first determining module, when determining the first pose information of the AR device based on the target scene image captured by the AR device in real time, is configured to: determine the first pose information of the AR device based on the target scene image and a pre-constructed three-dimensional scene model corresponding to the target scene.
  • In some embodiments, the video playback area includes at least one of the following: a video playback area located on at least one target display object in the target scene, and a video playback area corresponding to a virtual playback device located in the target scene.
  • the loading module, when loading the video resource corresponding to the video playback area, is configured to: load, according to the area identification information corresponding to the video playback area, the video resource bound to the area identification information.
  • the loading module, when loading the video resource bound to the area identification information according to the area identification information corresponding to the video playback area, is configured to: determine multiple video resources bound to the area identification information, and select, according to the playback time periods corresponding to the multiple video resources, a video resource corresponding to the current time to load.
  • Embodiments of the present disclosure further provide a computer device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor; when the computer device is running, the processor and the memory communicate through the bus, and when the machine-readable instructions are executed by the processor, the video resource processing method described in any of the foregoing embodiments is performed.
  • Embodiments of the present disclosure further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the video resource processing method described in any of the foregoing embodiments is executed.
  • Embodiments of the present disclosure further provide a computer program, where the computer program includes computer-readable code, and when the computer-readable code is run in an electronic device, the processor of the electronic device performs the video resource processing method described in any of the foregoing embodiments.
  • In summary, the embodiments of the present disclosure provide at least a video resource processing method, apparatus, computer device, storage medium, and program, which can preload video resources when the relative pose relationship between the AR device and the video playback area satisfies a preset condition. The video playback area used to play video resources does not need to be carried on a physical playback device in the target scene, nor does it need to actually occupy location space in the target scene, so that real-world space and device resources can be saved.
  • FIG. 1 shows a flowchart of a video resource processing method provided by an embodiment of the present disclosure
  • FIG. 2 shows a schematic diagram of a system architecture of a video resource processing method provided by an embodiment of the present disclosure
  • FIG. 3 shows a schematic diagram of a relative orientation angle in a relative pose relationship provided by an embodiment of the present disclosure
  • FIG. 4 shows a schematic diagram of the architecture of a video resource processing apparatus provided by an embodiment of the present disclosure
  • FIG. 5 shows a schematic structural diagram of a computer device 500 provided by an embodiment of the present disclosure.
  • Embodiments of the present disclosure provide a video resource processing method, apparatus, computer device, storage medium, and program, which can preload video resources when the relative pose relationship between an AR device and a video playback area satisfies a preset condition.
  • the video playback area used to play the video resources may not need to be carried on the physical playback device in the target scene, nor need to actually occupy the location space in the target scene, thereby saving actual location space resources and device resources.
  • When video playback is needed, the preloaded video resources are played directly. Since the video resources are preloaded locally, the influence of the network environment on video loading can be reduced, stuttering during video playback can be reduced, and the smoothness of video playback is improved.
  • In the related art, the video resource is loaded from the server while being played, which is easily affected by the network environment. For example, if the current network status of the AR device is poor, it may not be possible to load the video resource to be played from the server in time, resulting in stuttering during playback, which in turn affects the video playback effect.
  • The solution of the embodiments of the present disclosure loads the video resources based on the pose relationship; when the condition is met, the video resources to be played can be loaded at one time, so the smoothness of video playback can be improved.
  • the execution subject of the video resource processing method provided by the embodiment of the present disclosure is generally a computer device with a certain computing capability. It can be a terminal device or other processing device.
  • AR devices may include, for example, AR glasses, tablet computers, smartphones, smart wearable devices, and other devices having display and data processing functions. AR devices can be connected to cloud servers through applications.
  • FIG. 1 is a flowchart of a video resource processing method provided by an embodiment of the present disclosure, the method includes steps S101 to S103, wherein:
  • S101 Determine the first pose information of the AR device based on the target scene image captured by the AR device in real time;
  • S102 Determine the relative pose between the AR device and the video playback area according to the first pose information and the second pose information of the preset video playback area in the 3D scene model corresponding to the target scene relation;
  • In the embodiments of the present disclosure, it is possible to preload video resources when the relative pose relationship between the AR device and the video playback area satisfies a preset condition. That is, the video playback area used to play video resources does not need to be carried on a physical playback device in the target scene, nor does it need to actually occupy location space in the target scene, so that real-world space and device resources can be saved.
  • In addition, by preloading the video resource corresponding to the video playback area, the video can be played when needed.
  • That is, the preloaded video resources are played directly. Since the video resources are preloaded locally, the impact of the network environment on video loading can be reduced, stuttering during video playback can be reduced, and the smoothness of video playback can be improved.
  • FIG. 2 is a schematic diagram of a system architecture to which a video resource processing method according to an embodiment of the present disclosure can be applied; as shown in FIG. 2 , the system architecture includes: an AR device 201 , a network 202 , and a control terminal 203 .
  • the AR device 201 and the control terminal 203 establish a communication connection through the network 202
  • the AR device 201 reports the target scene image to the control terminal 203 through the network 202
  • the control terminal 203, in response to the target scene image, determines the first pose information of the AR device; then, according to the first pose information and the second pose information of the preset video playback area in the three-dimensional scene model corresponding to the target scene, determines the relative pose relationship between the AR device and the video playback area; and, in response to the relative pose relationship satisfying a preset condition, loads the video resource corresponding to the video playback area.
  • the control terminal 203 then uploads the loaded video resource to the network 202 and sends it to the AR device 201 through the network 202.
  • In this way, the video resources can be preloaded when the relative pose relationship between the AR device and the video playback area satisfies the preset conditions, and the video playback area used to play the video resources does not need to be carried on a physical playback device in the target scene.
  • control terminal 203 may include a visual processing device or a remote server with visual information processing capabilities.
  • Network 202 may employ wired or wireless connections.
  • When the control terminal 203 is a visual processing device, the AR device 201 can be connected to the visual processing device through a wired connection, for example performing data communication through a bus; when the control terminal 203 is a remote server, the AR device 201 can perform data interaction with the remote server through a wireless network.
  • the AR device 201 may be a visual processing device with a video capture module, or a host with a camera.
  • the video resource processing method according to the embodiment of the present disclosure may be executed by the AR device 201 , and the above-mentioned system architecture may not include the network 202 and the control terminal 203 .
  • the target scene image may be an image of a real scene acquired by the AR device in real time. Wherein, when the AR device captures the target scene image, it may be captured by the AR device after the user triggers the capture button of the AR device, or after the AR device is activated.
  • The determining of the first pose information of the AR device based on the target scene image captured by the AR device in real time may be performed by the AR device itself based on the target scene image;
  • alternatively, the AR device may send the target scene image to a server, the server determines the first pose information of the AR device based on the target scene image, and the AR device then obtains the determined first pose information from the server.
  • When determining the first pose information, the AR device may determine, based on the target scene image captured in real time, the first pose information of the AR device in a scene coordinate system established based on the scene corresponding to the target scene image.
  • the scene coordinate system may be a three-dimensional coordinate system
  • the coordinate origin of the scene coordinate system may be any point in the target scene corresponding to the scene coordinate system.
  • The first pose information and the second pose information in the scene coordinate system may be respectively determined based on the coordinate origin, and the relative pose information is then determined based on the first pose information and the second pose information.
  • Any point among the position points corresponding to the first pose information and the second pose information may be selected as the coordinate origin.
  • In this way, determining the relative pose information is simpler.
  • For example, the position point of the video playback area in the three-dimensional scene model can be used as the coordinate origin of the scene coordinate system corresponding to the real scene image, so that the first pose information of the AR device determined based on this coordinate origin is the relative pose information between the AR device and the video playback area.
  • the pose information may include position information and attitude information, that is, three-dimensional coordinates and orientations in the scene coordinate system.
  • When determining, based on the target scene image captured by the AR device in real time, the first pose information of the AR device in the scene coordinate system established based on the scene corresponding to the target scene image, either of the following methods may be used:
  • First, the position information of multiple target detection points in the target scene corresponding to the target scene image can be detected, the target pixel point corresponding to each target detection point in the target scene image can be determined, and the depth information of each target pixel point can then be determined (for example, by performing depth detection on the target scene image); finally, the first pose information of the AR device is determined based on the depth information of the target pixel points.
  • the target detection point may be a preset position point in the scene where the AR device is located, such as a cup, a fan, a water dispenser, and the like.
  • the depth information of the target pixel can be used to represent the distance between the target detection point corresponding to the target pixel and the image acquisition device of the AR device.
  • the position coordinates of the target detection point in the scene coordinate system are preset and fixed.
  • The orientation in the first pose information can be determined from the coordinate information of the target detection points and the target pixel points corresponding to the target detection points in the scene image, and the position information of the AR device is determined based on the depth values of the target pixel points corresponding to the target detection points, so that the first pose information of the AR device can be determined.
  • Second, the first pose information can be determined based on the three-dimensional scene model of the target scene where the AR device is located.
  • The target scene image acquired by the AR device in real time can be matched with a pre-built three-dimensional scene model of the target scene where the AR device is located, and the first pose information of the AR device is then determined based on the matching result.
  • Based on the three-dimensional scene model, scene images of the AR device under various pose information of the target scene can be obtained, and by matching the real-time target scene image against the model, the first pose information of the AR device can be obtained.
  • In some embodiments, the video playback area includes at least one of the following: a video playback area located on at least one target display object in the target scene, and a video playback area corresponding to a virtual playback device located in the target scene.
  • the target display object may be a real object in the target scene, such as a billboard, a building, etc.
  • the video playback area corresponding to the virtual playback device may be a virtual TV set/virtual display screen with a display function.
  • The pose information of the video playback area in the three-dimensional scene model has been determined in advance, so the second pose information of the video playback area in the three-dimensional scene model corresponding to the target scene can be regarded as predefined and fixed during the process of the AR device displaying the AR scene picture.
  • The relative pose relationship between the AR device and the video playback area may be determined by determining the relative distance and the relative orientation angle of the AR device relative to the video playback area.
  • The relative distance of the AR device relative to the video playback area may be determined according to the position information in the first pose information and the position information in the second pose information; the relative orientation angle of the AR device relative to the video playback area may be the angle between the shooting direction of the AR device and the direction toward the video playback area.
  • FIG. 3 shows a schematic diagram of the relative orientation angle in the relative pose relationship provided by the embodiment of the present disclosure.
  • As shown in FIG. 3, the relative orientation angle is the angle formed by the horizontal extension line of the orientation of the AR device and the extension line toward the video playback area.
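  • A minimal sketch of computing this relative pose relationship (relative distance and relative orientation angle) from the first and second pose information is given below; the Pose structure and its field names are assumptions made for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Pose:
    position: tuple  # (x, y, z) in the scene coordinate system
    forward: tuple   # unit vector of the shooting direction (for the AR device)

def relative_pose_relationship(device: Pose, area: Pose):
    """Return (relative distance, relative orientation angle in degrees)."""
    # Relative distance: Euclidean distance between the two position points.
    dx, dy, dz = (a - d for a, d in zip(area.position, device.position))
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    if distance == 0.0:
        return 0.0, 0.0

    # Unit vector pointing from the AR device toward the video playback area.
    to_area = (dx / distance, dy / distance, dz / distance)

    # Angle between the device's shooting direction and the direction toward the area.
    dot = sum(f * t for f, t in zip(device.forward, to_area))
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot))))
    return distance, angle
```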
  • The loading of the video resource corresponding to the video playback area in response to the relative pose relationship satisfying a preset condition may be performed as follows: when it is detected that the relative distance between the AR device and the video playback area satisfies the preset condition, the video resource corresponding to the video playback area is loaded.
  • the relative distance between the AR device and the video playback area satisfies a preset condition may be that the relative distance between the AR device and the video playback area is less than a set distance.
  • the set distance may be set according to the recognition accuracy of the AR device and the acquisition range of the image acquisition device.
  • the set distance may be set to 2 meters, and when the relative distance between the AR device and the video playing area is less than 2 meters, the video resource corresponding to the video playing area is loaded.
  • In this way, when the relative distance between the AR device and the video playback area is less than the set distance, loading of the video resources corresponding to the video playback area is started.
  • On the one hand, preloading the video resources before the playback conditions are met can improve the smoothness of video resource playback after the playback conditions are met.
  • On the other hand, the limitation on distance can reduce wasted loading of video resources (for example, cases where the playback conditions are never met after loading), and can reduce unclear playback caused by the AR device being too far from the video playback area after loading is completed and the video playback conditions are met.
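  • For illustration, a hedged sketch of this distance-triggered preloading decision (using the 2-meter set distance from the example above) might look as follows; load_video_resource stands in for whatever fetching/caching routine an implementation uses.

```python
SET_DISTANCE_M = 2.0  # example set distance from the text

def maybe_preload(relative_distance_m, area_id, loaded_areas, load_video_resource):
    """Preload at most once per playback area when the distance condition is met."""
    if relative_distance_m < SET_DISTANCE_M and area_id not in loaded_areas:
        load_video_resource(area_id)   # fetch/cache the bound video resource locally
        loaded_areas.add(area_id)
```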
  • The video resource corresponding to the video playback area may be obtained from the server.
  • The video resource corresponding to the video playback area may be loaded on the AR device in advance of playback, so that the AR device directly determines whether to play the video resource corresponding to the video playback area, and the playback process of the video resource is also directly controlled by the AR device. Compared with obtaining the video resource from the server during playback, this approach can improve the continuity of the video playback process and enhance the user's viewing experience.
  • the video resource bound to the area identification information may be loaded according to the area identification information corresponding to the video playing area.
  • the region identification information may be set in advance, and the region identification information is used to distinguish different video playback regions, and different video playback regions may correspond to different video resources.
  • The AR device may store a mapping relationship between the area identification information and video resource identifiers. When it is determined that the relative pose information between any video playback area and the AR device satisfies a preset condition, the video resource identifier corresponding to the area identification information of that video playback area can be looked up from the mapping relationship, and the video resource corresponding to the found video resource identifier is then loaded.
  • one of the video playback areas corresponds to at least one video resource.
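  • A minimal, purely illustrative sketch of such a mapping between area identification information and bound video resource identifiers is shown below; the identifiers are hypothetical.

```python
# Mapping from area identification information to the bound video resource identifiers.
AREA_TO_RESOURCE_IDS = {
    "area_entrance": ["video_a"],
    "area_hall":     ["video_b", "video_c"],  # one area may bind several resources
}

def resources_bound_to_area(area_id):
    """Look up the video resource identifiers bound to a playback area."""
    return AREA_TO_RESOURCE_IDS.get(area_id, [])
```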
  • When loading the video resources corresponding to the video playback area, because the storage capacity of the AR device is limited, the video resources can be loaded according to preset loading conditions.
  • the loading condition may be any one of user instructions, relative pose relationship, current time and other conditions.
  • In a case where the loading condition includes a user instruction, when loading the video resource corresponding to the video playback area, the AR device may first display a playlist containing the video resource identifiers corresponding to the video playback area, and then load the video resource corresponding to a selection instruction made by the user for the playlist.
  • When displaying the playlist on the AR device, the playlist can be superimposed on a preset position of the target scene image for display, and the user can generate a selection instruction for any video resource identifier by triggering the AR device.
  • For example, the user may trigger the screen of the AR device to generate a selection instruction for any video resource identifier; alternatively, the user may make a target gesture, and a selection instruction for the video resource identifier pointed to by the target gesture can be generated based on the target gesture.
  • In a case where the loading condition includes the relative pose relationship, among the multiple video resources corresponding to the area identifier of the video playback area, a video resource that matches the relative distance in the relative pose relationship can be loaded.
  • The set distance may be divided into different distance ranges, with different distance ranges corresponding to different video resources; the target distance range to which the relative distance in the relative pose relationship belongs is then determined, and the video resource corresponding to the target distance range is loaded.
  • For example, a distance range of 0 to 2 meters and a distance range of 2 to 5 meters can be defined, where the video resource corresponding to the 0-to-2-meter range is video resource A and the video resource corresponding to the 2-to-5-meter range is video resource B. If the relative distance in the relative pose relationship is 2 meters, video resource A can be loaded.
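  • The distance-range selection in the example above could be sketched as follows (illustrative only; ranges are checked nearest-first so that a relative distance of exactly 2 meters maps to video resource A, as in the example).

```python
# (lower bound in meters, upper bound in meters, bound video resource identifier)
DISTANCE_RANGES = [
    (0.0, 2.0, "video_resource_A"),
    (2.0, 5.0, "video_resource_B"),
]

def resource_for_distance(relative_distance_m):
    for lower, upper, resource_id in DISTANCE_RANGES:
        if lower <= relative_distance_m <= upper:
            return resource_id  # nearest-first order: exactly 2 m returns video_resource_A
    return None  # outside all configured ranges: nothing to load
```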
  • In a case where the loading condition includes the current time, the playback time period corresponding to each of the multiple video resources bound to the area identification information may be determined first, and then, according to the playback time periods corresponding to the multiple video resources, the video resource corresponding to the current time is selected from among them to be loaded.
  • the video resource corresponding to the current time may be the video resource corresponding to the playback time period to which the current time belongs.
  • the video resource bound to the region identification information includes video resource A, video resource B, and video resource C
  • the corresponding playback time periods are 10:00 to 12:00, 14:00 to 16:00, and 17:00 to 19:00. If the current time is 11:00, the video resource A is loaded.
  • Alternatively, the playback time period closest to the current time may be determined, and the video resource corresponding to that playback time period is determined as the video resource corresponding to the current time.
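  • A hedged sketch of this time-period selection, using the example schedule above (video resources A, B, and C with their playback time periods), might look like this; the fallback to the closest time period follows the sentence above.

```python
from datetime import time, datetime

# (start, end, video resource identifier), taken from the example in the text.
SCHEDULE = [
    (time(10, 0), time(12, 0), "video_resource_A"),
    (time(14, 0), time(16, 0), "video_resource_B"),
    (time(17, 0), time(19, 0), "video_resource_C"),
]

def resource_for_current_time(now=None):
    now = (now or datetime.now()).time()
    for start, end, resource_id in SCHEDULE:
        if start <= now <= end:
            return resource_id              # e.g. 11:00 -> video_resource_A

    # Fallback: pick the resource whose playback period is closest to the current time.
    def seconds(t):
        return t.hour * 3600 + t.minute * 60 + t.second
    def gap(entry):
        start, end, _ = entry
        return min(abs(seconds(now) - seconds(start)), abs(seconds(now) - seconds(end)))
    return min(SCHEDULE, key=gap)[2]
```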
  • In a case where the loading conditions are not met, the video resource may not be loaded, and the target scene image captured by the AR device in real time may be displayed directly.
  • After loading the video resource corresponding to the video playback area, the loaded video resource may be played in the video playback area when the displayed AR scene picture includes the video element corresponding to the video resource.
  • The AR scene picture corresponding to the first pose information of the AR device may be acquired and displayed on the AR device; the case where the displayed AR scene picture includes the video element corresponding to the video resource may be any of the following conditions:
  • Condition 1: the area occupied by the video element contained in the AR scene picture can be determined; when the proportion of this area in the total area of the video playback area is greater than or equal to a set ratio, the loaded video resource is played in the video playback area.
  • For example, the set ratio may be 50%, that is, the loaded video resource is played in the video playback area when the area of the video playback area included in the AR scene picture accounts for 50% or more of the total area of the video playback area.
  • Condition 2: the video element corresponding to the video resource can be determined; when pixels of the video element are detected in the AR scene picture, the loaded video resource is played directly in the video playback area.
  • That pixels of the video element are detected in the AR scene picture means that the video element corresponding to the video resource has been rendered in the AR scene picture; in this case, the loaded video resource can be played directly.
  • Condition 3: when it is detected in the AR scene picture that the number of pixels occupied by the video element exceeds a preset value, the loaded video resource can be played in the video playback area.
  • For example, the preset value can be set to 200, that is, the loaded video resource is played in the video playback area when the number of pixels occupied by the video playback area included in the target scene image is greater than or equal to 200 pixel units.
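  • For illustration, the sketch below checks the area-ratio and pixel-count variants of the playback conditions above for one AR frame; element_mask is assumed to be a boolean mask of the pixels the rendered video element occupies in the AR scene picture, and the thresholds are the 50% and 200-pixel examples given above. In the text these conditions are alternatives, so meeting either one is treated as sufficient here.

```python
import numpy as np

SET_RATIO = 0.5        # Condition 1 example: >= 50% of the playback area's total area
PIXEL_THRESHOLD = 200  # Condition 3 example: at least 200 pixel units visible

def should_play(element_mask: np.ndarray, playback_area_total_px: int) -> bool:
    visible_px = int(element_mask.sum())
    if visible_px == 0:                      # no pixel of the video element rendered yet
        return False
    area_ratio = visible_px / playback_area_total_px
    return area_ratio >= SET_RATIO or visible_px >= PIXEL_THRESHOLD
```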
  • In this way, the AR device determines whether the AR scene picture includes the video element corresponding to the video resource, and plays the video resource only when it does, so that the video resource appears in the picture of the AR device during playback, which reduces invalid playback of the video and improves resource utilization.
  • In some embodiments, playing the loaded video resource in the video playback area may also be performed as follows: in the case that the displayed AR scene picture includes the video element corresponding to the video resource, the included angle between the shooting direction of the AR device and the direction toward the video playback area is determined in the relative pose relationship, and the loaded video resource is played in the video playback area when the included angle is within a set angle range.
  • In a case where the displayed AR scene picture includes the video element corresponding to the video resource but, in the relative pose relationship, the included angle between the shooting direction of the AR device and the direction toward the video playback area is not within the set angle range, the video element corresponding to the video resource can be rendered in the AR scene picture (for example, the cover of the video resource can be displayed) without playing the video resource.
  • Since the AR device may move at any time, the proportion of the area occupied by the video element included in the AR scene picture in the total area of the video playback area can be continuously detected. When the proportion of the area occupied by the video element included in the AR scene picture in the total area of the video playback area is less than the set ratio, playing of the loaded video resource in the video playback area is stopped.
  • For example, when the area occupied by the video elements included in the AR scene picture accounts for less than 50% of the total area of the video playback area, playback in the video playback area may be stopped.
  • preset control buttons are also displayed in the AR scene screen.
  • The user can control pause/play of the video resource through the preset control buttons in the AR device while the video resource is being played.
  • A control button is also displayed in the corresponding video playback area in the AR device, which is used to control pause/play of the video resource in response to the user's triggering operation on the control button.
  • For example, when the AR device detects that the control button is double-clicked, it pauses playback of the video resource; when the AR device detects that the control button is long-pressed, it resumes playing the video resource.
  • In addition, the gesture made by the user can be detected in real time in the captured target scene image, and when a target gesture is detected, the video resource played in the video playback area can be paused.
  • When detecting the gesture, the position information of each preset position point of the hand in the target scene image can be detected, the relative positional relationship between the preset position points is determined based on their position information in the target scene image, and the gesture made by the user in the target scene image is then recognized based on the determined relative positional relationship.
  • The preset position points of the hand may be the fingertip and joint points of each finger, the wrist, and so on.
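  • A heavily hedged sketch of such a keypoint-based gesture check is given below; the choice of an open-palm pose as the target pause gesture, the keypoint names, and the 20-pixel margin are assumptions for illustration only, since the disclosure does not fix a specific gesture.

```python
def is_pause_gesture(keypoints):
    """keypoints: dict like {'wrist': (x, y), 'thumb_tip': (x, y), ...} in image coordinates."""
    wrist_y = keypoints["wrist"][1]
    fingertip_keys = ["thumb_tip", "index_tip", "middle_tip", "ring_tip", "pinky_tip"]
    # Image y grows downward, so "above the wrist" means a smaller y value.
    return all(keypoints[k][1] < wrist_y - 20 for k in fingertip_keys)

def on_frame(keypoints, player):
    # Pause the video resource played in the playback area when the target gesture is seen.
    if keypoints and is_pause_gesture(keypoints):
        player.pause()
```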
  • the video resources can be preloaded when the relative pose relationship between the AR device and the video playback area satisfies the preset conditions.
  • In this way, the video playback area used to play the video resources does not need to be carried on a physical playback device in the target scene, nor does it need to actually occupy location space in the target scene, so that real-world space and device resources can be saved.
  • In addition, by preloading the video resources corresponding to the video playback area, playback can be performed when needed.
  • That is, the preloaded video resources are played directly. Since the video resources are preloaded locally, the impact of the network environment on video loading can be reduced, stuttering during video playback can be reduced, and the smoothness of video playback can be improved.
  • An embodiment of the present disclosure also provides a video resource processing apparatus corresponding to the video resource processing method.
  • For the implementation of the apparatus, reference may be made to the implementation of the method; repeated descriptions are omitted.
  • the video resource processing apparatus 400 includes: a first determination module 401 , a second determination module 402 , and a loading module 403 ; wherein,
  • the first determination module 401 is configured to determine the first pose information of the AR device based on the target scene image captured by the AR device in real time;
  • the second determining module 402 is configured to determine the relative pose relationship between the AR device and the video playback area according to the first pose information and the second pose information of the preset video playback area in the three-dimensional scene model corresponding to the target scene;
  • the loading module 403 is configured to load a video resource corresponding to the video playing area in response to the relative pose relationship satisfying a preset condition.
  • the relative pose relationship includes a relative distance
  • the loading module 403, when loading the video resource corresponding to the video playback area in response to the relative pose relationship satisfying a preset condition, is configured to: load the video resource corresponding to the video playback area in response to the relative distance between the AR device and the video playback area satisfying a preset condition.
  • the loading module 403, when loading the video resource corresponding to the video playback area in response to the relative distance between the AR device and the video playback area satisfying a preset condition, is configured to: load the video resource corresponding to the video playback area when the relative distance between the AR device and the video playback area is less than a set distance.
  • In some embodiments, the apparatus further includes a playback module 404 configured to: after the video resource corresponding to the video playback area is loaded, determine the video element corresponding to the video resource; and play the loaded video resource in the video playback area when the displayed AR scene picture includes the video element.
  • the playback module 404, when playing the loaded video resource in the video playback area in the case that the displayed AR scene picture includes the video element, is configured to: determine the area occupied by the video element contained in the AR scene picture; and play the loaded video resource in the video playback area when the proportion of that area in the total area of the video playback area is greater than or equal to a set ratio.
  • the playback module 404, when playing the loaded video resource in the video playback area in the case that the displayed AR scene picture includes the video element, is further configured to: obtain, in the relative pose relationship, the angle between the shooting direction of the AR device and the direction toward the video playback area; and play the loaded video resource in the video playback area when the included angle is within a set angle range.
  • In some embodiments, the video resource processing apparatus 400 further includes a playback control module 405 configured to: after the loaded video resource is played in the video playback area, stop playing the loaded video resource in the video playback area when the proportion of the area occupied by the video element contained in the AR scene picture in the total area of the video playback area is less than the set ratio.
  • the first determination module 401, when determining the first pose information of the AR device based on the target scene image captured by the AR device in real time, is configured to: determine the first pose information of the AR device based on the target scene image and a pre-constructed three-dimensional scene model corresponding to the target scene.
  • In some embodiments, the video playback area includes at least one of the following: a video playback area located on at least one target display object in the target scene, and a video playback area corresponding to a virtual playback device located in the target scene.
  • the loading module 403, when loading the video resource corresponding to the video playback area, is configured to: load, according to the area identification information corresponding to the video playback area, the video resource bound to the area identification information.
  • the loading module 403, when loading the video resource bound to the area identification information according to the area identification information corresponding to the video playback area, is configured to: determine multiple video resources bound to the area identification information, and select, according to the playback time periods corresponding to the multiple video resources, a video resource corresponding to the current time to load.
  • In this way, the video resources can be preloaded when the relative pose relationship between the AR device and the video playback area satisfies the preset conditions. That is, the video playback area used for playing the video resources does not need to be carried on a physical playback device in the target scene, nor does it need to actually occupy location space in the target scene, so that real-world space and device resources can be saved.
  • In addition, by preloading the video resources corresponding to the video playback area, the preloaded video resources can be played directly when video playback is needed. Since the video resources are preloaded locally, the impact of the network environment on video loading can be reduced, stuttering during video playback can be reduced, and the smoothness of video playback can be improved.
  • a schematic structural diagram of a computer device 500 provided by an embodiment of the present disclosure includes a processor 501 , a memory 502 , and a bus 503 .
  • The memory 502 is used to store execution instructions and includes an internal memory 5021 and an external memory 5022. The internal memory 5021 is used to temporarily store operation data in the processor 501 and data exchanged with an external memory 5022 such as a hard disk; the processor 501 exchanges data with the external memory 5022 through the internal memory 5021.
  • When the computer device 500 is running, the processor 501 communicates with the memory 502 through the bus 503, so that the processor 501 executes the following instructions: determining first pose information of an AR device based on a target scene image captured by the AR device in real time; determining a relative pose relationship between the AR device and a preset video playback area according to the first pose information and second pose information of the video playback area in a three-dimensional scene model corresponding to the target scene; and, in response to the relative pose relationship satisfying a preset condition, loading the video resource corresponding to the video playback area.
  • Embodiments of the present disclosure further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the video resource processing method described in the foregoing method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • The computer program product of the video resource processing method provided by the embodiments of the present disclosure includes a computer-readable storage medium storing program code, and the instructions included in the program code can be used to execute the steps of the video resource processing method described in the above method embodiments; for details, reference may be made to the above method embodiments, which are not repeated here.
  • An embodiment of the present disclosure further provides a computer program, which implements any one of the video resource processing methods in the foregoing embodiments when the computer program is executed by a processor.
  • the computer program product can be implemented in hardware, software or a combination thereof.
  • the computer program product is embodied as a computer storage medium, and in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK) and the like.
  • the present disclosure relates to the field of augmented reality.
  • The relevant features, states and attributes of the target object can be detected or recognized with the help of various vision-related algorithms, so as to obtain an AR effect combining the virtual and the real that matches the specific application.
  • The target object may involve faces, limbs, gestures, or movements related to the human body, or markers and objects related to venues or places, or sandboxes, display areas, or display items.
  • Vision-related algorithms may involve visual localization, SLAM, 3D reconstruction, image registration, background segmentation, object keypoint extraction and tracking, object pose or depth detection, etc.
  • The specific application may involve not only interactive scenarios related to real scenes or items, such as navigation, guided tours, explanation, reconstruction, and virtual-effect overlay display, but also special-effect processing related to people, such as makeup beautification, body beautification, special-effect display, virtual model display, and other interactive scenarios.
  • the relevant features, states and attributes of the target object can be detected or recognized through the convolutional neural network.
  • the above convolutional neural network is a network model obtained by model training based on a deep learning framework.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a processor-executable non-volatile computer-readable storage medium.
  • The computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present disclosure.
  • The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or other media that can store program code.
  • Embodiments of the present disclosure provide a video resource processing method, apparatus, computer device, storage medium, and program.
  • The method includes: determining first pose information of an AR device based on a target scene image captured by the AR device in real time; determining a relative pose relationship between the AR device and a preset video playback area according to the first pose information and second pose information of the video playback area in a three-dimensional scene model corresponding to the target scene; and, in response to the relative pose relationship satisfying a preset condition, loading the video resource corresponding to the video playback area.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The present disclosure provides a video resource processing method, apparatus, computer device, storage medium, and program. The method includes: determining first pose information of an AR device based on a target scene image captured by the AR device in real time; determining a relative pose relationship between the AR device and a preset video playback area according to the first pose information and second pose information of the video playback area in a three-dimensional scene model corresponding to a target scene; and, in response to the relative pose relationship satisfying a preset condition, loading a video resource corresponding to the video playback area. In this way, the video resource can be preloaded when the relative pose relationship between the AR device and the video playback area satisfies the preset condition. The video playback area used to play the video resource therefore does not need to be carried on a physical playback device in the target scene, nor does it need to actually occupy location space in the target scene, so that real-world space and device resources can be saved. In addition, by preloading the video resource corresponding to the video playback area, the preloaded video resource can be played directly when video playback is needed; since the video resource is preloaded locally, the influence of the network environment on video loading can be reduced, stuttering during playback can be reduced, and the smoothness of video playback can be improved.

Description

Video resource processing method, apparatus, computer device, storage medium, and program
Cross-Reference to Related Applications
This patent application claims priority to Chinese Patent Application No. 202110145949.0, filed on February 2, 2021 by 深圳市慧鲤科技有限公司 and entitled "Video resource processing method and apparatus, computer device and storage medium", which is incorporated herein by reference.
Technical Field
The present disclosure relates to the field of augmented reality technologies, and in particular, to a video resource processing method, apparatus, computer device, storage medium, and program.
Background
In places with heavy foot traffic, such as scenic spots, exhibition halls, and stations, there is often a need to play videos for publicity and other purposes.
In the related art, videos are generally played in a loop on a fixed display device (such as an electronic screen). On the one hand, this approach occupies a physical playback device and actual location space resources; on the other hand, during playback the video may stutter due to network problems, and the playback effect is poor.
Summary
Embodiments of the present disclosure provide a video resource processing method, apparatus, computer device, storage medium, and program.
An embodiment of the present disclosure provides a video resource processing method, the method being executed by an electronic device and including:
determining first pose information of an Augmented Reality (AR) device based on a target scene image captured by the AR device in real time;
determining a relative pose relationship between the AR device and a preset video playback area according to the first pose information and second pose information of the video playback area in a three-dimensional scene model corresponding to a target scene;
in response to the relative pose relationship satisfying a preset condition, loading a video resource corresponding to the video playback area.
In this way, the video resource can be preloaded when the relative pose relationship between the AR device and the video playback area satisfies the preset condition. With this approach, the video playback area used to play the video resource does not need to be carried on a physical playback device in the target scene, nor does it need to actually occupy location space in the target scene, so that real-world space and device resources can be saved. Moreover, by preloading the video resource corresponding to the video playback area, the preloaded video resource can be played directly when playback is needed; since the video resource is preloaded locally, the influence of the network environment on video loading can be reduced, stuttering during playback can be reduced, and the smoothness of video playback can be improved.
In some embodiments, the relative pose relationship includes a relative distance;
the loading the video resource corresponding to the video playback area in response to the relative pose relationship satisfying a preset condition includes:
loading the video resource corresponding to the video playback area in response to the relative distance between the AR device and the video playback area satisfying a preset condition.
In some embodiments, the loading the video resource corresponding to the video playback area in response to the relative distance between the AR device and the video playback area satisfying a preset condition includes:
loading the video resource corresponding to the video playback area when the relative distance between the AR device and the video playback area is less than a set distance. In this way, loading of the video resource corresponding to the video playback area starts when the relative distance between the AR device and the video playback area is less than the set distance. On the one hand, preloading the video resource before the playback conditions are met improves the smoothness of playback of the video resource after the playback conditions are met; on the other hand, the distance limitation reduces, to a certain extent, wasted loading of video resources (for example, cases where the playback conditions are never met after loading), and reduces unclear playback caused by the AR device being too far from the video playback area after loading is completed and the playback conditions are met.
In some embodiments, after the loading the video resource corresponding to the video playback area, the method further includes:
determining a video element corresponding to the video resource; and playing the loaded video resource in the video playback area when the displayed AR scene picture contains the video element. In this way, the AR device determines whether the AR scene picture includes the video element corresponding to the video resource, and plays the video resource only when it does, so that the video resource appears in the picture of the AR device during playback, which reduces invalid playback of the video and improves resource utilization.
In some embodiments, the playing the loaded video resource in the video playback area when the displayed AR scene picture contains the video element corresponding to the video resource includes:
determining the area occupied by the video element contained in the AR scene picture; and playing the loaded video resource in the video playback area when the proportion of the area occupied by the video element in the total area of the video playback area is greater than or equal to a set ratio. In this way, the video resource is played only when the area occupied by the video element contained in the AR scene picture is large enough, which reduces resource waste, improves the playback effect of the video resource, and improves the user's viewing experience.
In some embodiments, the playing the loaded video resource in the video playback area when the displayed AR scene picture contains the video element corresponding to the video resource includes:
obtaining, in the relative pose relationship, the included angle between the shooting direction of the AR device and the direction toward the video playback area when the displayed AR scene picture contains the video element; and playing the loaded video resource in the video playback area when the included angle is within a set angle range. In this way, the video resource is played only when the AR device is at the best viewing angle and the area occupied by the video element in the AR scene picture is large enough, which further improves the playback effect and resource utilization of the video resource and improves the user's viewing experience.
In some embodiments, after the playing the loaded video resource in the video playback area, the method further includes:
stopping playing the loaded video resource in the video playback area when the proportion of the area occupied by the video element contained in the AR scene picture in the total area of the video playback area is less than the set ratio. In this way, when the area occupied by the video element contained in the AR scene picture is small, the user may be unable to watch the video resource displayed in the AR scene picture; pausing the playback reduces the resource waste caused by playing a video resource that no one is watching.
In some embodiments, the determining the first pose information of the AR device based on the target scene image includes:
determining the first pose information of the AR device based on the target scene image and a pre-constructed three-dimensional scene model corresponding to the target scene. In this way, based on the three-dimensional scene model of the target scene where the AR device is located, scene images of the AR device under various pose information of the target scene can be obtained, and the first pose information of the AR device is determined by matching the target scene image acquired by the AR device in real time with the three-dimensional scene model.
In some embodiments, the video playback area includes at least one of the following: a video playback area located on at least one target display object in the target scene, and a video playback area corresponding to a virtual playback device located in the target scene. In this way, video resources can be preloaded for various types of video playback areas.
In some embodiments, the loading the video resource corresponding to the video playback area includes:
loading, according to area identification information corresponding to the video playback area, a video resource bound to the area identification information. In this way, when it is determined that the relative pose information between any video playback area and the AR device satisfies the preset condition, the video resource identifier corresponding to the area identification information can be looked up based on the area identification information corresponding to that video playback area, and the video resource corresponding to the found video resource identifier is then loaded.
In some embodiments, the loading, according to the area identification information corresponding to the video playback area, the video resource bound to the area identification information includes:
determining, according to the area identification information corresponding to the video playback area, multiple video resources bound to the area identification information;
selecting, from the multiple video resources, a video resource corresponding to the current time to load, according to the playback time periods corresponding to the multiple video resources. In this way, in the same video playback area, different video resources can be set to be played in different time periods, thereby enriching the displayed video resources.
An embodiment of the present disclosure further provides a video resource processing apparatus, including:
a first determining module configured to determine first pose information of an AR device based on a target scene image captured by the AR device in real time;
a second determining module configured to determine a relative pose relationship between the AR device and a preset video playback area according to the first pose information and second pose information of the video playback area in a three-dimensional scene model corresponding to a target scene;
a loading module configured to load a video resource corresponding to the video playback area in response to the relative pose relationship satisfying a preset condition.
In some embodiments, the relative pose relationship includes a relative distance;
the loading module, when loading the video resource corresponding to the video playback area in response to the relative pose relationship satisfying a preset condition, is configured to:
load the video resource corresponding to the video playback area in response to the relative distance between the AR device and the video playback area satisfying a preset condition.
In some embodiments, the loading module, when loading the video resource corresponding to the video playback area in response to the relative distance between the AR device and the video playback area satisfying a preset condition, is configured to:
load the video resource corresponding to the video playback area when the relative distance between the AR device and the video playback area is less than a set distance.
In some embodiments, the apparatus further includes a playback module configured to:
after the video resource corresponding to the video playback area is loaded, determine a video element corresponding to the video resource; and play the loaded video resource in the video playback area when the displayed AR scene picture contains the video element.
In some embodiments, the playback module, when playing the loaded video resource in the video playback area in the case that the displayed AR scene picture contains the video element, is configured to:
determine the area occupied by the video element contained in the AR scene picture; and play the loaded video resource in the video playback area when the proportion of the area in the total area of the video playback area is greater than or equal to a set ratio.
In some embodiments, the playback module, when playing the loaded video resource in the video playback area in the case that the displayed AR scene picture contains the video element, is configured to:
obtain, in the relative pose relationship, the included angle between the shooting direction of the AR device and the direction toward the video playback area when the displayed AR scene picture contains the video element; and play the loaded video resource in the video playback area when the included angle is within a set angle range.
In some embodiments, the video resource processing apparatus further includes a playback control module, and after the loaded video resource is played in the video playback area, the playback control module is configured to:
stop playing the loaded video resource in the video playback area when the proportion of the area occupied by the video element contained in the AR scene picture in the total area of the video playback area is less than the set ratio.
In some embodiments, the first determining module, when determining the first pose information of the AR device based on the target scene image captured by the AR device in real time, is configured to:
determine the first pose information of the AR device based on the target scene image and a pre-constructed three-dimensional scene model corresponding to the target scene.
In some embodiments, the video playback area includes at least one of the following: a video playback area located on at least one target display object in the target scene, and a video playback area corresponding to a virtual playback device located in the target scene.
In some embodiments, the loading module, when loading the video resource corresponding to the video playback area, is configured to:
load, according to area identification information corresponding to the video playback area, a video resource bound to the area identification information.
In some embodiments, the loading module, when loading, according to the area identification information corresponding to the video playback area, the video resource bound to the area identification information, is configured to:
determine, according to the area identification information corresponding to the video playback area, multiple video resources bound to the area identification information;
select, from the multiple video resources, a video resource corresponding to the current time to load, according to the playback time periods corresponding to the multiple video resources.
本公开实施例还提供一种计算机设备,包括:处理器、存储器和总线,所述存储器存储有所述处理器可执行的机器可读指令,在计算机设备运行的情况下,所述处理器与所述存储器之间通过总线通信,所述机器可读指令被所述处理器执行时执行上述任一实施例所述的视频资源处理方法。
本公开实施例还提供一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行上述任一实施例所述的视频资源处理方法。
本公开实施例还提供一种计算机程序,所述计算机程序包括计算机可读代码,在所述计算机可读代码在电子设备中运行的情况下,所述电子设备的处理器执行如上述任一实施例所述的视频资源处理方法。
关于上述视频资源处理装置、计算机设备、计算机可读存储介质和计算机程序的效果描述参见上述视频资源处理方法的说明。
本公开实施例至少提供一种视频资源处理方法、装置、计算机设备、存储介质及程序,能够在AR设备与视频播放区域之间的相对位姿关系满足预设条件的情况下,预加载视频资源,即用于播放视频资源的视频播放区域不仅无需承载在目标场景中的实体播放设备上,也无需实际占用目标场景中的位置空间,从而可以节省现实的位置空间资源和设备资源,另外通过预先加载视频播放区域对应的视频资源,可以在需要进行视频播放时,直接播放预先加载的视频资源,由于视频资源预先加载到本地,可以降低网络环境对于视频加载的影响,减少视频播放过程中出现卡顿的情况,提升视频播放的流畅性。
为使本公开的上述目的、特征和优点能更明显易懂,下文特举较佳实施例,并配合所附附图,作详细说明如下。
附图说明
为了更清楚地说明本公开实施例的技术方案,下面将对实施例中所需要使用的附图作简单地介绍,此处的附图被并入说明书中并构成本说明书中的一部分,这些附图示出了符合本公开的实施例,并与说明书一起用于说明本公开的技术方案。应当理解,以下附图仅示出了本公开的某些实施例,因此不应被看作是对范围的限定,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他相关的附图。
图1示出了本公开实施例所提供的一种视频资源处理方法的流程图;
图2示出了本公开实施例所提供的一种视频资源处理方法的系统架构示意图;
图3示出了本公开实施例所提供的相对位姿关系中相对朝向夹角的示意图;
图4示出了本公开实施例所提供的一种视频资源处理装置的架构示意图;
图5示出了本公开实施例所提供的一种计算机设备500的结构示意图。
具体实施方式
为使本公开实施例的目的、技术方案和优点更加清楚,下面将结合本公开实施例中附图,对本公开实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本公开一部分实施例,而不是全部的实施例。通常在此处附图中描述和示出的本公开实施例的组件可以以各种不同的配置来布置和设计。因此,以下对在附图中提供的本公开的实施例的详细描述并非旨在限制要求保护的本公开的范围,而是仅仅表示本公开的选定实施例。基于本公开的实施例,本领域技术人员在没有做出创造性劳动的前提下所获得的所有其他实施例,都属于本公开保护的范围。
本公开实施例提供了一种视频资源处理方法、装置、计算机设备、存储介质及程序,可以在AR设备与视频播放区域之间的相对位姿关系满足预设条件的情况下,预加载视频资源。用于播放视频资源的视频播放区域可以无需承载在目标场景中的实体播放设备上,也无需实际占用目标场景中的位置空间,从而可以节省现实的位置空间资源和设备资源。另一方面,采用本公开实施例的方案,在进行视频播放时,播放的是预先加载的视频资源,由于视频资源预先加载到本地,可以减小网络环境对于视频加载的影响,减少视频播放过程中出现卡顿的情况,提升视频播放的流畅性。相关技术中,在播放虚拟视频时,采用一边从服务器加载视频资源一边进行播放的方式,容易受到网络环境的影响,例如若AR设备的当前网络状态不佳时,可能无法及时从服务器加载待播放的视频资源,从而导致播放卡顿的情况,进而影响视频播放效果。本公开实施例的方案是基于位姿关系对视频资源进行加载,在满足条件的情况下,可以一次性加载需要播放的视频资源,因此可提升视频播放的流畅性。
针对以上方案所存在的缺陷,均是发明人在经过实践并仔细研究后得出的结果,因此,上述问题的发现过程以及下文中本公开针对上述问题所提出的解决方案,都应该是发明人在本公开过程中对本公开做出的贡献。
应注意到:相似的标号和字母在下面的附图中表示类似项,因此,一旦某一项在一个附图中被定义,则在随后的附图中不需要对其进行进一步定义和解释。
为便于对本实施例进行理解,首先对本公开实施例所公开的一种视频资源处理方法进行详细介绍,本公开实施例所提供的视频资源处理方法的执行主体一般为具有一定计算能力的计算机设备,可以为终端设备或其他处理设备,AR设备比如可以包括AR眼镜、平板电脑、智能手机、智能穿戴式设备等具有显示功能和数据处理功能的设备,AR设备可以通过应用程序连接云端服务器。
参见图1所示,为本公开实施例提供的视频资源处理方法的流程图,所述方法包括步骤S101至S103,其中:
S101:基于AR设备实时拍摄的目标场景图像,确定所述AR设备的第一位姿信息;
S102:根据所述第一位姿信息和预设的视频播放区域在目标场景对应的三维场景模型中的第二位姿信息,确定所述AR设备与所述视频播放区域之间的相对位姿关系;
S103:响应于所述相对位姿关系满足预设条件,加载所述视频播放区域对应的视频资源。
在本公开实施例中,能够在AR设备与视频播放区域之间的相对位姿关系满足预设条件的情况下,预加载视频资源,即用于播放视频资源的视频播放区域不仅无需承载在目标场景中的实体播放设备上,也无需实际占用目标场景中的位置空间,从而可以节省现实的位置空间资源和设备资源,另外通过预先加载视频播放区域对应的视频资源,可以在需要进行视频播放时,直接播放预先加载的视频资源,由于视频资源预先加载到本地,可以降低网络环境对于视频加载的影响,减少视频播放过程中出现卡顿的情况,提升视频播放的流畅性。
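为便于理解,下面给出对应步骤S101至S103整体流程的一个Python示意草图,仅为在若干假设下的参考写法,并非本公开方案的限定实现;其中estimate_pose、load_video为假设的外部函数,PRELOAD_DISTANCE为假设的设定距离,且该草图仅以相对距离作为预设条件的示例。

```python
import numpy as np

PRELOAD_DISTANCE = 2.0  # 假设的设定距离(米), 仅作示例

def process_frame(frame, estimate_pose, load_video, play_area, state):
    """对AR设备实时拍摄的一帧目标场景图像, 执行S101~S103的示意流程.

    estimate_pose / load_video 为调用方注入的假设函数; play_area 包含视频播放区域
    在三维场景模型中的第二位姿信息(position)及其区域标识(region_id)。
    """
    # S101: 基于目标场景图像确定AR设备的第一位姿信息
    device_pose = estimate_pose(frame)  # 假设返回 {"position": ..., "orientation": ...}

    # S102: 结合第二位姿信息计算相对位姿关系(此处仅以相对距离为例)
    distance = float(np.linalg.norm(
        np.asarray(device_pose["position"], dtype=float)
        - np.asarray(play_area["position"], dtype=float)))

    # S103: 相对位姿关系满足预设条件时, 预加载视频播放区域对应的视频资源
    if distance < PRELOAD_DISTANCE and not state.get("loaded"):
        state["video"] = load_video(play_area["region_id"])
        state["loaded"] = True
    return state
```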
图2为可以应用本公开实施例的一种视频资源处理方法的系统架构示意图;如图2所示,该系统架构中包括:AR设备201、网络202、控制终端203。为支撑一个示例性应用,AR设备201和控制终端203通过网络202建立通信连接,AR设备201通过网络202向控制终端203上报目标场景图像,控制终端203响应于目标场景图像,确定AR设备的第一位姿信息;其次,根据第一位姿信息和预设的视频播放区域在目标场景对应的三维场景模型中的第二位姿信息,确定AR设备与所述视频播放区域之间的相对位姿关系;再次,响应于相对位姿关系满足预设条件,加载所述视频播放区域对应的视频资源。最后,控制终端203将加载的视频资源上传至网络202,并通过网络202发送给AR设备201。从而可以在AR设备与视频播放区域之间的相对位姿关系满足预设条件的情况下,预加载视频资源,这样,用于播放视频资源的视频播放区域可以无需承载在目标场景中的实体播放设备上,也无需实际占用目标场景中的位置空间,从而可以节省现实的位置空间资源和设备资源;而且,通过预先加载视频播放区域对应的视频资源,可以在需要进行视频播放时,直接播放预先加载的视频资源,由于视频资源预先加载到本地,可以降低网络环境对于视频加载的影响,减少视频播放过程中出现卡顿的情况,提升视频播放的流畅性。
作为示例,控制终端203可以包括具有视觉信息处理能力的视觉处理设备或远程服务器。网络202可以采用有线或无线连接方式。其中,在控制终端203为视觉处理设备的情况下,AR设备201可以通过有线连接的方式与视觉处理设备通信连接,例如通过总线进行数据通信;在控制终端203为远程服务器的情况下,AR设备201可以通过无线网络与远程服务器进行数据交互。
或者,在一些场景中,AR设备201可以是带有视频采集模组的视觉处理设备,可以是带有摄像头的主机。这时,本公开实施例的视频资源处理方法可以由AR设备201执行,上述系统架构可以不包含网络202和控制终端203。
以下是针对上述步骤S101至S103的详细说明:
针对S101和S102:
所述目标场景图像可以为所述AR设备实时获取的现实场景的图像。其中,AR设备在拍摄目标场景图像时,可以是用户在触发AR设备的拍摄按钮之后,或者AR设备被启动之后,由AR设备拍摄得到。
在一种可能的实施方式中,所述基于AR设备实时拍摄的目标场景图像,确定所述AR设备的第一位姿信息,可以是AR设备基于所述目标场景图像确定所述AR设备的第一位姿信息,也可以是AR设备将目标场景图像发送到服务器,由服务器基于目标场景图像,确定AR设备的第一位姿信息,然后AR设备从服务器获取确定的第一位姿信息。
在一些实施例中,AR设备在确定第一位姿信息时,可以基于AR设备实时拍摄的目标场景图像,确定所述AR设备在基于与目标场景图像对应的场景建立的场景坐标系中的第一位姿信息。
这里,所述场景坐标系可以是三维坐标系,所述场景坐标系的坐标原点可以是所述场景坐标系对应的目标场景中的任意一点。在获取所述相对位姿信息时,可以基于所述坐标原点分别确定所述AR设备和所述视频播放区域在所述场景坐标系中的第一位姿信息和第二位姿信息,再基于所述第一位姿信息和第二位姿信息确定所述相对位姿信息。
为了简化计算,可以选择所述第一位姿信息和第二位姿信息对应的位置点中的任意一点作为所述坐标原点,这样,相比于在其他位置选取任意一点作为坐标原点,在计算所述相对位姿信息时更加简单。
示例性的,可以以视频播放区域在三维场景模型中的位置点作为现实场景图像对应的场景坐标系的坐标原点,这样基于坐标原点确定的AR设备的第一位姿信息即为AR设备和视频播放区域之间的相对位姿信息。
这里,所述位姿信息可以包括位置信息和姿态信息,也即所述场景坐标系中的三维坐标和朝向。
在一些实施例中,在基于AR设备实时拍摄的目标场景图像,确定所述AR设备在基于与目标场景图像对应的场景建立的场景坐标系中的第一位姿信息时,可以包括以下方法中的任意一种:
方法一、
可以先检测目标场景图像对应的目标场景中多个目标检测点的位置信息,以及确定每一个目标检测点在目标场景图像中对应的目标像素点,然后确定目标场景图像中各个目标像素点分别对应的深度信息(例如可以通过对目标场景图像进行深度检测获得),最后基于目标像素点的深度信息,确定AR设备的第一位姿信息。
其中,所述目标检测点可以是预先设置好的AR设备所在场景中的位置点,例如可以是杯子、风扇、饮水机等。目标像素点的深度信息可以用来表示该目标像素点对应的目标检测点与AR设备的图像采集装置之间的距离。所述目标检测点在场景坐标系中的位置坐标是预先设置好,且固定不变的。
在一些实施例中,在确定AR设备的第一位姿信息时,可以通过目标检测点与目标检测点对应的目标像素点在场景图像中的坐标信息确定第一位姿信息中的朝向;以及基于目标检测点对应的目标像素点的深度值确定AR设备的位置信息,由此可以确定AR设备的第一位姿信息。
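作为方法一的一种可能实现思路,下面给出一个基于OpenCV solvePnP的Python草图:假设已知若干目标检测点在场景坐标系中的三维坐标及其对应目标像素点的二维坐标,即可求解AR设备(相机)的第一位姿信息;camera_matrix为假设的相机内参,实际实现也可以按正文所述结合像素点的深度信息求解位置。

```python
import cv2
import numpy as np

def estimate_first_pose(object_points, image_points, camera_matrix):
    """object_points: Nx3, 目标检测点在场景坐标系中的坐标; image_points: Nx2, 对应目标像素点坐标."""
    obj = np.asarray(object_points, dtype=np.float32)
    img = np.asarray(image_points, dtype=np.float32)
    dist_coeffs = np.zeros(5)  # 假设不考虑镜头畸变
    ok, rvec, tvec = cv2.solvePnP(obj, img, camera_matrix, dist_coeffs)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)                 # 旋转向量转旋转矩阵
    position = (-R.T @ tvec).ravel()           # AR设备(相机)在场景坐标系中的位置
    forward = R.T @ np.array([0.0, 0.0, 1.0])  # 拍摄方向(假设相机光轴为+Z)
    return {"position": position, "orientation": forward}
```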
方法二、
可以基于AR设备所在的目标场景的三维场景模型确定。
在一些实施例中,可以将AR设备实时获取的目标场景图像与预先构建的AR设备所在目标场景的三维场景模型进行匹配,然后基于匹配结果,确定AR设备的第一位姿信息。
基于AR设备所在的目标场景的三维场景模型,可以获取AR设备在目标场景的各个位姿信息下的场景图像,通过将AR设备实时获取的目标场景图像与三维场景模型进行匹配,同样可以获取AR设备的第一位姿信息。
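下面给出方法二的一个极简示意,仅为假设的简化做法:将三维场景模型预先渲染或采集为若干带位姿标注的参考图像,在线时用ORB特征将实时拍摄的目标场景图像与参考图像进行匹配,取匹配程度最高的参考位姿作为第一位姿信息的粗略估计;实际系统通常还会在此基础上做进一步的位姿优化。

```python
import cv2

def match_pose_by_scene_model(query_img, reference_frames):
    """reference_frames: [(参考图像, 对应位姿), ...], 假设由三维场景模型预先生成并标注位姿."""
    orb = cv2.ORB_create(nfeatures=1000)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    _, des_q = orb.detectAndCompute(query_img, None)
    best_pose, best_score = None, 0
    for ref_img, ref_pose in reference_frames:
        _, des_r = orb.detectAndCompute(ref_img, None)
        if des_q is None or des_r is None:
            continue
        matches = matcher.match(des_q, des_r)
        good = [m for m in matches if m.distance < 50]  # 简单的描述子距离阈值筛选
        if len(good) > best_score:
            best_score, best_pose = len(good), ref_pose
    return best_pose  # 作为AR设备第一位姿信息的粗略估计
```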
在一种可能的实施方式中,所述视频播放区域包括以下至少之一:位于所述目标场景中的至少一个目标展示对象上的视频播放区域、位于所述目标场景中的虚拟播放设备对应的视频播放区域。
其中,所述目标展示对象可以是目标场景中的真实物体,比如广告牌、建筑物等,所述虚拟播放设备对应的视频播放区域可以是虚拟电视机/虚拟显示屏等带有显示功能的虚拟载体上设置的视频播放区域。
在目标场景的三维场景模型构建时,便已确定视频播放区域在三维场景模型中的位姿信息,因此视频播放区域在目标场景对应的三维场景模型中的第二位姿信息可以视为预先定义且在AR设备展示AR场景画面的过程中固定不变的。
在一种可能的实施方式中,所述根据所述第一位姿信息和预设的视频播放区域在目标场景对应的三维场景模型中的第二位姿信息,确定所述AR设备与所述视频播放区域之间的相对位姿关系,可以是确定AR设备相对所述视频播放区域的相对距离和相对朝向夹角。
所述AR设备相对所述视频播放区域的相对距离,可以根据第一位姿信息中的位置信息和第二位姿信息中的位置信息确定;AR设备相对所述视频播放区域的相对朝向夹角可以是AR设备的拍摄方向与朝向视频播放区域的方向之间的夹角,示例性的,图3示出了本公开实施例所提供的相对位姿关系中相对朝向夹角的示意图,所述相对朝向夹角可以如图3所示,所述相对朝向夹角为由所述AR设备的朝向在水平方向上的延长线和所述朝向视频播放区域方向的延长线组成的夹角。
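上述相对距离与相对朝向夹角的计算可以用如下numpy草图表示,假设位姿信息包含场景坐标系下的三维位置与拍摄方向向量、y轴为竖直方向,夹角按正文所述取水平面内的夹角。

```python
import numpy as np

def relative_pose(device_pose, area_pose):
    """device_pose/area_pose: {"position": 三维坐标, "orientation": 朝向向量}, 均为假设的数据结构."""
    p_dev = np.asarray(device_pose["position"], dtype=float)
    p_area = np.asarray(area_pose["position"], dtype=float)

    # 相对距离: 两个位置点之间的欧氏距离
    distance = float(np.linalg.norm(p_dev - p_area))

    # 相对朝向夹角: AR设备拍摄方向 与 朝向视频播放区域的方向 在水平面内的夹角(假设y轴竖直)
    shoot = np.asarray(device_pose["orientation"], dtype=float)
    to_area = p_area - p_dev
    shoot[1] = 0.0
    to_area[1] = 0.0
    cos_a = np.dot(shoot, to_area) / (np.linalg.norm(shoot) * np.linalg.norm(to_area) + 1e-9)
    angle = float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))
    return distance, angle
```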
针对S103:
在一种可能的实施方式中,所述响应于相对位姿关系满足预设条件,加载所述视频播放区域对应的视频资源,可以是在检测到所述AR设备与所述视频播放区域之间的相对距离满足预设条件的情况下,加载所述视频播放区域对应的视频资源。
其中,所述AR设备与所述视频播放区域之间的相对距离满足预设条件可以是所述AR设备与所述视频播放区域之间的相对距离小于设定距离。所述设定距离可以依据所述AR设备的识别精度和所述图像采集装置的采集范围进行设置。
示例性的,可以设置设定距离为2米,在所述AR设备与所述视频播放区域之间的相对距离小于2米的情况下,加载所述视频播放区域对应的视频资源。
在AR设备与所述视频播放区域之间的相对距离小于设定距离的情况下,开始加载视频播放区域对应的视频资源,一方面在满足播放条件前进行了视频资源的预加载,可以提高在满足播放条件后对视频资源播放的流畅性,另一方面,对距离的限制可以减少视频资源的加载浪费情况(比如加载后一直满足不了播放条件),以及减少在完成视频加载并且满足视频播放条件后,因AR设备与视频播放区域之间距离过远造成播放不清晰的问题。
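对应上述相对距离小于设定距离即开始预加载的逻辑,可以写成如下的简单草图;其中SET_DISTANCE沿用上文2米的示例取值,fetch_video为假设的下载函数,这里用后台线程示意将视频资源一次性预加载到本地。

```python
import threading

SET_DISTANCE = 2.0  # 设定距离, 沿用正文示例取2米

class Preloader:
    def __init__(self, fetch_video):
        self.fetch_video = fetch_video   # 假设的函数: 区域标识 -> 本地视频资源
        self.cache = {}
        self._loading = set()

    def on_pose_update(self, region_id, distance):
        """相对距离小于设定距离且尚未加载时, 触发后台预加载."""
        if (distance < SET_DISTANCE
                and region_id not in self.cache
                and region_id not in self._loading):
            self._loading.add(region_id)
            threading.Thread(target=self._load, args=(region_id,), daemon=True).start()

    def _load(self, region_id):
        self.cache[region_id] = self.fetch_video(region_id)
        self._loading.discard(region_id)
```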
这里,所述加载视频播放区域对应的视频资源可以是从服务器获取与该视频播放区域对应的视频资源,视频播放区域对应的视频资源可以在播放之前预先加载在AR设备上,这样可以直接通过AR设备判断是否播放视频播放区域对应的视频资源,且其视频资源的播放过程也是由AR设备直接进行控制的,与视频播放过程中再从服务器获取视频资源相比,这种方式可以提高视频播放过程中的连贯性,提升用户观看体验。
其中,在加载所述视频播放区域对应的视频资源时,可以根据所述视频播放区域对应的区域标识信息,加载与所述区域标识信息绑定的视频资源。
在一些实施例中,所述区域标识信息可以提前进行设置,所述区域标识信息用于区分不同的视频播放区域,不同的视频播放区域可以对应不同的视频资源。在一些实施例中,所述AR设备中可以存储有区域标识信息和视频资源标识之间的映射关系,在确定任一视频播放区域与AR设备之间的相对位姿信息满足预设条件的情况下,可以基于该视频播放区域对应的区域标识信息,从所述映射关系中查找与该区域标识信息对应的视频资源标识,然后加载查找到的视频资源标识对应的视频资源。这里,一个所述视频播放区域对应至少一个视频资源。
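区域标识信息与视频资源之间的绑定关系,可以用如下假设的数据结构示意:一个视频播放区域可以绑定至少一个视频资源,加载时先由区域标识信息查找视频资源标识,再据此获取对应的视频资源。

```python
# 假设的映射关系: 区域标识信息 -> 绑定的视频资源标识列表(一个视频播放区域对应至少一个视频资源)
REGION_TO_VIDEOS = {
    "area_001": ["video_a"],
    "area_002": ["video_b", "video_c"],
}

def load_bound_videos(region_id, fetch_video):
    """根据区域标识信息加载与其绑定的视频资源; fetch_video 为假设的获取函数(资源标识 -> 本地资源)."""
    video_ids = REGION_TO_VIDEOS.get(region_id, [])
    return {vid: fetch_video(vid) for vid in video_ids}
```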
在一个所述视频播放区域对应的视频资源有多个的情况下,在加载所述视频播放区域对应的视频资源时,由于所述AR设备的存储能力有限,可以按照预设的加载条件加载视频资源。
这里,所述加载条件可以是用户指令、相对位姿关系、当前时间等条件中的任意一个。
在一种可能的实施方式中,在所述加载条件包括用户指令的情况下,在加载视频播放区域对应的视频资源时,可以先在AR设备上展示包括该视频播放区域对应的视频资源标识的播放列表,然后基于用户对于该播放列表做出的选择指令,加载与该选择指令对应的视频资源。
其中,在AR设备上展示播放列表时,可以将播放列表叠加在目标场景图像的预设位置处进行展示,用户可以通过触发AR设备,来生成对于任一视频资源标识的选择指令。
所述用户在触发AR设备生成对于任一视频资源标识的选择指令时,可以是用户触发AR设备的屏幕来生成对于任一视频资源标识的选择指令;也可以是用户做出目标手势,基于该目标手势所指向的视频资源标识可以生成对于该视频资源标识的选择指令。
在一种可能的实施方式中,在所述加载条件包括相对位姿关系的情况下,在加载视频播放区域对应的视频资源时,可以从与视频播放区域的区域标识对应的多个视频资源中,加载与相对位姿关系中的相对距离相匹配的视频资源。
在一些实施例中,可以将设定距离划分成不同的距离范围,不同的距离范围对应不同的视频资源,然后确定相对位姿关系中的相对距离所属的目标距离范围,并加载目标距离范围对应的视频资源。
示例性的,若设定距离为5米,则可以划分出0至2米,2米至5米的距离范围,与0至2米的距离范围对应的视频资源为视频资源A,与2米至5米的距离范围对应的视频资源为视频资源B,若相对位姿关系中的相对距离为2米,则可以加载视频资源A。
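按相对距离选择待加载视频资源的做法可以示意如下,距离区间与资源的对应关系沿用上文0至2米、2米至5米的示例,均为假设配置。

```python
# 假设配置: (距离下限, 距离上限, 视频资源标识); 边界值归入靠前的区间(与正文示例中2米对应视频资源A一致)
DISTANCE_BINS = [
    (0.0, 2.0, "video_A"),
    (2.0, 5.0, "video_B"),
]

def select_video_by_distance(distance):
    """返回与相对距离相匹配的视频资源标识; 超出设定距离(5米)时返回None, 表示不加载."""
    for low, high, video_id in DISTANCE_BINS:
        if low <= distance <= high:
            return video_id
    return None
```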
在一种可能的实施方式中,在所述加载条件包括当前时间的情况下,在加载与区域标识信息对应的视频资源时,可以先确定与所述区域标识信息对应的多个视频资源中每一个视频资源对应的播放时间段,然后根据所述多个视频资源对应的播放时间段,在所述多个视频资源中,选择加载与当前时间对应的视频资源。
所述与当前时间对应的视频资源可以是当前时间所属的播放时间段对应的视频资源,示例性的,若与区域标识信息绑定的视频资源有视频资源A、视频资源B、视频资源C,其对应的播放时间段分别为10:00至12:00,14:00至16:00,17:00至19:00,若当前时间为11:00,则加载视频资源A。
若当前时间不属于任一视频资源对应的播放时间段,则确定距离当前时间最近的播放时间段,将该播放时间段对应的视频资源确定为与当前时间对应的视频资源。
在另一种可能的实施方式中,若当前时间不属于任一视频资源对应的播放时间段,则可以不加载视频资源,直接展示AR设备实时拍摄的目标场景图像。
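按当前时间选择视频资源的一个Python草图如下,播放时间段沿用上文示例;当当前时间不属于任何播放时间段时,通过假设的fallback_nearest参数示意选择最近时间段的资源与不加载这两种可选做法。

```python
from datetime import datetime, time, date

# 假设配置: 视频资源标识 -> 播放时间段(开始, 结束), 沿用正文示例
SCHEDULE = {
    "video_A": (time(10, 0), time(12, 0)),
    "video_B": (time(14, 0), time(16, 0)),
    "video_C": (time(17, 0), time(19, 0)),
}

def select_video_by_time(now=None, fallback_nearest=True):
    """now 可传入datetime用于测试, 默认取当前时间."""
    now = (now or datetime.now()).time()
    for video_id, (start, end) in SCHEDULE.items():
        if start <= now <= end:          # 当前时间落在某个播放时间段内
            return video_id
    if not fallback_nearest:
        return None                      # 不加载视频资源, 直接展示实时画面
    today = date.today()
    def gap(period):                     # 当前时间到某个时间段起止点的最小间隔
        start, end = period
        return min(abs(datetime.combine(today, start) - datetime.combine(today, now)),
                   abs(datetime.combine(today, end) - datetime.combine(today, now)))
    return min(SCHEDULE.items(), key=lambda kv: gap(kv[1]))[0]
```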
在一种可能的实施方式中,在加载视频播放区域对应的视频资源之后,还可以在展示的AR场景画面中包含视频资源对应的视频元素的情况下,在视频播放区域播放加载的视频资源。
在一些实施例中,可以获取与AR设备的第一位姿信息相对应的AR场景画面,并在AR设备上展示获取的AR场景画面;所述展示的AR场景画面中包含所述视频资源对应的视频元素,可以是所述AR场景画面中所包含的视频元素满足以下条件中的任意一种:
条件1,可以确定在所述AR场景画面中包含的所述视频元素所占的面积;在所述面积在所述视频播放区域的总面积中的占比大于或等于设定比例的情况下,在所述视频播放区域播放加载的所述视频资源。
示例性的,可以设置设定比例为50%,也即在所述AR场景画面中包含的所述视频元素所占的面积在所述视频播放区域的总面积中的占比大于或等于50%的情况下,在所述视频播放区域播放加载的所述视频资源。
基于这种方式,在AR场景画面中所包含的视频元素所占的面积足够大的情况下,才会播放视频资源,减少了资源浪费,也提升了视频资源的播放效果,提升了用户的观看体验。
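面积占比的判断可以示意如下:假设已得到视频元素在AR场景画面中可见部分的多边形顶点(例如将视频播放区域角点投影到画面并与画面边界裁剪后的结果),用鞋带公式计算其面积并与视频播放区域的总面积比较;SET_RATIO取50%沿用上文示例。

```python
import numpy as np

SET_RATIO = 0.5  # 设定比例, 沿用正文示例取50%

def polygon_area(points):
    """鞋带公式计算多边形面积; points 为按顺序给出的 Nx2 顶点坐标."""
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

def should_play(visible_polygon, total_area):
    """visible_polygon: 视频元素在AR场景画面中可见部分的顶点; total_area: 视频播放区域总面积(假设已知)."""
    return polygon_area(visible_polygon) / max(total_area, 1e-9) >= SET_RATIO
```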
条件2,可以确定所述视频资源对应的视频元素;在所述AR场景画面中检测出所述视频元素的像素的情况下,直接在所述视频播放区域播放加载的所述视频资源。
所述在所述AR场景画面中检测出所述视频元素的像素,即所述视频资源对应的视频元素在AR场景画面中被渲染出来,在这种情况下,可以直接播放加载的视频资源。
条件3,可以在所述AR场景画面中检测出所述视频元素所占的像素个数超过预设数值的情况下,在所述视频播放区域播放加载的所述视频资源。
示例性的,可以设置预设数值为200,也即在所述AR场景画面中检测出的所述视频元素所占的像素个数大于或等于200个像素单位的情况下,在所述视频播放区域播放加载的所述视频资源。
通过AR设备判断AR场景画面中是否包括视频资源对应的视频元素,并在AR场景画面中包括视频资源对应的视频元素的情况下,才会播放视频资源,使得视频资源在播放时可以出现在AR设备的画面中,减少了视频的无效播放,提升了资源利用率。
在另一种可能的实施方式中,为确保视频资源在播放时AR设备位于最佳的观看位置,所述在展示的AR场景画面中包含所述视频资源对应的视频元素的情况下,在所述视频播放区域播放加载的所述视频资源,还可以是在展示的AR场景画面中包含所述视频资源对应的视频元素的情况下,在所述相对位姿关系中,确定所述AR设备的拍摄方向与朝向所述视频播放区域的方向之间的夹角,在所述夹角在设定角度范围内的情况下,在所述视频播放区域播放加载的所述视频资源。
基于这种方式,在AR设备位于最佳观看角度、AR场景画面中所包含的视频元素所占的面积足够大的情况下,才会播放视频资源,进一步提升了视频资源的播放效果和资源利用率,提升了用户的观看体验。
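将面积占比与拍摄方向夹角结合起来的播放判断,可以写成如下草图;其中夹角可由前文relative_pose草图得到,SET_ANGLE表示设定角度范围的上限,取值仅为假设示例。

```python
SET_RATIO = 0.5    # 设定比例(示例)
SET_ANGLE = 30.0   # 设定角度范围上限, 单位为度(假设示例)

def should_play_with_angle(area_ratio, view_angle_deg):
    """同时满足: 视频元素面积占比达到设定比例, 且拍摄方向与朝向播放区域方向的夹角在设定角度范围内."""
    return area_ratio >= SET_RATIO and view_angle_deg <= SET_ANGLE
```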
在一种可能的情景中,在展示的AR场景画面中包含所述视频资源对应的视频元素,但是所述相对位姿关系中,所述AR设备的拍摄方向与朝向所述视频播放区域的方向之间的夹角不在设定角度范围内,这种情况下可以在AR场景画面中渲染视频资源对应的视频元素(例如可以展示视频资源的封面),而并不播放视频资源。
在所述视频播放区域播放加载的所述视频资源之后,由于AR设备可能随时移动,因此可以实时检测AR场景画面中包含的所述视频元素所占的面积在所述视频播放区域的总面积中的占比。在所述AR场景画面中包含的所述视频元素所占的面积在所述视频播放区域的总面积中的占比小于设定比例的情况下,在所述视频播放区域停止播放加载的所述视频资源。
示例性的,可以在所述AR场景画面中包含的所述视频元素所占的面积在所述视频播放区域的总面积中的占比小于50%的情况下,在所述视频播放区域停止播放加载的所述视频资源。
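播放之后按面积占比停止播放的逻辑,可以用一个简单的状态控制器示意;player为假设的播放接口,需提供play与pause方法。

```python
SET_RATIO = 0.5  # 设定比例, 沿用正文示例取50%

class PlaybackController:
    def __init__(self, player):
        self.player = player       # 假设的播放接口, 提供 play()/pause()
        self.playing = False

    def on_frame(self, area_ratio):
        """每帧根据视频元素的面积占比决定播放或停止播放."""
        if not self.playing and area_ratio >= SET_RATIO:
            self.player.play()
            self.playing = True
        elif self.playing and area_ratio < SET_RATIO:
            self.player.pause()
            self.playing = False
```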
在一种可能的实施方式中,AR场景画面中还展示有预设的控制按钮,用户可以在所述视频资源播放时,通过所述AR设备中预设的控制按钮,控制视频资源的暂停/播放。
在一些实施例中,在加载完所述视频资源后,在AR设备中对应的所述视频播放区域还展示有控制按钮,用于响应用户对所述控制按钮的触发操作,控制所述视频资源的暂停/播放。
示例性的,在视频播放区域播放加载的视频资源后,在AR设备检测到控制按钮被双击时,暂停播放所述视频资源;在AR设备检测到控制按钮被长按时,继续播放所述视频资源。
在另外一种可能的实施方式中,在视频播放区域播放加载的视频资源之后,还可以实时检测拍摄的目标场景图像中用户做出的手势,在检测到目标手势的情况下,可以暂停在视频播放区域播放的视频资源。
其中,在检测目标场景图像中用户做出的手势时,可以检测手部各个预设位置点在目标场景图像中的位置信息,基于各个预设位置点在目标场景图像中的位置信息,确定各个预设位置点之间的相对位置关系,然后基于确定的相对位置关系,识别目标场景图像中用户做出的手势。
示例性的,所述手部的预设位置点可以是各个手指的指尖、关节点、手腕等。
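基于手部预设位置点识别目标手势的一个极简示例如下,仅为示意性的几何规则:假设已得到指尖与手腕关键点的图像坐标,用五个指尖均明显高于手腕这一条件近似判断目标手势,实际实现可替换为任意手势识别方法。

```python
def is_target_gesture(keypoints, margin=40):
    """keypoints: {"wrist": (x, y), "tips": [五个指尖的(x, y)]}, 图像坐标系中y向下; margin为假设的像素阈值."""
    wrist_y = keypoints["wrist"][1]
    # 五个指尖均明显高于手腕(y值更小), 近似认为用户做出了目标手势
    return all(tip[1] < wrist_y - margin for tip in keypoints["tips"])

def handle_gesture(keypoints, player):
    """检测到目标手势时, 暂停视频播放区域中正在播放的视频资源; player为假设的播放接口."""
    if is_target_gesture(keypoints):
        player.pause()
```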
基于上述方法,可以在AR设备与视频播放区域之间的相对位姿关系满足预设条件的情况下,预加载视频资源,采用这种方式,不仅用于播放视频资源的视频播放区域可以无需承载在目标场景中的实体播放设备上,也无需实际占用目标场景中的位置空间,从而可以节省现实的位置空间资源和设备资源,而且,通过预先加载视频播放区域对应的视频资源,可以在需要进行视频播放时,直接播放预先加载的视频资源,由于视频资源预先加载到本地,可以降低网络环境对于视频加载的影响,减少视频播放过程中出现卡顿的情况,提升视频播放的流畅性。
本领域技术人员可以理解,在实施方式的上述方法中,各步骤的撰写顺序并不意味着严格的执行顺序而对实施过程构成任何限定,各步骤的执行顺序应当以其功能和可能的内在逻辑确定。
基于同一发明构思,本公开实施例中还提供了与视频资源处理方法对应的视频资源处理装置,由于本公开实施例中的装置解决问题的原理与本公开实施例上述视频资源处理方法相似,因此装置的实施可以参见方法的实施,重复之处不再赘述。
参照图4所示,为本公开实施例提供的一种视频资源处理装置的架构示意图,所述视频资源处理装置400包括:第一确定模块401、第二确定模块402、加载模块403;其中,
第一确定模块401,配置为基于AR设备实时拍摄的目标场景图像,确定所述AR设备的第一位姿信息;
第二确定模块402,配置为根据所述第一位姿信息和预设的视频播放区域在目标场景对应的三维场景模型中的第二位姿信息,确定所述AR设备与所述视频播放区域之间的相对位姿关系;
加载模块403,配置为响应于所述相对位姿关系满足预设条件,加载所述视频播放区域对应的视频资源。
一种可能的实施方式中,所述相对位姿关系包括相对距离;
所述加载模块403,在响应于所述相对位姿关系满足预设条件,加载所述视频播放区域对应的视频资源时,配置为:
响应于所述AR设备与所述视频播放区域之间的相对距离满足预设条件,加载所述视频播放区域对应的视频资源。
一种可能的实施方式中,所述加载模块403,在响应于所述AR设备与所述视频播放区域之间的相对距离满足预设条件,加载所述视频播放区域对应的视频资源时,配置为:
在所述AR设备与所述视频播放区域之间的相对距离小于设定距离的情况下,加载所述视频播放区域对应的视频资源。
一种可能的实施方式中,所述装置还包括播放模块404,配置为:
在加载所述视频播放区域对应的视频资源之后,确定所述视频资源对应的视频元素;在展示的AR场景画面中包含所述视频元素的情况下,在所述视频播放区域播放加载的所述视频资源。
一种可能的实施方式中,所述播放模块404,在展示的AR场景画面中包含所述视频元素的情况下,在所述视频播放区域播放加载的所述视频资源时,配置为:
确定所述AR场景画面中包含的所述视频元素所占的面积;在所述面积在所述视频播放区域的总面积中的占比大于或等于设定比例的情况下,在所述视频播放区域播放加载的所述视频资源。
一种可能的实施方式中,所述播放模块404,在所述展示的AR场景画面中包含所述视频元素的情况下,在所述视频播放区域播放加载的所述视频资源时,配置为:
在展示的AR场景画面中包含所述视频元素的情况下,在所述相对位姿关系中,获取所述AR设备的拍摄方向与朝向所述视频播放区域的方向之间的夹角;在所述夹角在设定角度范围内的情况下,在所述视频播放区域播放加载的所述视频资源。
一种可能的实施方式中,所述视频资源处理装置400,还包括播放控制模块405,在所述视频播放区域播放加载的所述视频资源之后,所述播放控制模块405,配置为:
在所述AR场景画面中包含的所述视频元素所占的面积在所述视频播放区域的总面积中的占比小于设定比例的情况下,在所述视频播放区域停止播放加载的所述视频资源。
一种可能的实施方式中,所述第一确定模块401,在所述基于AR设备实时拍摄的目标场景图像,确定所述AR设备的第一位姿信息时,配置为:
基于所述目标场景图像,以及预先构建的所述目标场景对应的三维场景模型,确定所述AR设备的第一位姿信息。
一种可能的实施方式中,所述视频播放区域至少包括以下之一:位于所述目标场景中的至少一个目标展示对象上的视频播放区域、位于所述目标场景中的虚拟播放设备对应的视频播放区域。
一种可能的实施方式中,所述加载模块403,在所述加载所述视频播放区域对应的视频资源时,配置为:
根据所述视频播放区域对应的区域标识信息,加载与所述区域标识信息绑定的视频资源。
一种可能的实施方式中,所述加载模块403,在所述根据所述视频播放区域对应的区域标识信息,加载与所述区域标识信息绑定的视频资源时,配置为:
根据所述视频播放区域对应的区域标识信息,确定与所述区域标识信息绑定的多个视频资源;
根据所述多个视频资源对应的播放时间段,在所述多个视频资源中,选择加载与当前时间对应的视频资源。
关于装置中的各模块的处理流程、以及各模块之间的交互流程的描述可以参照上述方法实施例中的相关说明,这里不再详述。
基于上述装置,可以在AR设备与视频播放区域之间的相对位姿关系满足预设条件的情况下,预加载视频资源,即不仅用于播放视频资源的视频播放区域可以无需承载在目标场景中的实体播放设备上,也无需实际占用目标场景中的位置空间,从而可以节省现实的位置空间资源和设备资源,而且,通过预先加载视频播放区域对应的视频资源,可以在需要进行视频播放时,直接播放预先加载的视频资源,由于视频资源预先加载到本地,可以降低网络环境对于视频加载的影响,减少视频播放过程中出现卡顿的情况,提升视频播放的流畅性。
基于同一技术构思,本公开实施例还提供了一种计算机设备。参照图5所示,为本公开实施例提供的计算机设备500的结构示意图,包括处理器501、存储器502和总线503。其中,存储器502用于存储执行指令,包括内存5021和外部存储器5022;这里的内存5021也称内存储器,用于暂时存放处理器501中的运算数据,以及与硬盘等外部存储器5022交换的数据,处理器501通过内存5021与外部存储器5022进行数据交换,在计算机设备500运行的情况下,处理器501与存储器502之间通过总线503通信,使得处理器501执行以下指令:
基于AR设备实时拍摄的目标场景图像,确定所述AR设备的第一位姿信息;
根据所述第一位姿信息和预设的视频播放区域在目标场景对应的三维场景模型中的第二位姿信息,确定所述AR设备与所述视频播放区域之间的相对位姿关系;
响应于所述相对位姿关系满足预设条件,加载所述视频播放区域对应的视频资源。
本公开实施例还提供一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行上述方法实施例中所述的视频资源处理方法的步骤。其中,该存储介质可以是易失性或非易失的计算机可读取存储介质。
本公开实施例所提供的视频资源处理方法的计算机程序产品,包括存储了程序代码的计算机可读存储介质,所述程序代码包括的指令可用于执行上述方法实施例中所述的视频资源处理方法的步骤,可参见上述方法实施例,在此不再赘述。
本公开实施例还提供一种计算机程序,该计算机程序被处理器执行时实现前述实施例的任意一种视频资源处理方法。该计算机程序产品可以通过硬件、软件或其结合的方式实现。在一个可选实施例中,所述计算机程序产品体现为计算机存储介质,在另一个可选实施例中,计算机程序产品体现为软件产品,例如软件开发包(Software Development Kit,SDK)等等。
本公开涉及增强现实领域,通过获取现实环境中的目标对象的图像信息,进而借助各类视觉相关算法实现对目标对象的相关特征、状态及属性进行检测或识别处理,从而得到与具体应用匹配的虚拟与现实相结合的AR效果。示例性的,目标对象可涉及与人体相关的脸部、肢体、手势、动作等,或者与物体相关的标识物、标志物,或者与场馆或场所相关的沙盘、展示区域或展示物品等。视觉相关算法可涉及视觉定位、SLAM、三维重建、图像注册、背景分割、对象的关键点提取及跟踪、对象的位姿或深度检测等。具体应用不仅可以涉及跟真实场景或物品相关的导览、导航、讲解、重建、虚拟效果叠加展示等交互场景,还可以涉及与人相关的特效处理,比如妆容美化、肢体美化、特效展示、虚拟模型展示等交互场景。
可通过卷积神经网络,实现对目标对象的相关特征、状态及属性进行检测或识别处理。上述卷积神经网络是基于深度学习框架进行模型训练而得到的网络模型。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统和装置的工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。在本公开所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,又例如,多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些通信接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本公开各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个处理器可执行的非易失的计算机可读取存储介质中。基于这样的理解,本公开的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本公开各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
最后应说明的是:以上所述实施例,仅为本公开的实施方式,用以说明本公开的技术方案,而非对其限制,本公开的保护范围并不局限于此,尽管参照前述实施例对本公开进行了详细的说明,本领域的普通技术人员应当理解:任何熟悉本技术领域的技术人员在本公开揭露的技术范围内,其依然可以对前述实施例所记载的技术方案进行修改或可轻易想到变化,或者对其中部分技术特征进行等同替换;而这些修改、变化或者替换,并不使相应技术方案的本质脱离本公开实施例技术方案的精神和范围,都应涵盖在本公开的保护范围之内。因此,本公开的保护范围应以权利要求的保护范围为准。
工业实用性
本公开实施例提供一种视频资源处理方法、装置、存储介质、设备及程序,该方法包括:基于AR设备实时拍摄的目标场景图像,确定所述AR设备的第一位姿信息;根据所述第一位姿信息和预设的视频播放区域在目标场景对应的三维场景模型中的第二位姿信息,确定所述AR设备与所述视频播放区域之间的相对位姿关系;响应于所述相对位姿关系满足预设条件,加载所述视频播放区域对应的视频资源。

Claims (15)

  1. 一种视频资源处理方法,所述方法由电子设备执行,所述方法包括:
    基于AR设备实时拍摄的目标场景图像,确定所述AR设备的第一位姿信息;
    根据所述第一位姿信息和预设的视频播放区域在目标场景对应的三维场景模型中的第二位姿信息,确定所述AR设备与所述视频播放区域之间的相对位姿关系;
    响应于所述相对位姿关系满足预设条件,加载所述视频播放区域对应的视频资源。
  2. 根据权利要求1所述的方法,其中,所述相对位姿关系包括相对距离;
    所述响应于所述相对位姿关系满足预设条件,加载所述视频播放区域对应的视频资源,包括:
    响应于所述AR设备与所述视频播放区域之间的相对距离满足预设条件,加载所述视频播放区域对应的视频资源。
  3. 根据权利要求2所述的方法,其中,所述响应于所述AR设备与所述视频播放区域之间的相对距离满足预设条件,加载所述视频播放区域对应的视频资源,包括:
    在所述AR设备与所述视频播放区域之间的相对距离小于设定距离的情况下,加载所述视频播放区域对应的视频资源。
  4. 根据权利要求1至3任一所述的方法,其中,在所述加载所述视频播放区域对应的视频资源之后,所述方法还包括:
    确定所述视频资源对应的视频元素;
    在展示的AR场景画面中包含所述视频元素的情况下,在所述视频播放区域播放加载的所述视频资源。
  5. 根据权利要求4所述的方法,其中,所述在展示的AR场景画面中包含所述视频元素的情况下,在所述视频播放区域播放加载的所述视频资源,包括:
    确定所述AR场景画面中包含的所述视频元素所占的面积;
    在所述面积在所述视频播放区域的总面积中的占比大于或等于设定比例的情况下,在所述视频播放区域播放加载的所述视频资源。
  6. 根据权利要求4或5所述的方法,其中,所述在展示的AR场景画面中包含所述视频元素的情况下,在所述视频播放区域播放加载的所述视频资源,包括:
    在展示的AR场景画面中包含所述视频元素的情况下,在所述相对位姿关系中,获取所述AR设备的拍摄方向与朝向所述视频播放区域的方向之间的夹角;
    在所述夹角在设定角度范围内的情况下,在所述视频播放区域播放加载的所述视频资源。
  7. 根据权利要求4至6任一所述的方法,其中,在所述视频播放区域播放加载的所述视频资源之后,还包括:
    在所述AR场景画面中包含的所述视频元素所占的面积在所述视频播放区域的总面积中的占比小于设定比例的情况下,在所述视频播放区域停止播放加载的所述视频资源。
  8. 根据权利要求1至7任一所述的方法,其中,所述基于AR设备实时拍摄的目标场景图像,确定所述AR设备的第一位姿信息,包括:
    基于所述目标场景图像,以及预先构建的所述目标场景对应的三维场景模型,确定所述AR设备的第一位姿信息。
  9. 根据权利要求1至8任一所述的方法,其中,所述视频播放区域至少包括以下之一:位于所述目标场景中的至少一个目标展示对象上的视频播放区域、位于所述目标场景中的虚拟播放设备对应的视频播放区域。
  10. 根据权利要求1至9任一所述的方法,其中,所述加载所述视频播放区域对应的视频资源,包括:
    根据所述视频播放区域对应的区域标识信息,加载与所述区域标识信息绑定的视频资源。
  11. 根据权利要求10所述的方法,其中,所述根据所述视频播放区域对应的区域标识信息,加载与所述区域标识信息绑定的视频资源,包括:
    根据所述视频播放区域对应的区域标识信息,确定与所述区域标识信息绑定的多个视频资源;
    根据所述多个视频资源对应的播放时间段,在所述多个视频资源中,选择加载与当前时间对应的视频资源。
  12. 一种视频资源处理装置,包括:
    第一确定模块,配置为基于AR设备实时拍摄的目标场景图像,确定所述AR设备的第一位姿信息;
    第二确定模块,配置为根据所述第一位姿信息和预设的视频播放区域在目标场景对应的三维场景模型中的第二位姿信息,确定所述AR设备与所述视频播放区域之间的相对位姿关系;
    加载模块,配置为响应于所述相对位姿关系满足预设条件,加载所述视频播放区域对应的视频资源。
  13. 一种计算机设备,包括:处理器、存储器和总线,所述存储器存储有所述处理器可执行的机器可读指令,在计算机设备运行的情况下,所述处理器与所述存储器之间通过总线通信,所述机器可读指令被所述处理器执行时执行如权利要求1至11任一所述的视频资源处理方法。
  14. 一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行如权利要求1至11任一所述的视频资源处理方法。
  15. 一种计算机程序,所述计算机程序包括计算机可读代码,在所述计算机可读代码在电子设备中运行的情况下,所述电子设备的处理器执行用于实现权利要求1至11任一所述的视频资源处理方法。
PCT/CN2021/114547 2021-02-02 2021-08-25 视频资源处理方法、装置、计算机设备、存储介质及程序 WO2022166173A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110145949.0 2021-02-02
CN202110145949.0A CN112954437B (zh) 2021-02-02 2021-02-02 一种视频资源处理方法、装置、计算机设备及存储介质

Publications (1)

Publication Number Publication Date
WO2022166173A1 true WO2022166173A1 (zh) 2022-08-11

Family

ID=76241863

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/114547 WO2022166173A1 (zh) 2021-02-02 2021-08-25 视频资源处理方法、装置、计算机设备、存储介质及程序

Country Status (2)

Country Link
CN (1) CN112954437B (zh)
WO (1) WO2022166173A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117765839A (zh) * 2023-12-25 2024-03-26 广东保伦电子股份有限公司 一种室内智慧导览方法、装置及存储介质

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112954437B (zh) * 2021-02-02 2022-10-28 深圳市慧鲤科技有限公司 一种视频资源处理方法、装置、计算机设备及存储介质

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090300144A1 (en) * 2008-06-03 2009-12-03 Sony Computer Entertainment Inc. Hint-based streaming of auxiliary content assets for an interactive environment
CN110992859A (zh) * 2019-11-22 2020-04-10 北京新势界科技有限公司 一种基于ar导视的广告牌展示方法及装置
CN111653175A (zh) * 2020-06-09 2020-09-11 浙江商汤科技开发有限公司 一种虚拟沙盘展示方法及装置
CN111651051A (zh) * 2020-06-10 2020-09-11 浙江商汤科技开发有限公司 一种虚拟沙盘展示方法及装置
CN112287928A (zh) * 2020-10-20 2021-01-29 深圳市慧鲤科技有限公司 一种提示方法、装置、电子设备及存储介质
CN112288459A (zh) * 2020-01-21 2021-01-29 华为技术有限公司 一种广告的多屏协同方法及设备
CN112333498A (zh) * 2020-10-30 2021-02-05 深圳市慧鲤科技有限公司 一种展示控制方法、装置、计算机设备及存储介质
CN112954437A (zh) * 2021-02-02 2021-06-11 深圳市慧鲤科技有限公司 一种视频资源处理方法、装置、计算机设备及存储介质

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105898472A (zh) * 2015-11-30 2016-08-24 乐视网信息技术(北京)股份有限公司 视频播放控制方法、设备、系统以及客户端设备
CN107493497A (zh) * 2017-07-27 2017-12-19 努比亚技术有限公司 一种视频播放方法、终端和计算机可读存储介质
CN207337530U (zh) * 2017-10-23 2018-05-08 北京章鱼科技有限公司 一种新型智能展销终端
CN108304516A (zh) * 2018-01-23 2018-07-20 维沃移动通信有限公司 一种网络内容预加载方法及移动终端
CN108347657B (zh) * 2018-03-07 2021-04-20 北京奇艺世纪科技有限公司 一种显示弹幕信息的方法和装置
KR102574277B1 (ko) * 2018-05-04 2023-09-04 구글 엘엘씨 사용자와 자동화된 어시스턴트 인터페이스 간의 거리에 따른 자동화된 어시스턴트 콘텐츠의 생성 및/또는 적용
TWI672042B (zh) * 2018-06-20 2019-09-11 崑山科技大學 智慧型商品介紹系統及其方法
CN109990775B (zh) * 2019-04-11 2021-09-14 杭州简简科技有限公司 旅游地理定位方法及系统
CN110738737A (zh) * 2019-10-15 2020-01-31 北京市商汤科技开发有限公司 一种ar场景图像处理方法、装置、电子设备及存储介质

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090300144A1 (en) * 2008-06-03 2009-12-03 Sony Computer Entertainment Inc. Hint-based streaming of auxiliary content assets for an interactive environment
CN110992859A (zh) * 2019-11-22 2020-04-10 北京新势界科技有限公司 一种基于ar导视的广告牌展示方法及装置
CN112288459A (zh) * 2020-01-21 2021-01-29 华为技术有限公司 一种广告的多屏协同方法及设备
CN111653175A (zh) * 2020-06-09 2020-09-11 浙江商汤科技开发有限公司 一种虚拟沙盘展示方法及装置
CN111651051A (zh) * 2020-06-10 2020-09-11 浙江商汤科技开发有限公司 一种虚拟沙盘展示方法及装置
CN112287928A (zh) * 2020-10-20 2021-01-29 深圳市慧鲤科技有限公司 一种提示方法、装置、电子设备及存储介质
CN112333498A (zh) * 2020-10-30 2021-02-05 深圳市慧鲤科技有限公司 一种展示控制方法、装置、计算机设备及存储介质
CN112954437A (zh) * 2021-02-02 2021-06-11 深圳市慧鲤科技有限公司 一种视频资源处理方法、装置、计算机设备及存储介质

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117765839A (zh) * 2023-12-25 2024-03-26 广东保伦电子股份有限公司 一种室内智慧导览方法、装置及存储介质

Also Published As

Publication number Publication date
CN112954437A (zh) 2021-06-11
CN112954437B (zh) 2022-10-28

Similar Documents

Publication Publication Date Title
US9911231B2 (en) Method and computing device for providing augmented reality
US20140248950A1 (en) System and method of interaction for mobile devices
CN112243583B (zh) 多端点混合现实会议
CN110716645A (zh) 一种增强现实数据呈现方法、装置、电子设备及存储介质
KR101227255B1 (ko) 마커 크기 기반 인터렉션 방법 및 이를 구현하기 위한 증강 현실 시스템
US20120192088A1 (en) Method and system for physical mapping in a virtual world
CN112148197A (zh) 增强现实ar交互方法、装置、电子设备及存储介质
US20160012644A1 (en) Augmented Reality System and Method
WO2022166173A1 (zh) 视频资源处理方法、装置、计算机设备、存储介质及程序
CN112148189A (zh) 一种ar场景下的交互方法、装置、电子设备及存储介质
CN112348968B (zh) 增强现实场景下的展示方法、装置、电子设备及存储介质
US20150172634A1 (en) Dynamic POV Composite 3D Video System
US11880999B2 (en) Personalized scene image processing method, apparatus and storage medium
US11087545B2 (en) Augmented reality method for displaying virtual object and terminal device therefor
KR20230044401A (ko) 확장 현실을 위한 개인 제어 인터페이스
US10818089B2 (en) Systems and methods to provide a shared interactive experience across multiple presentation devices
CN111527468A (zh) 一种隔空交互方法、装置和设备
CN111638797A (zh) 一种展示控制方法及装置
CN112637665B (zh) 增强现实场景下的展示方法、装置、电子设备及存储介质
CN112882576B (zh) Ar交互方法、装置、电子设备及存储介质
EP3172721B1 (en) Method and system for augmenting television watching experience
CN111833457A (zh) 图像处理方法、设备及存储介质
CN112905014A (zh) Ar场景下的交互方法、装置、电子设备及存储介质
WO2023030176A1 (zh) 视频处理方法、装置、计算机可读存储介质及计算机设备
CN112308977A (zh) 视频处理方法、视频处理装置和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21924172

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 10-11-2023)