CN117812406A - Video processing method and device and electronic equipment - Google Patents


Info

Publication number
CN117812406A
Authority
CN
China
Prior art keywords
view
video
play
views
angles
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211177351.0A
Other languages
Chinese (zh)
Inventor
何伟
孙黎阳
张傲阳
马茜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Douyin Vision Co Ltd
Lemon Inc Cayman Island
Original Assignee
Douyin Vision Co Ltd
Lemon Inc Cayman Island
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Douyin Vision Co Ltd and Lemon Inc Cayman Island
Priority to CN202211177351.0A
Publication of CN117812406A
Legal status: Pending


Abstract

The disclosure provides a video processing method and apparatus and an electronic device. The method includes the following steps: acquiring a plurality of first play delays of a plurality of users watching a first video and a plurality of first viewing angles of the plurality of users watching the first video, where a first play delay is the delay between the play progress of the first video watched by a user and the current push progress of the first video; performing view classification processing on the plurality of first viewing angles based on the plurality of first play delays and a preset interval duration to obtain a plurality of view sets, where the first play delay associated with a first viewing angle in a view set falls within the play delay interval associated with that set, the play delay interval being an interval determined based on the preset interval duration; determining, based on the first viewing angles in each view set, the fused viewing angle associated with that set, to obtain a plurality of fused viewing angles; and predicting, based on the plurality of fused viewing angles, the predicted viewing angle of the plurality of users over a future period. This improves the accuracy of user viewing-angle prediction.

Description

Video processing method and device and electronic equipment
Technical Field
The embodiments of the present disclosure relate to the technical field of image processing, and in particular to a video processing method, a video processing apparatus, and an electronic device.
Background
The terminal device may predict the user's viewing angle in the next period and then play the full-view video (e.g., panoramic video, virtual reality (VR) video, etc.) within that viewing angle.
Currently, the terminal device may acquire the user's real viewing angles in a historical period and use them to predict the viewing angle in a future period. For example, the terminal device may acquire a plurality of real viewing angles of the user in the historical period and perform linear regression processing on them to predict the viewing angle in the future period. However, part of the user's viewing angles in the historical period are unrelated to watching the full-view video (e.g., the user moves the viewing angle randomly), resulting in low accuracy of viewing-angle prediction.
Disclosure of Invention
The disclosure provides a video processing method, a video processing apparatus, and an electronic device to address the low accuracy of user viewing-angle prediction in the prior art.
In a first aspect, the present disclosure provides a video processing method, the method comprising:
acquiring a plurality of first play delays of a plurality of users watching a first video and a plurality of first viewing angles of the plurality of users watching the first video, where a first play delay is the delay between the play progress of the first video watched by a user and the current push progress of the first video;
performing view classification processing on the plurality of first viewing angles based on the plurality of first play delays and a preset interval duration to obtain a plurality of view sets, where the first play delay associated with a first viewing angle in a view set falls within the play delay interval associated with that set, the play delay interval being an interval determined based on the preset interval duration;
determining, based on the first viewing angles in each view set, the fused viewing angle associated with that set, to obtain a plurality of fused viewing angles; and
predicting, based on the plurality of fused viewing angles, the predicted viewing angle of the plurality of users over a future period.
In a second aspect, the present disclosure provides a video processing apparatus, including an acquisition module, a classification module, a determination module, and a prediction module, wherein:
the acquisition module is configured to acquire a plurality of first play delays of a plurality of users watching a first video and a plurality of first viewing angles of the plurality of users watching the first video, where a first play delay is the delay between the play progress of the first video watched by a user and the current push progress of the first video;
the classification module is configured to perform view classification processing on the plurality of first viewing angles based on the plurality of first play delays and a preset interval duration to obtain a plurality of view sets, where the first play delay associated with a first viewing angle in a view set falls within the play delay interval associated with that set, the play delay interval being an interval determined based on the preset interval duration;
the determination module is configured to determine, based on the first viewing angles in each view set, the fused viewing angle associated with that set, to obtain a plurality of fused viewing angles; and
the prediction module is configured to predict, based on the plurality of fused viewing angles, the predicted viewing angle of the plurality of users over a future period.
In a third aspect, an embodiment of the present disclosure provides an electronic device including: a processor and a memory;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored by the memory to cause the at least one processor to perform the video processing method as described above in the first aspect and the various possible aspects of the first aspect.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable storage medium having stored therein computer-executable instructions which, when executed by a processor, implement the video processing method as described in the first aspect and the various possible aspects of the first aspect above.
In a fifth aspect, embodiments of the present disclosure provide a computer program product comprising a computer program which, when executed by a processor, implements the video processing method as described above in the first aspect and the various possible aspects of the first aspect.
The disclosure provides a video processing method and apparatus and an electronic device. The electronic device acquires a plurality of first play delays of a plurality of users watching a first video and a plurality of first viewing angles of the plurality of users watching the first video; performs view classification processing on the plurality of first viewing angles based on the plurality of first play delays and a preset interval duration to obtain a plurality of view sets, where the first play delay associated with a first viewing angle in a view set falls within the play delay interval associated with that set, the play delay interval being an interval determined based on the preset interval duration; determines, based on the first viewing angles in each view set, the fused viewing angle associated with that set, to obtain a plurality of fused viewing angles; and predicts, based on the plurality of fused viewing angles, the predicted viewing angles of the plurality of users over a future period. Because the plurality of first viewing angles of the plurality of users accurately reflect the content of the first video that users pay attention to, the influence of viewing angles unrelated to watching the first video on the prediction is reduced, and the accuracy of user viewing-angle prediction is improved.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the solutions in the prior art, a brief description of the drawings needed in the description of the embodiments or the prior art is given below. It is obvious that the drawings described below show some embodiments of the present disclosure, and that a person of ordinary skill in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of a first video according to an embodiment of the disclosure;
Fig. 2 is a schematic diagram of a process for determining a first video according to an embodiment of the present disclosure;
Fig. 3 is a schematic diagram of a play delay according to an embodiment of the disclosure;
Fig. 4 is a schematic view of an application scenario provided in an embodiment of the present disclosure;
Fig. 5 is a schematic flow chart of a video processing method according to an embodiment of the disclosure;
Fig. 6 is a schematic view of a first view provided by an embodiment of the present disclosure;
Fig. 7 is a schematic diagram of a play delay interval according to an embodiment of the disclosure;
Fig. 8 is a flowchart of a method for determining a set of perspectives according to an embodiment of the present disclosure;
Fig. 9 is a schematic diagram of a process for determining a set of perspectives provided by an embodiment of the present disclosure;
Fig. 10 is a process schematic diagram of a video processing method according to an embodiment of the disclosure;
Fig. 11 is a schematic structural diagram of a video processing apparatus according to an embodiment of the present disclosure; and
Fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
In order to facilitate understanding, concepts related to the embodiments of the present disclosure are described below.
Terminal equipment: a device with wireless receiving and transmitting functions. The terminal device may be deployed on land (indoors or outdoors, hand-held, wearable, or vehicle-mounted) or on the water surface (e.g., on a ship). The terminal device may be a mobile phone, a tablet computer (Pad), a computer with wireless transceiving functions, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a vehicle-mounted terminal device, a wireless terminal in self-driving, a wireless terminal in remote medical care, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, a wearable terminal device, or the like. The terminal device in the embodiments of the present disclosure may also be referred to as a terminal, user equipment (UE), an access terminal device, a vehicle terminal, an industrial control terminal, a UE unit, a UE station, a mobile station, a remote terminal device, a mobile device, a UE terminal device, a wireless communication device, a UE proxy, a UE apparatus, or the like. The terminal device may be fixed or mobile.
First video: the first video may be a full-view video, i.e., a video that can be viewed from any viewing angle. For example, full-view video may include virtual reality (VR) video, panoramic video, 180-degree video, and the like. Optionally, the first video may be a live full-view video. For example, the first video may be a full-view video recorded in real time by a user through a terminal device; the terminal device may send the real-time recording to the electronic device, which in turn sends it to the terminal devices of other users.
Next, a first video of an embodiment of the present disclosure will be described with reference to fig. 1.
Fig. 1 is a schematic diagram of a first video according to an embodiment of the disclosure. Referring to fig. 1, a cube model and an image frame of a first video are shown. The first video may be a VR video, and the image frame of the first video includes a top surface area, a back surface area, a bottom surface area, a left side surface area, a front surface area, and a right side surface area. The image in the top surface area of the image frame may be mapped to the top surface of the cube model, the image in the back surface area to the back surface, the image in the bottom surface area to the bottom surface, the image in the left side surface area to the left side surface, the image in the front surface area to the front surface, and the image in the right side surface area to the right side surface of the cube model.
Optionally, the electronic device may receive a video playing request from the terminal device and determine the first video according to the video playing request. For example, the terminal device may generate a video playing request according to the user's trigger operation on the terminal device and send the request to the electronic device. Specifically, the terminal device may obtain the user's touch operation on a control of the first video, generate a video playing request corresponding to the first video, and send the video playing request to the electronic device.
Next, a process of determining the first video by the electronic device will be described with reference to fig. 2.
Fig. 2 is a schematic diagram of a process for determining a first video according to an embodiment of the disclosure. Referring to fig. 2, a terminal device and an electronic device are shown. The display page of the terminal device is a video playing page, which includes a control for video A, a control for video B, and a control for video C. When the user clicks the control for video A, the terminal device generates a video playing request including the identification of video A and sends it to the electronic device. After receiving the playing request for video A, the electronic device determines that the first video is video A.
First play delay: the first play delay may be the delay between the position of the first video the user is currently watching and the current push position of the first video. Because the hardware, network parameters, and network status of each user's terminal device differ, the delay with which each user's terminal device plays the first video also differs. For example, if the first video is a live video and the playing position of the live video watched by the user lags the currently pushed position by 3 seconds, the user's first play delay is 3 seconds. Equivalently, if the timestamp corresponding to the playing position watched by the user is timestamp A and the timestamp corresponding to the currently pushed position is timestamp B, the first play delay is the duration corresponding to the difference between timestamp B and timestamp A.
Next, a plurality of first play delays for a plurality of users to watch the first video will be described with reference to fig. 3.
Fig. 3 is a schematic diagram of a play delay according to an embodiment of the present disclosure. Referring to fig. 3, the figure shows the live push of the first video, the position watched by user A, and the position watched by user B. The current push in the live push stream of the first video is the latest pushed content. The delay between the position watched by user A and the current push position is play delay A, and the delay between the position watched by user B and the current push position is play delay B. Play delay A is user A's first play delay, and play delay B is user B's first play delay.
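The relationship in fig. 3 can be sketched in a few lines of code. This is a minimal illustration, not the patent's implementation; the function name and the use of seconds-based timestamps are assumptions.

```python
# Minimal sketch (assumed names and units): a user's first play delay is the
# gap between the live stream's current push timestamp and the timestamp of
# the position that user is currently watching, in seconds.

def first_play_delay(push_timestamp: float, play_timestamp: float) -> float:
    """Delay between the user's playback progress and the current push progress."""
    return max(0.0, push_timestamp - play_timestamp)

# The stream has pushed up to t=100.0s; user A watches the frame at t=97.0s
# and user B the frame at t=98.5s.
delay_a = first_play_delay(100.0, 97.0)   # play delay A: 3.0 seconds
delay_b = first_play_delay(100.0, 98.5)   # play delay B: 1.5 seconds
```

Each user's delay is computed independently, which matches the observation above that different hardware and network conditions give each user a different first play delay.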
In the related art, the terminal device may predict the user's viewing angle in the next period and then determine the full-view video content to be played in that period. Currently, the terminal device may acquire the user's real viewing angles in a historical period and predict the viewing angle in a future period from them, for example by performing linear regression processing on a plurality of historical real viewing angles. However, part of the user's viewing angles in the historical period are unrelated to watching the full-view video: when the user twists his or her neck, for example, the viewing angle moves even though the movement has nothing to do with the content being watched. Predicting the future viewing angle based on such viewing angles leads to low prediction accuracy.
In order to solve the problems in the related art, an embodiment of the present disclosure provides a video processing method. The electronic device obtains a plurality of first play delays of a plurality of users watching a first video and a plurality of first views of the plurality of users watching the first video, determines a plurality of play timestamps associated with the plurality of first views based on the plurality of first play delays, classifies the plurality of first views based on a preset interval duration and the plurality of play timestamps to obtain a plurality of view sets, determines the fused view associated with each view set based on the first views in that set to obtain a plurality of fused views, predicts the predicted view of the plurality of users in a future period based on the plurality of fused views, and sends the predicted view to the terminal device. Because a fused view is determined for each view set, the fused views accurately reflect the content of the first video that users pay attention to, which reduces the influence of views unrelated to watching the first video on view prediction and improves the accuracy of user view prediction.
Next, an application scenario of the embodiment of the present disclosure will be described with reference to fig. 4.
Fig. 4 is a schematic view of an application scenario provided in an embodiment of the present disclosure. Referring to fig. 4, an image frame of a first video, an electronic device, and a terminal device are shown. The first video may be a VR video, and the image frame of the first video includes a top surface area, a back surface area, a bottom surface area, a left side surface area, a front surface area, and a right side surface area.
Referring to fig. 4, if the user viewing angle predicted by the electronic device is located in the front area, the electronic device sends a video in the front area of the first video to the terminal device, and the terminal device receives the video in the front area and can play the video in the front area in the display screen. The electronic device can predict the predicted viewing angles of the users in the future period based on the first viewing angles of the users, so that the influence of the viewing angles irrelevant to the first video viewing on the viewing angle prediction is reduced, and the accuracy of the viewing angle prediction of the users is improved.
It should be noted that fig. 4 merely illustrates an exemplary application scenario of an embodiment of the present disclosure and does not limit the application scenarios of the embodiments of the present disclosure.
The following describes the technical solutions of the present disclosure and how the technical solutions of the present disclosure solve the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present disclosure will be described below with reference to the accompanying drawings.
Fig. 5 is a flowchart of a video processing method according to an embodiment of the disclosure. Referring to fig. 5, the method may include:
s501, acquiring a plurality of first playing delays for a plurality of users to watch the first video and a plurality of first viewing angles for the plurality of users to watch the first video.
The execution body of the embodiments of the disclosure may be an electronic device, or a video processing apparatus provided in the electronic device. The video processing apparatus may be implemented in software, or in a combination of software and hardware. Optionally, the electronic device may be any device with data processing and communication functions, for example a server.
Alternatively, the first viewing angle may include a viewing angle at which the user views the first video. For example, the first view may include a view center and a view range when the user views the first video. For example, when a user views a VR video of 360 degrees through a terminal device, if the user views a front area of the VR video, the first viewing angle of the user may be the front area or a part of the front area of the VR video (the viewing angle of the user may not cover all the front area), and if the user views a top area of the VR video, the first viewing angle of the user is the top area or a part of the top area of the VR video.
Next, a first viewing angle of the user will be described with reference to fig. 6.
Fig. 6 is a schematic view of a first view according to an embodiment of the disclosure. Referring to fig. 6, an image frame of a first video and a user's first view are shown. The first video may be a VR video, and the image frame may include a top surface area, a back surface area, a bottom surface area, a left side surface area, a front surface area, and a right side surface area. The user's first view covers a portion of the left side area and a portion of the front area of the image frame.
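A first view as described above (a view center plus a view range) can be modeled with a small data structure. The yaw/pitch parameterization and all field names below are illustrative assumptions, not the patent's data format.

```python
from dataclasses import dataclass

@dataclass
class FirstView:
    """Assumed model of a first view: a view center plus a view range."""
    yaw_deg: float     # horizontal angle of the view center
    pitch_deg: float   # vertical angle of the view center
    h_fov_deg: float   # horizontal view range (field of view)
    v_fov_deg: float   # vertical view range

# A view straddling the left side area and the front area, as in fig. 6:
# centered between the two faces, with an assumed 90-degree field of view.
view = FirstView(yaw_deg=-50.0, pitch_deg=0.0, h_fov_deg=90.0, v_fov_deg=90.0)
```

This also captures the earlier remark that a user's view may cover only part of a face of the VR video rather than the whole area.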
Optionally, the first play delay is a delay between a play progress of the first video watched by the user and a current push progress of the first video. For example, if the time stamp corresponding to the playing progress of the live video watched by the user is time stamp a, the time stamp corresponding to the current push progress of the live video is time stamp B, and the first playing delay is the time difference between time stamp B and time stamp a.
Optionally, for any one of the plurality of users, the electronic device may obtain the first play delay and the first viewing angle for the user to view the first video according to the following possible implementation manner: and receiving video playing information sent by the terminal equipment, and determining a first playing delay and a first viewing angle according to the video playing information. Alternatively, the video playing information may include a first playing delay and a first viewing angle for the user to view the first video. For example, the terminal device may periodically send, to the electronic device, a delay of the live video watched by the user and a viewing angle when the user watches the live video, and after receiving the video playing information, the electronic device may obtain the first playing delay and the first viewing angle.
S502, performing view classification processing on the first views based on the first play delays and the preset interval duration to obtain view sets.
Optionally, the set of views may include at least one first view. For example, the view angle set may include a first view angle of one user, or may include a first view angle of a plurality of users, which is not limited by the embodiments of the present disclosure.
Optionally, the view set is associated with a play delay interval, and the first play delay associated with each first view in the view set falls within that interval. For example, if the play delay interval associated with the view set is 0 seconds to 1 second (relative to the current push position of the first video), the delay with which the users corresponding to all first views in the set watch the first video is between 0 seconds and 1 second.
Optionally, the preset interval duration may be any duration. For example, the preset interval duration may be 1 second, 2 seconds, etc. Optionally, the preset interval duration may also be a duration determined according to the current network parameter. For example, if the current network state is good, the preset interval duration may be small, and if the current network state is poor, the preset interval duration is large. Alternatively, the preset interval duration may be any preset duration, which is not limited in the embodiments of the present disclosure. Optionally, the play delay interval is an interval determined based on a preset interval duration.
Optionally, the overlapping rate of the play delay interval between the multiple view-angle sets is less than or equal to a preset threshold. For example, if the preset threshold is 0.1 and the play delay interval associated with the view angle set a is 1 second to 2 seconds, the play delay interval associated with the view angle set B may be 0 second to 1.1 seconds, and the play delay interval associated with the view angle set C may be 1.9 seconds to 3 seconds.
Optionally, when the preset threshold is 0, the play delay intervals of the multiple view sets do not overlap. For example, the electronic device performs view classification processing on the first views according to the first play delays to obtain view set A, view set B, and view set C, where the play delay interval associated with view set A is 0 seconds to 1 second of delay, the interval associated with view set B may be 1 second to 2 seconds, and the interval associated with view set C may be 2 seconds to 4 seconds.
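For the non-overlapping case (preset threshold 0), the classification step can be sketched as bucketing each first view by which play delay interval its first play delay falls into. The function name, the dictionary representation of view sets, and the uniform interval layout are assumptions for illustration, not the patent's implementation.

```python
# Sketch of view classification (non-overlapping intervals, preset threshold 0):
# each first view goes into the view set whose play delay interval contains
# the view's first play delay.

def classify_views(views_with_delays, interval_duration):
    """Group (view, delay) pairs into view sets keyed by delay-interval index."""
    view_sets = {}
    for view, delay in views_with_delays:
        idx = int(delay // interval_duration)  # interval [idx*d, (idx+1)*d)
        view_sets.setdefault(idx, []).append(view)
    return view_sets

# With a 1-second preset interval duration, delays of 0.4s and 0.9s fall in
# the 0-1s interval, and a delay of 1.2s falls in the 1-2s interval.
sets = classify_views([("view_a", 0.4), ("view_b", 0.9), ("view_c", 1.2)], 1.0)
# sets == {0: ["view_a", "view_b"], 1: ["view_c"]}
```

The same grouping could be driven by play timestamps instead of delays, since the two differ only by the fixed current push timestamp.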
Next, a playback delay section of the view angle set will be described with reference to fig. 7.
Fig. 7 is a schematic diagram of a play delay interval according to an embodiment of the disclosure. Referring to fig. 7, the live push of the first video is shown. The current push in the live push stream is the latest pushed content. The play delay interval of view set A is the interval from 0 to 2 seconds of delay, and that of view set B is the interval from 2 to 4 seconds of delay.
S503, determining fusion view angles associated with each view angle set based on the first view angles in each view angle set, and obtaining a plurality of fusion view angles.
Optionally, the fused view may be a view obtained by fusing the first views within the view set. For any view set, the electronic device may determine its fused view based on the following possible implementation: obtain the view number of first views in the view set, and determine the fused view associated with the view set based on that number. For example, if view set A includes the first views of 10 users and view set B includes the first views of 15 users, the view number of view set A is 10 and that of view set B is 15.
Optionally, based on the number of views, determining the fusion view associated with the view set may include two cases:
case 1: the number of viewing angles is 1.
If the view number is 1, the first view in the view set is determined to be the fused view. For example, if the view set includes only one user's first view, the electronic device determines that first view as the fused view of the view set.
Case 2: the number of viewing angles is greater than 1.
If the view number is greater than 1, determining a plurality of view weights associated with the first views based on a plurality of first play delays associated with the first views, and determining a fusion view according to the first views and the view weights. Optionally, the view weight associated with the first view is a weight occupied by the first view in all first views in the view set. For example, the view set includes a first view a, a first view B, and a first view C, where the view weight of the first view a may be 0.5, the view weight of the first view B may be 0.3, and the view weight of the first view C may be 0.2.
Optionally, the electronic device may determine the plurality of view weights associated with the plurality of first views in the following possible implementation manner: determine a first time difference between the first play delay corresponding to each first view and the left end point of the play delay interval corresponding to the view set. For example, if the play delay interval associated with the view set is 1 second to 2 seconds of delay, the first play delay corresponding to first view A is 1.2 seconds, and the first play delay corresponding to first view B is 1.6 seconds, then the first time difference corresponding to first view A is 0.2 seconds and that corresponding to first view B is 0.6 seconds.
Optionally, a first preset relationship between the first time difference and the view angle weight is obtained. Optionally, the first preset relationship may include at least one time difference and a viewing angle weight corresponding to each time difference. For example, the first preset relationship may be as shown in table 1:
TABLE 1

    Time difference      View weight
    Time difference 1    Weight 1
    Time difference 2    Weight 2
    Time difference 3    Weight 3
    ……                   ……
It should be noted that table 1 illustrates the first preset relationship by way of example only, and is not limited to the first preset relationship.
Optionally, according to the first preset relationship and the first time difference, determining the view angle weight corresponding to the first view angle. For example, if the first time difference corresponding to the first view is time difference 1, the view weight of the first view in the view set is weight 1; if the first time difference corresponding to the first view is time difference 2, the view weight of the first view in the view set is weight 2; if the first time difference corresponding to the first view is time difference 3, the view weight of the first view in the view set is weight 3.
Optionally, the first play delay is inversely proportional to the view weight. That is, the larger the first play delay corresponding to a first view in the view set, the smaller the view weight corresponding to that first view; and the smaller the first play delay, the larger the view weight. For example, if the first play delay of a first view in the view set is larger, the first time difference corresponding to that first view is larger, meaning the first view is an older view, so its view weight is smaller; if the first play delay of a first view is smaller, the first time difference is smaller, meaning the first view is a newer view, so its view weight is larger.
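One hypothetical way to realize an inverse-proportional weighting consistent with the description above (the patent leaves the exact mapping to a preset relationship such as Table 1, so this normalized inverse is only an assumed sketch):

```python
def view_weights(time_diffs, eps=1e-6):
    """Assign each first view a weight inversely proportional to its first
    time difference, normalized so the weights sum to 1; newer views
    (smaller time differences) receive larger weights."""
    inverse = [1.0 / (d + eps) for d in time_diffs]
    total = sum(inverse)
    return [v / total for v in inverse]

# The 0.2 s (newer) view gets a larger weight than the 0.6 s (older) one.
weights = view_weights([0.2, 0.6])
```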
Optionally, determining the fused view according to the plurality of first views and the plurality of view weights specifically includes: determining a plurality of sub-views based on the plurality of first views and the plurality of view weights associated with them. For example, if the first video is a 360-degree full-view live video, each first view has a corresponding angle in space, and the sub-view corresponding to a first view can be determined from that angle and the first view's view weight. For example, if the angle of first view A in space is 30 degrees, the angle of first view B in space is 60 degrees, and the view weights of first view A and first view B are both 0.5, then the sub-view corresponding to first view A is 15 degrees and the sub-view corresponding to first view B is 30 degrees.
Optionally, the multiple sub-views are fused to obtain the fused view. For example, if the sub-view corresponding to first view A in the view set is 15 degrees and the sub-view corresponding to first view B is 30 degrees, the fused view associated with the view set is 45 degrees.
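A minimal sketch of the sub-view computation and fusion for the 360-degree example above (a plain weighted sum of angles, with names assumed for illustration):

```python
def fuse_views(angles_deg, weights):
    """Each sub-view is a first view's spatial angle scaled by its view
    weight; the fused view is the sum of the sub-views."""
    return sum(angle * weight for angle, weight in zip(angles_deg, weights))

# First view A at 30 degrees and first view B at 60 degrees, both with
# weight 0.5: sub-views are 15 and 30 degrees, fused view is 45 degrees.
fused = fuse_views([30.0, 60.0], [0.5, 0.5])
# fused -> 45.0
```

Note that a raw weighted sum of degrees ignores wrap-around at 360 degrees; a production implementation might average unit vectors instead.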
Optionally, when determining the fused view corresponding to the newest view set, that fused view may be determined from at least one first view in the newest view set alone. When determining the fused view corresponding to an older view set, because fused views of newer view sets can also influence it, the fused view of the older set may be determined based on both the fused view of a newer set (whose view weight may be a fixed or arbitrary value) and the older set's first views. For example, if view set A is newer than view set B (the play delay intervals of the two sets may or may not be adjacent), the fused view corresponding to view set A is fused view A, and view set B includes first view A and first view B, then the electronic device may determine the fused view of view set B based on fused view A, first view A, first view B, and the view weight corresponding to each view.
S504, predicting the predicted view angles of a plurality of users in a future period based on the plurality of fusion views.
Optionally, the electronic device may predict a predicted perspective of the plurality of users over a future period of time based on the plurality of fused perspectives. For example, the electronic device may perform a linear regression process (e.g., linear regression, monotonic interval linear regression, or weighted linear regression) on the plurality of fused perspectives to obtain predicted perspectives for the plurality of users over a future period of time.
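The plain linear-regression option mentioned above can be sketched with ordinary least squares over the fused views ordered oldest to newest (the one-step extrapolation and function name are illustrative assumptions):

```python
def predict_view(fused_angles, future_step=1):
    """Fit a least-squares line through the fused views (x = 0, 1, ... from
    oldest to newest) and extrapolate future_step past the newest point."""
    n = len(fused_angles)
    if n == 1:
        return fused_angles[0]
    mean_x = (n - 1) / 2
    mean_y = sum(fused_angles) / n
    slope = sum((x - mean_x) * (y - mean_y)
                for x, y in enumerate(fused_angles))
    slope /= sum((x - mean_x) ** 2 for x in range(n))
    intercept = mean_y - slope * mean_x
    return slope * (n - 1 + future_step) + intercept

# Fused views rising 5 degrees per step continue to 60 degrees.
predicted = predict_view([45.0, 50.0, 55.0])
```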
Optionally, the electronic device may process the plurality of fused views through a first model to obtain the predicted view of the plurality of users in a future period. For example, the electronic device may input the plurality of fused views into the first model, and the first model may output one predicted view corresponding to them. Optionally, the first model may be obtained by learning from multiple groups of samples, where each group of samples may include a plurality of sample fused views and the sample predicted view corresponding to them. For example, one group of samples may include sample fused view 1, sample fused view 2, sample fused view 3, and sample predicted view 1 corresponding to those three sample fused views.
Optionally, if the execution body of the embodiment of the present disclosure is an electronic device, after determining the predicted view of the plurality of users in a future period, the electronic device may send the predicted view to the terminal devices of the plurality of users, so that the terminal devices can use it to predict each user's view. For example, in practical application, after determining the predicted view, the electronic device may send it to each terminal device. A terminal device may then determine its user's view in the future period based on the predicted view together with the user's historical views collected by that terminal device. In this way the predicted view assists the terminal device in predicting the user's view in the future period, each user's view can be determined independently, and the accuracy of view prediction is improved.
Optionally, if the execution body of the embodiment of the present disclosure is an electronic device, after determining the predicted view of the plurality of users in a future period, the electronic device may determine, according to the predicted view, the target content the users will focus on in that period, obtain the video corresponding to the target content, and send it to the terminal devices for playback. This saves terminal device resources, reduces playback delay, and improves the accuracy and efficiency of video playback.
It should be noted that, the electronic device in the embodiment of the present disclosure is only described by way of example, and not limitation, and the execution subject in the embodiment of the present disclosure may be a terminal device or other devices with a data processing function.
The embodiment of the disclosure provides a video processing method. The electronic device obtains a plurality of first play delays of a plurality of users watching a first video and a plurality of first views of those users, and performs view classification processing on the plurality of first views based on the plurality of first play delays to obtain a plurality of view sets, where the overlap rate of the play delay intervals between the view sets is smaller than or equal to a preset threshold. For each view set, the electronic device obtains the view number of first views in the set: if the view number is 1, the first view in the set is determined to be the fused view; if the view number is greater than 1, a plurality of view weights associated with the first views are determined based on the first play delays associated with them, and the fused view is determined from the first views and the view weights. The predicted view of the plurality of users in a future period is then predicted based on the plurality of fused views. Because a fused view is determined for the first views of each view set, the content the users focus on in the first video can be accurately reflected through the fused views, the influence of views irrelevant to watching the first video on view prediction is reduced, and the accuracy of user view prediction is further improved.
On the basis of the embodiment shown in fig. 2, in the following, a method for performing view classification processing on a plurality of first views based on a plurality of first play delays and a preset interval duration to obtain a plurality of view sets in the video processing method is described with reference to fig. 8.
Fig. 8 is a flowchart of a method for determining a view angle set according to an embodiment of the present disclosure. Referring to fig. 8, the method includes:
S801, determining a plurality of play time stamps associated with a plurality of first views based on a plurality of first play delays.
Optionally, the play timestamp associated with a first view may be the timestamp of the picture of the first video that the user is currently watching. For example, if the picture played by the terminal device is the picture of the first video at time A, the play timestamp associated with the user's first view is the timestamp corresponding to time A; if the picture played is the picture of the first video at time B, the play timestamp associated with the user's first view is the timestamp corresponding to time B.
Optionally, the electronic device may determine the plurality of play timestamps associated with the plurality of first views based on the following feasible implementation: obtain the current play timestamp of the first video. For example, the current play timestamp may be the timestamp corresponding to the moment of the current push frame of the first video. For example, if the first video is a live video that has been recorded up to time A, the current play timestamp of the first video is the timestamp corresponding to time A; if it has been recorded up to time B, the current play timestamp is the timestamp corresponding to time B.
A plurality of play timestamps associated with the plurality of first views is determined based on the plurality of first play delays and the current play timestamp. Specifically, the difference between the current play timestamp and the first play delay corresponding to a first view is determined as the play timestamp associated with that first view. For example, if the current play timestamp is timestamp A and the first play delay corresponding to the first view is 3 seconds, the play timestamp associated with the first view is the timestamp 3 seconds before timestamp A.
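Step S801 thus reduces to one subtraction per user; a minimal sketch (timestamps as seconds since an arbitrary epoch, names assumed for illustration):

```python
def play_timestamps(current_play_ts, first_play_delays):
    """Play timestamp of each first view = current play timestamp of the
    first video minus that view's first play delay."""
    return [current_play_ts - delay for delay in first_play_delays]

# A delay of 3 seconds puts that user's view 3 seconds behind the push progress.
stamps = play_timestamps(100.0, [3.0, 0.5])
# stamps -> [97.0, 99.5]
```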
S802, classifying the first view angles based on a preset interval duration and a plurality of play time stamps to obtain a plurality of view angle sets.
Optionally, the electronic device may obtain the multiple view sets based on the following possible implementation: determine an arrangement order of the plurality of first views based on the plurality of play timestamps. For example, the larger the play timestamp corresponding to a first view, the earlier that first view appears in the order; the smaller the play timestamp, the later it appears. For example, if the play timestamp of first view A is timestamp A, the play timestamp of first view B is timestamp B, and timestamp A is greater than timestamp B, then first view A comes before first view B in the arrangement order.
Based on the preset interval duration, the first views are classified according to the arrangement order to obtain the multiple view sets, where the play delay interval corresponding to each view set spans the preset interval duration. For example, the first views of the plurality of users include a first view A, a first view B, a first view C, and a first view D; the play delay of first view A is 0.2 seconds, of first view B 0.5 seconds, of first view C 1.3 seconds, and of first view D 1.6 seconds. If the preset interval duration is 1 second, the electronic device determines first view A and first view B as one view set, and first view C and first view D as another view set.
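The classification step can be sketched by bucketing delays into consecutive intervals of the preset duration (a simplified illustration; the patent sorts by play timestamp first, which yields the same grouping here):

```python
import math

def classify_views(first_views, play_delays, interval=1.0):
    """Place each first view into the view set whose play delay interval
    [k * interval, (k + 1) * interval) contains its first play delay."""
    buckets = {}
    for view, delay in zip(first_views, play_delays):
        buckets.setdefault(math.floor(delay / interval), []).append(view)
    return [buckets[k] for k in sorted(buckets)]

# Delays 0.2 s and 0.5 s fall in [0, 1); 1.3 s and 1.6 s fall in [1, 2).
groups = classify_views(["A", "B", "C", "D"], [0.2, 0.5, 1.3, 1.6])
# groups -> [["A", "B"], ["C", "D"]]
```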
Next, a process of determining the view angle set will be described with reference to fig. 9.
Fig. 9 is a schematic diagram of a process for determining a view set according to an embodiment of the disclosure. Referring to fig. 9, the first views include view A, view B, view C, view D, view E, and view F. The play delay of view A is 0.3 seconds, of view B 0.8 seconds, of view C 1.1 seconds, of view D 1.2 seconds, of view E 1.8 seconds, and of view F 2.5 seconds.
Referring to fig. 9, if the preset interval duration is 1 second, classifying the plurality of first views to obtain a view set a, a view set B, and a view set C. The view angle set A comprises a view angle A and a view angle B, the view angle set B comprises a view angle C, a view angle D and a view angle E, and the view angle set C comprises a view angle F. The play delay interval of the view angle set a is 0 to 1 second, the play delay interval of the view angle set B is 1 to 2 seconds, and the play delay interval of the view angle set C is 2 to 3 seconds. Therefore, the plurality of first visual angles can be accurately classified, and the accuracy of prediction of the visual angles of the user is further improved.
The embodiment of the disclosure provides a method for determining a view angle set, which is used for determining a plurality of play time stamps associated with a plurality of first view angles based on a plurality of first play delays, and classifying the plurality of first view angles based on a preset interval duration and the plurality of play time stamps to obtain the plurality of view angle sets. In this way, the electronic device can divide the plurality of first view angles into a plurality of view angle sets with equal continuous play delay intervals, thereby improving the accuracy of determining the view angle sets and improving the accuracy of view angle prediction.
On the basis of any one of the above embodiments, a procedure of the above video processing method will be described below with reference to fig. 10.
Fig. 10 is a process schematic diagram of a video processing method according to an embodiment of the disclosure. Please refer to fig. 10, which includes a terminal device and an electronic device. The display page of the terminal equipment is a video playing page. The video playing page comprises a control of a video A, a control of a video B and a control of a video C. When a user clicks a control of the video A, the terminal equipment generates a video playing request, wherein the video playing request comprises an identification of the video A, and the terminal equipment sends the playing request of the video A to the electronic equipment. After receiving the playing request of the video A, the electronic equipment determines that the first video is the video A.
Referring to fig. 10, the electronic device may acquire a plurality of first play delays and a plurality of first views of the plurality of users watching video A. The first views include view A, view B, view C, view D, view E, and view F. The play delay of view A is 0.3 seconds, of view B 0.8 seconds, of view C 1.1 seconds, of view D 1.2 seconds, of view E 1.8 seconds, and of view F 2.5 seconds.
Referring to fig. 10, if the preset interval duration is 1 second, classifying the plurality of first views to obtain a view set a, a view set B, and a view set C. The view angle set A comprises a view angle A and a view angle B, the view angle set B comprises a view angle C, a view angle D and a view angle E, and the view angle set C comprises a view angle F. The play delay interval of the view angle set a is 0 to 1 second, the play delay interval of the view angle set B is 1 to 2 seconds, and the play delay interval of the view angle set C is 2 to 3 seconds.
Referring to fig. 10, the electronic device obtains fused view A from views A and B in view set A, obtains fused view B from views C, D, and E in view set B, determines view F in view set C as fused view C, and predicts the predicted view of the plurality of users in a future period from fused view A, fused view B, and fused view C. The electronic device may send the predicted view to the terminal devices. In this way, the first views of the users can accurately reflect the content the users focus on in the first video, the influence of views irrelevant to watching the first video on view prediction is reduced, and the accuracy of user view prediction can be improved.
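The flow of fig. 10 can be condensed into one hedged end-to-end sketch (equal fusion weights per set are a simplification of the patent's delay-based weights; all names are illustrative):

```python
import math

def predict_from_users(view_angles, play_delays, interval=1.0, future_step=1):
    """Classify first views by play delay interval, fuse each view set
    (here with equal weights), then extrapolate a least-squares line
    through the fused views ordered oldest to newest."""
    buckets = {}
    for angle, delay in zip(view_angles, play_delays):
        buckets.setdefault(math.floor(delay / interval), []).append(angle)
    # A larger delay index means an older view set, so reverse-sort
    # to order the fused views oldest first.
    fused = [sum(v) / len(v) for _, v in sorted(buckets.items(), reverse=True)]
    n = len(fused)
    if n == 1:
        return fused[0]
    mean_x, mean_y = (n - 1) / 2, sum(fused) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(fused))
    slope /= sum((x - mean_x) ** 2 for x in range(n))
    return slope * (n - 1 + future_step) + (mean_y - slope * mean_x)

# Newer view sets trend toward larger angles; the trend is extrapolated.
prediction = predict_from_users([10.0, 20.0, 30.0], [2.5, 1.5, 0.5])
```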
Fig. 11 is a schematic structural diagram of a video processing apparatus according to an embodiment of the disclosure. Referring to fig. 11, the video processing apparatus 110 includes an acquisition module 111, a classification module 112, a determination module 113, and a prediction module 114, wherein:
the obtaining module 111 is configured to obtain a plurality of first play delays of a plurality of users watching a first video and a plurality of first viewing angles when the plurality of users watch the first video, where the first play delay is a delay between a play progress of the first video watched by the user and a current push progress of the first video;
The classifying module 112 is configured to perform view classifying processing on the plurality of first views based on the plurality of first play delays and a preset interval duration, so as to obtain a plurality of view sets, where a first play delay associated with a first view in the view sets is within a play delay interval associated with the view sets, and the play delay interval is an interval determined based on the preset interval duration;
the determining module 113 is configured to determine, based on the first view angle in each view angle set, a fusion view angle associated with each view angle set, so as to obtain a plurality of fusion views;
the prediction module 114 is configured to predict a predicted view of the plurality of users over a future period based on the plurality of fused views.
In accordance with one or more embodiments of the present disclosure, the classification module 112 is specifically configured to:
determining a plurality of play time stamps associated with the plurality of first perspectives based on the plurality of first play delays;
and classifying the plurality of first view angles based on the preset interval duration and the plurality of play time stamps to obtain the plurality of view angle sets.
In accordance with one or more embodiments of the present disclosure, the classification module 112 is specifically configured to:
Determining an arrangement order of the plurality of first views based on the plurality of play time stamps;
and classifying the first view angles according to the arrangement sequence based on the preset interval duration to obtain the view angle sets.
In accordance with one or more embodiments of the present disclosure, the classification module 112 is specifically configured to:
acquiring a current playing time stamp of the first video;
a plurality of play time stamps associated with the plurality of first perspectives is determined based on the plurality of first play delays and the current play time stamp.
In accordance with one or more embodiments of the present disclosure, the determining module 113 is specifically configured to:
acquiring the view angle number of the first view angle in the view angle set;
based on the number of views, a blended view associated with the set of views is determined.
In accordance with one or more embodiments of the present disclosure, the determining module 113 is specifically configured to:
if the view number is 1, determining the first view in the view set as the fusion view;
if the view number is greater than 1, determining a plurality of view weights associated with a plurality of first views based on a plurality of first play delays associated with the plurality of first views; and determining the fusion view according to the first view angles and the view weights.
In accordance with one or more embodiments of the present disclosure, the determining module 113 is specifically configured to:
determining a plurality of sub-views based on the plurality of first views and a plurality of view weights associated with the plurality of first views;
and carrying out fusion processing on the plurality of sub-view angles to obtain the fusion view angle.
In accordance with one or more embodiments of the present disclosure, the first play delay is inversely proportional to the viewing angle weight.
The video processing device provided in the embodiments of the present disclosure may be used to execute the technical solutions of the embodiments of the methods, and the implementation principle and the technical effects are similar, and are not repeated here.
Fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure. Referring to fig. 12, a schematic diagram of an electronic device 1200 suitable for implementing embodiments of the present disclosure is shown; the electronic device 1200 may be, for example, a terminal device or a server. Terminal devices may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (Personal Digital Assistant, PDA for short), tablet computers (Portable Android Device, PAD for short), portable multimedia players (Portable Media Player, PMP for short), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 12 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 12, the electronic apparatus 1200 may include a processing device (e.g., a central processor, a graphics processor, etc.) 1201, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1202 or a program loaded from a storage device 1208 into a random access Memory (Random Access Memory, RAM) 1203. In the RAM 1203, various programs and data required for the operation of the electronic apparatus 1200 are also stored. The processing device 1201, the ROM 1202, and the RAM 1203 are connected to each other through a bus 1204. An input/output (I/O) interface 1205 is also connected to the bus 1204.
In general, the following devices may be connected to the I/O interface 1205: input devices 1206 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 1207 including, for example, a liquid crystal display (Liquid Crystal Display, LCD for short), a speaker, a vibrator, and the like; storage 1208 including, for example, magnetic tape, hard disk, etc.; and a communication device 1209. The communication means 1209 may allow the electronic device 1200 to communicate wirelessly or by wire with other devices to exchange data. While fig. 12 shows an electronic device 1200 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 1209, or installed from the storage device 1208, or installed from the ROM 1202. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 1201.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform the methods shown in the above-described embodiments.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or electronic device. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (Local Area Network, LAN for short) or a wide area network (Wide Area Network, WAN for short), or it may be connected to an external computer (e.g., connected via the internet using an internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The name of the unit does not in any way constitute a limitation of the unit itself, for example the first acquisition unit may also be described as "unit acquiring at least two internet protocol addresses".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
It should be noted that references to "a" and "an" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
It will be appreciated that, prior to using the technical solutions disclosed in the embodiments of the present disclosure, the user should be informed, in an appropriate manner and in accordance with the relevant legal regulations, of the type, scope of use, and usage scenarios of the personal information involved in the present disclosure, and the user's authorization should be obtained.
For example, in response to receiving an active request from a user, prompt information is sent to the user to explicitly inform the user that the operation the user is requesting will require obtaining and using the user's personal information. The user can thus autonomously choose, according to the prompt information, whether to provide personal information to the software or hardware, such as a terminal device, application program, electronic device, or storage medium, that executes the operations of the technical solution of the present disclosure.
As an alternative but non-limiting implementation, in response to receiving an active request from the user, the prompt information may be sent to the user by way of, for example, a popup window, in which the prompt information may be presented as text. In addition, the popup window may carry a selection control allowing the user to choose "agree" or "disagree" to providing personal information to the terminal device.
It will be appreciated that the above-described notification and user authorization process is merely illustrative and not limiting of the implementations of the present disclosure, and that other ways of satisfying relevant legal regulations may be applied to the implementations of the present disclosure.
It will be appreciated that the data involved in the present technical solution (including but not limited to the data itself and its acquisition or use) should comply with the corresponding legal regulations and the requirements of the relevant provisions. The data may include information, parameters, messages, etc., such as tangential flow indication information.
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the disclosure referred to herein is not limited to the specific combinations of features described above, but also covers other embodiments formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure. For example, the above features may be interchanged with (but are not limited to) technical features having similar functions disclosed in the present disclosure.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (12)

1. A video processing method, comprising:
acquiring a plurality of first play delays of a plurality of users watching a first video and a plurality of first views of the plurality of users watching the first video, wherein each first play delay is a delay between the play progress of the first video watched by a user and the current push progress of the first video;
performing view classification processing on the plurality of first views based on the plurality of first play delays and a preset interval duration to obtain a plurality of view sets, wherein the first play delay associated with a first view in a view set is within a play delay interval associated with the view set, and the play delay interval is an interval determined based on the preset interval duration;
determining a fusion view associated with each view set based on the first views within the view set, to obtain a plurality of fusion views;
and predicting, based on the plurality of fusion views, a predicted view of the plurality of users over a future period of time.
2. The method of claim 1, wherein the performing view classification processing on the plurality of first views based on the plurality of first play delays and a preset interval duration to obtain a plurality of view sets includes:
determining a plurality of play time stamps associated with the plurality of first views based on the plurality of first play delays;
and classifying the plurality of first view angles based on the preset interval duration and the plurality of play time stamps to obtain the plurality of view angle sets.
3. The method according to claim 2, wherein the classifying the plurality of first views based on the preset interval duration and the plurality of play time stamps to obtain the plurality of view sets includes:
determining an arrangement order of the plurality of first views based on the plurality of play time stamps;
and classifying the first view angles according to the arrangement sequence based on the preset interval duration to obtain the view angle sets.
4. The method of claim 2, wherein the determining a plurality of play time stamps associated with the plurality of first views based on the plurality of first play delays comprises:
acquiring a current playing time stamp of the first video;
and determining, based on the plurality of first play delays and the current play time stamp, a plurality of play time stamps associated with the plurality of first views.
5. The method of any of claims 1-4, wherein, for any one view set, the determining, based on the first views within the view set, a fusion view associated with the view set comprises:
acquiring a number of views of the first views in the view set;
and determining, based on the number of views, the fusion view associated with the view set.
6. The method of claim 5, wherein the determining, based on the number of views, the fusion view associated with the view set comprises:
if the number of views is 1, determining the first view in the view set as the fusion view;
and if the number of views is greater than 1, determining a plurality of view weights associated with the plurality of first views based on a plurality of first play delays associated with the plurality of first views, and determining the fusion view according to the plurality of first views and the plurality of view weights.
7. The method of claim 6, wherein the determining the fusion view according to the plurality of first views and the plurality of view weights comprises:
determining a plurality of sub-views based on the plurality of first views and the plurality of view weights associated with the plurality of first views;
and performing fusion processing on the plurality of sub-views to obtain the fusion view.
8. The method of claim 6 or 7, wherein the first play delay is inversely proportional to the view weight.
9. A video processing device, comprising an acquisition module, a classification module, a determination module, and a prediction module, wherein:
the acquisition module is used for acquiring a plurality of first playing delays of a plurality of users watching a first video and a plurality of first viewing angles when the plurality of users watch the first video, wherein the first playing delays are delays between the playing progress of the first video watched by the users and the current push progress of the first video;
The classifying module is configured to perform view classifying processing on the plurality of first views based on the plurality of first play delays and a preset interval duration to obtain a plurality of view sets, where a first play delay associated with a first view in the view sets is within a play delay interval associated with the view sets, and the play delay interval is an interval determined based on the preset interval duration;
the determining module is used for determining fusion view angles associated with each view angle set based on the first view angles in each view angle set to obtain a plurality of fusion view angles;
the prediction module is used for predicting, based on the plurality of fusion views, the predicted views of the plurality of users over a future period of time.
10. An electronic device, comprising: a processor and a memory;
the memory stores computer-executable instructions;
the processor executing computer-executable instructions stored in the memory, causing the processor to perform the video processing method of any one of claims 1-8.
11. A computer readable storage medium having stored therein computer executable instructions which, when executed by a processor, implement the video processing method of any of claims 1-8.
12. A computer program product comprising a computer program which, when executed by a processor, implements the video processing method according to any one of claims 1-8.
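Read together, claims 1-8 describe a concrete pipeline: bucket users' current views by their play delay, fuse each bucket into one representative view (with weights inversely proportional to delay, per claim 8), and extrapolate the fused sequence into the future. The Python sketch below is a hypothetical illustration only: the function name `predict_views`, the `(yaw, pitch)` view representation, the `1/delay` weighting, and the linear extrapolation in the prediction step are all stand-ins for details the claims leave open.

```python
from collections import defaultdict

def predict_views(first_views, play_delays, interval, horizon=1.0):
    """Sketch of the claimed pipeline (all concrete choices hypothetical).

    first_views -- list of (yaw, pitch) viewing angles, one per user
    play_delays -- list of per-user play delays, in seconds
    interval    -- preset interval duration used to bucket the delays
    horizon     -- how far ahead, in interval units, to extrapolate
    """
    # Step 1 (claims 1-4): classify views into view sets. Each delay falls
    # into the bucket [k*interval, (k+1)*interval); views whose delays land
    # in the same bucket form one view set.
    view_sets = defaultdict(list)
    for view, delay in zip(first_views, play_delays):
        view_sets[int(delay // interval)].append((view, delay))

    # Step 2 (claims 5-8): fuse each view set into a single view. A set
    # with one view passes through unchanged; otherwise take a weighted
    # average with weights inversely proportional to the play delay.
    fused = []
    # Larger delay = earlier playback position, so iterate from the most
    # delayed bucket toward the most current one.
    for bucket in sorted(view_sets, reverse=True):
        members = view_sets[bucket]
        if len(members) == 1:
            fused.append(members[0][0])
            continue
        weights = [1.0 / (delay + 1e-6) for _, delay in members]
        total = sum(weights)
        yaw = sum(w * v[0] for (v, _), w in zip(members, weights)) / total
        pitch = sum(w * v[1] for (v, _), w in zip(members, weights)) / total
        fused.append((yaw, pitch))

    # Step 3 (last step of claim 1): predict a future view. The claims do
    # not fix a model; a two-point linear extrapolation stands in here.
    if len(fused) < 2:
        return fused[-1]
    (y0, p0), (y1, p1) = fused[-2], fused[-1]
    return (y1 + (y1 - y0) * horizon, p1 + (p1 - p0) * horizon)
```

For example, with delays of 0.5 s, 0.6 s, and 1.5 s and a 1-second preset interval, the first two users fall into one view set and the third into another; the two fused views then drive the extrapolation. In a real deployment, the prediction step would presumably be whatever learned or statistical model the description elaborates on, rather than this two-point extrapolation.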

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211177351.0A CN117812406A (en) 2022-09-26 2022-09-26 Video processing method and device and electronic equipment


Publications (1)

Publication Number Publication Date
CN117812406A (en) 2024-04-02

Family

ID=90432250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211177351.0A Pending CN117812406A (en) 2022-09-26 2022-09-26 Video processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN117812406A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination