CN110381353B - Video scaling method, device, server, client and storage medium - Google Patents


Info

Publication number
CN110381353B
CN110381353B (application CN201910696677.6A)
Authority
CN
China
Prior art keywords
video
video frame
frame
processed
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910696677.6A
Other languages
Chinese (zh)
Other versions
CN110381353A (en)
Inventor
尹小玉
杨德兴
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201910696677.6A priority Critical patent/CN110381353B/en
Publication of CN110381353A publication Critical patent/CN110381353A/en
Application granted granted Critical
Publication of CN110381353B publication Critical patent/CN110381353B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/485End-user interface for client configuration
    • H04N21/4858End-user interface for client configuration for modifying screen layout parameters, e.g. fonts, size of the windows
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/488Data services, e.g. news ticker
    • H04N21/4884Data services, e.g. news ticker for displaying subtitles

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Studio Circuits (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The disclosure relates to a video scaling method, a video scaling apparatus, a server, a client and a storage medium. The scheme comprises the following steps: the server obtains a video to be processed whose video frames include subtitle information; calculates subtitle proportion values of the subtitle information of the video to be processed in a plurality of preset directions; and sends the plurality of subtitle proportion values to the client. When the plurality of subtitle proportion values are all greater than or equal to a preset proportion threshold, the client scales the video frames of the video to be processed to determine a target video frame. By applying the technical scheme provided by the embodiments of the disclosure, the problem that subtitle information included in a video frame is completely or partially cut off when the video frame fills the screen of a device, resulting in a poor video viewing effect, is solved.

Description

Video scaling method, device, server, client and storage medium
Technical Field
The present disclosure relates to the field of video processing technologies, and in particular, to a video scaling method, apparatus, server, client, and storage medium.
Background
At present, when a video is played, in order to improve the immersion of video playing, a device scales the video frames proportionally so that the scaled video frames fill the screen of the device.
However, the aspect ratio of the screen differs between devices, and the aspect ratio of a video frame is not necessarily equal to that of the screen. If the two aspect ratios are unequal, then when the proportionally scaled video frame fills the screen of the device, part of the video frame must be cropped, and the cropped area can account for up to 17% of the total area of the video frame. Consequently, if the video frame includes subtitle information, the subtitle information is likely to be cut off entirely or partially, resulting in a poor video viewing effect.
Disclosure of Invention
The present disclosure provides a video scaling method, apparatus, server, client and storage medium, so as to at least solve the problem that subtitle information included in a video frame is completely or partially cut off when the video frame fills the screen of a device, resulting in a poor video viewing effect. The technical scheme of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a video scaling method, applied to a server, including:
acquiring a video to be processed, wherein a video frame of the video to be processed comprises subtitle information;
calculating caption proportion values of the caption information of the video to be processed in a plurality of preset directions;
sending the plurality of subtitle proportion values to a client, so that when the plurality of subtitle proportion values are all larger than or equal to a preset proportion threshold value, the client performs scaling processing on a video frame of the video to be processed, and determines a target video frame, wherein the length of the target video frame in the plurality of preset directions is the same as the length of a screen of the client in the plurality of preset directions.
Optionally, the calculating of the subtitle proportion values of the subtitle information of the video to be processed in a plurality of preset directions includes:
determining distances between caption information included in each video frame of the video to be processed and borders in a plurality of preset directions respectively to obtain a plurality of caption distances;
selecting the smallest subtitle distance from a plurality of subtitle distances corresponding to the frame in each preset direction as the frame distance corresponding to the frame in the preset direction;
and calculating the ratio of each frame distance to the length of the video frame of the video to be processed in the corresponding preset direction respectively to obtain a subtitle proportion value in each preset direction.
Optionally, each video frame of the video to be processed includes a plurality of subtitle information, and the step of determining distances between the subtitle information included in the video frame and borders in a plurality of preset directions to obtain a plurality of subtitle distances includes:
and determining the distance between each subtitle information and a frame in a plurality of preset directions according to the position coordinates and the length of the text frame in which the subtitle information is positioned in the plurality of preset directions, so as to obtain a plurality of subtitle distances.
Optionally, the original point of the video frame of the video to be processed is a pixel point at the top left corner of the video frame of the video to be processed, the position coordinate of the text box where each subtitle information is located in the preset direction includes a target coordinate of the pixel point at the top left corner of the text box where the subtitle information is located in the preset direction, and the length of the text box where each subtitle information is located in the preset direction includes a target length of the text box where the subtitle information is located in the preset direction;
the step of determining distances between the caption information and each frame in a plurality of preset directions according to the position coordinates and the lengths of the text box in which the caption information is located in the plurality of preset directions to obtain a plurality of caption distances includes:
determining the target coordinates of the text box in which the subtitle information is located in each preset direction as the subtitle distance between the subtitle information and the frame in the preset direction; or
determining the difference between the length of the video frame of the video to be processed in each preset direction and a target sum value as the subtitle distance between the subtitle information and the border in that preset direction, wherein the target sum value is the sum of the target coordinate and the target length of the text box in which the subtitle information is located in that preset direction.
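The optional steps above (a subtitle distance per text box and direction, the minimum per direction as the border distance, then the ratio to the frame length in that direction) can be sketched as follows. This is an illustrative reconstruction with assumed names (`subtitle_ratios`, `boxes`), not code from the patent.

```python
def subtitle_ratios(boxes, frame_w, frame_h):
    """Compute subtitle proportion values in four preset directions.

    boxes: list of (x, y, w, h) text boxes; the origin is the pixel at
    the top-left corner of the video frame, as described above.
    """
    left, right, top, bottom = [], [], [], []
    for x, y, w, h in boxes:
        left.append(x)                    # target coordinate, width direction
        right.append(frame_w - (x + w))   # frame length minus target sum value
        top.append(y)                     # target coordinate, height direction
        bottom.append(frame_h - (y + h))
    # Border distance = smallest subtitle distance per direction;
    # proportion value = border distance / frame length in that direction.
    return {
        "left": min(left) / frame_w,
        "right": min(right) / frame_w,
        "up": min(top) / frame_h,
        "down": min(bottom) / frame_h,
    }
```

For a 1920x1080 frame with one subtitle box at (200, 900) of size 1520x100, the left and right proportion values are both 200/1920, while the bottom value is only 80/1080, reflecting how close the subtitle sits to the lower border.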
According to a second aspect of the embodiments of the present disclosure, there is provided a video scaling method applied to a client, including:
receiving subtitle proportion values, sent by a server, of subtitle information of a video to be processed in a plurality of preset directions;
judging whether the caption proportion values are all larger than or equal to a preset proportion threshold value;
if yes, performing scaling processing on the video frame of the video to be processed, and determining a first target video frame, wherein the lengths of the first target video frame in multiple preset directions are the same as the lengths of the screen of the client in multiple preset directions.
Optionally, the caption proportion value is a ratio of a frame distance to a length of a video frame of the video to be processed in a preset direction, and the frame distance is a minimum value of distances between caption information included in the video frame of the video to be processed and the frame in the preset direction.
Optionally, the step of performing scaling processing on the video frame of the video to be processed to determine the first target video frame includes:
judging whether the aspect ratio of the video frame of the video to be processed is larger than that of the screen or not;
if so, scaling the video frame of the video to be processed in an equal ratio to obtain a first scaled video frame, wherein the width of the first scaled video frame is the same as that of the screen; clipping the first zoomed video frame in the height direction of the first zoomed video frame to obtain a first sub-target video frame, wherein the height of the first sub-target video frame is the same as the height of the screen;
if not, scaling the video frame of the video to be processed in an equal ratio to obtain a second scaled video frame, wherein the height of the second scaled video frame is the same as that of the screen; and clipping the second zoom video frame in the width direction of the second zoom video frame to obtain a second sub-target video frame, wherein the width of the second sub-target video frame is the same as the width of the screen.
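A minimal sketch of the equal-ratio scale-and-crop described above, written in the common "cover" formulation (scale by the larger of the two screen/frame ratios, then crop the overflow in a single direction). The names and the exact branch convention are assumptions for illustration, not the patent's implementation.

```python
def scale_to_fill(frame_w, frame_h, screen_w, screen_h):
    """Scale the frame proportionally so it covers the screen, then
    report how much must be cropped in each direction."""
    s = max(screen_w / frame_w, screen_h / frame_h)  # equal-ratio scale factor
    scaled_w, scaled_h = frame_w * s, frame_h * s
    # At most one of these is positive: the direction that gets cropped.
    crop_w = scaled_w - screen_w
    crop_h = scaled_h - screen_h
    return (scaled_w, scaled_h), (crop_w, crop_h)
```

For example, filling a 1080x1920 portrait screen with a 1920x1080 landscape frame matches the heights and crops the excess width, which is exactly the situation in which left/right subtitle proportion values matter.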
Optionally, the method further includes:
the preset direction comprises the width direction of the video frame of the video to be processed, if the caption proportion values smaller than the preset proportion threshold exist in the caption proportion values, the video frame of the video to be processed is scaled in an equal proportion mode, a second target video frame is determined, and the width of the second target video frame is the same as the width of the screen;
and the preset direction comprises the height direction of the video frame of the video to be processed, if the caption proportion values smaller than the preset proportion threshold exist in the caption proportion values, the video frame of the video to be processed is scaled in an equal ratio, a third target video frame is determined, and the height of the third target video frame is the same as the height of the screen.
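Taken together with the all-directions case above, the two optional branches amount to a three-way decision on the client. The sketch below is one illustrative reading of that logic; the threshold value and all names are assumptions, not values from the patent.

```python
def choose_target(ratios, threshold=0.05):
    """ratios: dict mapping preset directions ("left", "right", "up",
    "down") to subtitle proportion values received from the server."""
    if all(r >= threshold for r in ratios.values()):
        return "first"   # subtitles are safe: fill the screen and crop
    if ratios.get("left", 1) < threshold or ratios.get("right", 1) < threshold:
        return "second"  # match widths; splice comment frames in height
    return "third"       # match heights; splice comment frames in width
```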
Optionally, the step of scaling the video frame of the video to be processed in an equal ratio and determining a second target video frame includes:
scaling the video frame of the video to be processed in an equal ratio to obtain a third scaled video frame, wherein the width of the third scaled video frame is the same as that of the screen; splicing a first comment frame for the third zoomed video frame in the height direction of the third zoomed video frame to obtain a second target video frame, wherein the height of the second target video frame is the same as the height of the screen;
the step of scaling the video frame of the video to be processed in an equal ratio and determining a third target video frame includes:
scaling the video frame of the video to be processed in an equal ratio to obtain a fourth scaled video frame, wherein the height of the fourth scaled video frame is the same as that of the screen; and splicing a second comment frame for the fourth zoomed video frame in the width direction of the fourth zoomed video frame to obtain a third target video frame, wherein the width of the third target video frame is the same as the width of the screen.
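The splicing steps above can be sketched with the complementary "fit" formulation: scale by the smaller ratio so the whole frame stays visible, then fill the leftover screen area with comment frames. This is again an illustrative sketch under assumed names, not the patent's implementation.

```python
def scale_to_fit(frame_w, frame_h, screen_w, screen_h):
    """Scale the frame proportionally so it fits inside the screen;
    the leftover area (pad_w or pad_h) is where comment frames are
    spliced so the combined result still fills the screen."""
    s = min(screen_w / frame_w, screen_h / frame_h)
    scaled_w, scaled_h = frame_w * s, frame_h * s
    pad_w = screen_w - scaled_w   # width left for a spliced comment frame
    pad_h = screen_h - scaled_h   # height left for a spliced comment frame
    return (scaled_w, scaled_h), (pad_w, pad_h)
```

With a 1920x1080 frame on a 1080x1920 screen, the widths match and 1312.5 pixels of height remain for the comment frame, so no subtitle pixels are lost.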
According to a third aspect of the embodiments of the present disclosure, there is provided a video scaling apparatus, applied to a server, including:
the acquisition module is configured to execute acquisition of a video to be processed, and a video frame of the video to be processed comprises subtitle information;
the calculation module is configured to calculate the caption proportion values of the caption information of the video to be processed in a plurality of preset directions;
the sending module is configured to send the plurality of subtitle proportion values to a client, so that when the plurality of subtitle proportion values are all larger than or equal to a preset proportion threshold value, the client performs scaling processing on a video frame of the video to be processed, and determines a target video frame, wherein the length of the target video frame in the preset direction is the same as the length of a screen of the client in the preset direction.
Optionally, the computing module is configured to specifically execute:
determining distances between caption information included in each video frame of the video to be processed and borders in a plurality of preset directions respectively to obtain a plurality of caption distances;
selecting the smallest caption distance from a plurality of caption distances corresponding to the frame in each preset direction as the frame distance corresponding to the frame in the preset direction;
and calculating the ratio of each frame distance to the length of the video frame of the video to be processed in the corresponding preset direction respectively to obtain a caption proportion value in each preset direction.
Optionally, each video frame of the video to be processed includes a plurality of subtitle information, and the computing module is configured to specifically execute:
and determining the distance between each subtitle information and a frame in a plurality of preset directions according to the position coordinates and the length of the text frame in which the subtitle information is positioned in the plurality of preset directions, so as to obtain a plurality of subtitle distances.
Optionally, the original point of the video frame of the video to be processed is a pixel point at the top left corner of the video frame of the video to be processed, the position coordinate of the text box where each subtitle information is located in the preset direction includes a target coordinate of the pixel point at the top left corner of the text box where the subtitle information is located in the preset direction, and the length of the text box where each subtitle information is located in the preset direction includes a target length of the text box where the subtitle information is located in the preset direction;
the computing module is configured to specifically perform:
determining the target coordinates of the text box in which the subtitle information is located in each preset direction as the subtitle distance between the subtitle information and the frame in the preset direction; or
determining the difference between the length of the video frame of the video to be processed in each preset direction and a target sum value as the subtitle distance between the subtitle information and the border in that preset direction, wherein the target sum value is the sum of the target coordinate and the target length of the text box in which the subtitle information is located in that preset direction.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a video scaling apparatus applied to a client, including:
the receiving module is configured to receive subtitle proportion values, sent by the server, of subtitle information of a video to be processed in a plurality of preset directions;
the judging module is configured to judge whether the plurality of subtitle proportion values are all larger than or equal to a preset proportion threshold value;
and the zooming module is configured to zoom the video frame of the video to be processed and determine a first target video frame under the condition that the judgment result of the judging module is yes, wherein the lengths of the first target video frame in a plurality of preset directions are the same as the lengths of the screen of the client in the plurality of preset directions.
Optionally, the caption proportion value is a ratio of a frame distance to a length of a video frame of the video to be processed in a preset direction, and the frame distance is a minimum value of distances between caption information included in the video frame of the video to be processed and the frame in the preset direction.
Optionally, the scaling module is configured to specifically perform:
judging whether the aspect ratio of the video frame of the video to be processed is larger than that of the screen or not;
if so, scaling the video frame of the video to be processed in an equal ratio to obtain a first scaled video frame, wherein the width of the first scaled video frame is the same as that of the screen; clipping the first zoomed video frame in the height direction of the first zoomed video frame to obtain a first sub-target video frame, wherein the height of the first sub-target video frame is the same as the height of the screen;
if not, scaling the video frame of the video to be processed in an equal ratio to obtain a second scaled video frame, wherein the height of the second scaled video frame is the same as that of the screen; and clipping the second scaled video frame in the width direction of the second scaled video frame to obtain a second sub-target video frame, wherein the width of the second sub-target video frame is the same as the width of the screen.
Optionally, the scaling module is configured to further perform:
the preset direction comprises the width direction of the video frame of the video to be processed, if the caption proportion values smaller than the preset proportion threshold exist in the caption proportion values, the video frame of the video to be processed is scaled in an equal proportion mode, a second target video frame is determined, and the width of the second target video frame is the same as the width of the screen;
and the preset direction comprises the height direction of the video frame of the video to be processed, if the caption proportion values smaller than the preset proportion threshold exist in the caption proportion values, the video frame of the video to be processed is scaled in an equal ratio, a third target video frame is determined, and the height of the third target video frame is the same as the height of the screen.
Optionally, the scaling module is configured to specifically perform:
the preset direction comprises the width direction of the video frame of the video to be processed, the video frame of the video to be processed is scaled in an equal ratio to obtain a third scaled video frame, and the width of the third scaled video frame is the same as that of the screen; splicing a first comment frame for the third zoomed video frame in the height direction of the third zoomed video frame to obtain a second target video frame, wherein the height of the second target video frame is the same as the height of the screen;
the preset direction comprises the height direction of the video frame of the video to be processed, and the video frame of the video to be processed is scaled in an equal ratio to obtain a fourth scaled video frame, wherein the height of the fourth scaled video frame is the same as the height of the screen; and splicing a second comment frame for the fourth zoomed video frame in the width direction of the fourth zoomed video frame to obtain a third target video frame, wherein the width of the third target video frame is the same as the width of the screen.
According to a fifth aspect of the embodiments of the present disclosure, there is provided a server, including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement any one of the video scaling methods applied to the server.
According to a sixth aspect of embodiments of the present disclosure, there is provided a client, including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement any of the video scaling methods applied to the client.
According to a seventh aspect of the embodiments of the present disclosure, there is provided a storage medium, where instructions, when executed by a processor of an electronic device, cause the electronic device to execute any one of the video scaling methods applied to a server.
According to an eighth aspect of the embodiments of the present disclosure, there is provided a storage medium, wherein instructions, when executed by a processor of an electronic device, cause the electronic device to execute any one of the video scaling methods applied to a client.
According to a ninth aspect of the embodiments of the present disclosure, there is provided a computer program product, which when executed by a processor of an electronic device, causes the electronic device to implement any one of the video scaling methods applied to a server.
According to a tenth aspect of the embodiments of the present disclosure, there is provided a computer program product, which when executed by a processor of an electronic device, causes the electronic device to implement any one of the video scaling methods applied to a client.
The technical scheme provided by the embodiment of the disclosure at least has the following beneficial effects:
in the embodiment of the disclosure, the server is provided with a preset direction, and the preset direction is a direction in which subtitle information may be cut off when a screen of the client is filled with video frames. And the server calculates the caption proportion values of the caption information of the video to be processed in a plurality of preset directions and sends the caption proportion values to the client. After the client acquires the multiple subtitle proportion values, if the multiple subtitle proportion values are determined to be larger than or equal to the preset proportion threshold value, the fact that the subtitle information is far away from the edge of the video frame in the preset direction can be determined, the video frame is filled in a screen of the client, when the video frame is cut, the subtitle information included in the video frame cannot be cut in the preset direction, the video frame of the video to be processed is zoomed, and the target video frame is determined. The length of the target video frame in the preset direction is the same as the length of the screen of the client in the preset direction. The method avoids cutting off the subtitle information in the preset direction, and solves the problem that when the video frame is filled in the screen of the equipment, the video frame including the subtitle information is completely or partially cut off, so that the video watching effect is poor.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
Fig. 1 a-1 c are schematic diagrams of a video frame shown in accordance with an example embodiment.
Fig. 2 is a schematic diagram of the video frame filling screen of fig. 1 a.
Fig. 3 is a block diagram illustrating a video scaling system in accordance with an exemplary embodiment.
Fig. 4 is a flow chart illustrating a video scaling method according to an exemplary embodiment.
Fig. 5 is a flow diagram illustrating another method of video scaling according to an example embodiment.
FIG. 6 is a flow diagram illustrating another method of video scaling according to an example embodiment.
Fig. 7 is a schematic diagram illustrating another video frame in accordance with an example embodiment.
Fig. 8 is a flow chart illustrating another video scaling method according to an example embodiment.
Fig. 9 is a flow chart illustrating another video scaling method according to an example embodiment.
Fig. 10 a-10 e are schematic diagrams illustrating a video frame filling screen according to an exemplary embodiment.
FIG. 11 is a flow chart illustrating another method of video scaling according to an example embodiment.
Fig. 12 a-12 d are schematic diagrams illustrating a video frame filling screen according to an exemplary embodiment.
Fig. 13 is a signaling diagram illustrating a video scaling according to an example embodiment.
Fig. 14 is a block diagram illustrating a video scaling apparatus according to an example embodiment.
Fig. 15 is a block diagram illustrating another video scaling apparatus according to an example embodiment.
Fig. 16 is a block diagram illustrating a server in accordance with an example embodiment.
Fig. 17 is a block diagram illustrating a client according to an example embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in other sequences than those illustrated or described herein. The implementations described in the exemplary embodiments below do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the disclosure, as detailed in the appended claims.
At present, in order to improve the viewing effect of a user, video frames are scaled proportionally, so that the scaled video frames fill up the screen of the device. When the video frame is filled in the screen of the equipment, the video frame needs to be cut, and the area of the cut video frame accounts for up to 17 percent of the total area of the video frame. Based on this, if the video frame includes the subtitle information, the subtitle information is likely to be cut out completely or partially, resulting in poor video viewing effect.
For example, as shown in fig. 1a, the video frame includes subtitle information a "A012345678901234567890123B". As shown in fig. 1b, the video frame includes subtitle information b "A0123456789B" and subtitle information c "C012D". As shown in fig. 1c, the video frame includes subtitle information d "eabcodefghif" and subtitle information e "S0T". Taking the video frame shown in fig. 1a as an example: the device scales the video frame equally so that it fills the screen of the device, as shown in fig. 2. After the video frame fills the screen, part of the video frame, such as the cropping area shown in fig. 2, must be cropped; at this point, part of character A and part of character B in subtitle information a are cropped, resulting in a poor video viewing effect.
In order to solve the problem that when a video frame fills the screen of a device, the video frame including subtitle information is completely or partially cut off, which results in poor video viewing effect, an embodiment of the present disclosure provides a video scaling system. As shown in fig. 3, the video scaling system includes a server 301 and a client 302. The client 302 is an electronic device having a screen, for example, the client 302 may be a mobile device such as a mobile phone and a tablet computer, or may be a device such as a PC (Personal computer).
Based on the video scaling system, an embodiment of the disclosure provides a video scaling method. Fig. 4 is a flowchart illustrating the video scaling method according to an exemplary embodiment. The method is applied to the server 301 shown in fig. 3 and includes the following steps.
In step S401, a video to be processed is obtained, and a video frame of the video to be processed includes subtitle information.
When a client plays a certain video, a video request aiming at the video is sent to a server. And after receiving the video request, the server acquires the video as a video to be processed. The video to be processed comprises a plurality of video frames.
In the embodiment of the present disclosure, at least one video frame of the video to be processed includes subtitle information, and one video frame may include one or more pieces of subtitle information. The video frame shown in fig. 1a includes one subtitle information, and the video frames shown in fig. 1b and 1c include a plurality of subtitle information. Each subtitle information includes one or more characters.
In step S402, caption scale values of caption information of a video to be processed in a plurality of preset directions are calculated.
The preset direction is a direction in which subtitle information may be cut off when the video frames fill the screen of the client. The preset direction can be set according to actual requirements. The plurality of preset directions may include left, right, up and/or down directions of video frames of the video to be processed. The left and right directions of the video frame of the video to be processed belong to the width direction of the video frame of the video to be processed. The upper direction and the lower direction of the video frame of the video to be processed belong to the height direction of the video frame of the video to be processed.
In one example, if the video frame includes one piece of subtitle information, the preset directions may be set based on the arrangement direction of the subtitle information, that is, the direction along which the characters of the subtitle information are arranged. For example, as shown in fig. 1a, the arrangement direction of subtitle information a is horizontal, that is, the width direction of the video frame. It may then be determined that the subtitle information may be cut off when the video frame fills the screen of the client along its width direction, and the preset directions may be set to the left and right directions of the video frame. For another example, if the video frame includes one piece of subtitle information whose arrangement direction is vertical, that is, the height direction of the video frame, it may be determined that the subtitle information may be cut off when the video frame fills the screen along its height direction, and the preset directions may be set to the up and down directions of the video frame.
In another example, if the video frame includes a plurality of pieces of subtitle information, and these include both subtitle information arranged along the height direction and subtitle information arranged along the width direction, it may be determined that subtitle information may be cut off when the video frame fills the screen of the client along either direction, and the preset directions may be set to include the four directions of left, right, up and down of the video frame. For example, as shown in fig. 1c, the arrangement direction of subtitle information d is the width direction of the video frame and the arrangement direction of subtitle information e is the height direction, so the preset directions may be set to include the four directions of left, right, up and down of the video frame.
If the arrangement directions of the plurality of pieces of subtitle information are all the width direction, it can be determined that the subtitle information may be cut off when the video frame fills the screen of the client along its width direction, and the preset directions can be set to include the left and right directions of the video frame. For example, as shown in fig. 1b, the arrangement directions of subtitle information b and subtitle information c are both the width direction of the video frame, so the preset directions may be set to include the left and right directions of the video frame.
If the arrangement directions of the plurality of pieces of subtitle information are all the height direction, it can be determined that the subtitle information may be cut off when the video frame fills the screen of the client along its height direction, and the preset directions can be set to include the up and down directions of the video frame.
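The direction-selection rules above can be sketched as follows. This is only an illustrative reading of the embodiment; the function and direction names are assumptions, not prescribed by the patent.

```python
# Illustrative sketch: derive the preset directions from the arrangement
# directions of the subtitles in a frame. A horizontally arranged subtitle
# can be cut off at the left/right borders; a vertically arranged one at
# the upper/lower borders.

def preset_directions(arrangements):
    """arrangements: one entry per subtitle, either "width" or "height"."""
    directions = set()
    for arrangement in arrangements:
        if arrangement == "width":
            directions.update({"left", "right"})
        else:  # arranged along the height direction
            directions.update({"up", "down"})
    return directions
```

With a single horizontal subtitle this yields `{"left", "right"}`; mixing horizontal and vertical subtitles yields all four directions, matching the fig. 1c example.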
The subtitle proportion value of the subtitle information in a preset direction can be determined as the ratio of the subtitle distance to the length of the video frame of the video to be processed in that preset direction. The subtitle distance may be the distance between the subtitle information and the border of the video frame in the preset direction.
In the embodiment of the disclosure, after the server side obtains the video to be processed, the server side calculates the caption proportion values of the caption information of the video to be processed in a plurality of preset directions.
In step S403, the multiple subtitle proportion values are sent to the client, so that when the multiple subtitle proportion values are all greater than or equal to the preset proportion threshold, the client performs scaling processing on the video frame of the video to be processed, and determines a target video frame, where lengths of the target video frame in multiple preset directions are the same as lengths of a screen of the client in multiple preset directions.
The preset proportion threshold is set according to actual requirements. If a subtitle proportion value is greater than or equal to the preset proportion threshold, it can be determined that the subtitle information is far enough from the edge of the video frame in the preset direction corresponding to that subtitle proportion value, and the subtitle information will not be cut off when the video frame fills the screen of the client. If a subtitle proportion value is smaller than the preset proportion threshold, it can be determined that the subtitle information is too close to the edge of the video frame in the corresponding preset direction, and the subtitle information may be cut off when the video frame fills the screen of the client. Optionally, the preset proportion threshold may be set to 6%, 7%, and so on.
After calculating the plurality of subtitle proportion values, the server sends them to the client in a format agreed upon with the client in advance. The client receives the plurality of subtitle proportion values. If they are all greater than or equal to the preset proportion threshold, the client determines that the subtitle information will not be cut off in the preset directions when the video frame fills its screen, performs scaling processing on the video frames of the video to be processed, and determines target video frames that fill the screen, the length of each target video frame in each preset direction being the same as the length of the screen in that direction. This avoids cutting off the subtitle information in the preset directions, and solves the problem that the subtitle information included in a video frame is completely or partially cut off when the video frame fills the screen of the device, resulting in a poor video viewing effect.
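The client-side decision described above reduces to a simple check; a minimal sketch under assumed names (the 6% threshold is only the example value mentioned earlier):

```python
# Minimal sketch of the client-side check: the screen may be filled only
# when every subtitle proportion value meets the preset threshold.

RATIO_THRESHOLD = 0.06  # e.g. 6%; set according to actual requirements

def should_fill_screen(subtitle_ratios, threshold=RATIO_THRESHOLD):
    """True when no subtitle would be cut off in any preset direction."""
    return all(r >= threshold for r in subtitle_ratios)
```

A single ratio below the threshold is enough to veto the scaling, since the corresponding subtitle sits too close to that border.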
Under the condition that each video frame of the video to be processed comprises a plurality of subtitle information, based on the embodiment of the video scaling method shown in fig. 4, the embodiment of the present disclosure further provides a video scaling method. Referring to fig. 5, fig. 5 is a flow chart illustrating another video scaling method according to an example embodiment. The method is applied to the server and can comprise the following steps.
In step S501, a video to be processed is acquired, and a video frame of the video to be processed includes subtitle information. Step S501 is the same as step S401.
In step S502, for each video frame of the video to be processed, distances between the caption information included in the video frame and borders in a plurality of preset directions are determined, so as to obtain a plurality of caption distances.
For example, suppose the video to be processed includes video frames F1, F2 and F3, and the preset directions include the left and right directions of the video frame.

For video frame F1, the server determines the distance between the subtitle information included in F1 and the border in the left direction (i.e., the left border), obtaining the subtitle distance d011, and determines the distance between that subtitle information and the border in the right direction (i.e., the right border), obtaining the subtitle distance d012.

For video frame F2, the server determines the distance between the subtitle information included in F2 and the left border, obtaining the subtitle distance d021, and determines the distance between that subtitle information and the right border, obtaining the subtitle distance d022.

For video frame F3, the server determines the distance between the subtitle information included in F3 and the left border, obtaining the subtitle distance d031, and determines the distance between that subtitle information and the right border, obtaining the subtitle distance d032.

At this point, the server has obtained a plurality of subtitle distances: d011, d021 and d031 corresponding to the left border, and d012, d022 and d032 corresponding to the right border.
In step S503, the minimum caption distance is selected from the plurality of caption distances corresponding to the borders in each preset direction, and the minimum caption distance is used as the border distance corresponding to the borders in the preset direction.
For the border in each preset direction, after obtaining the plurality of subtitle distances corresponding to that border, the server selects the minimum subtitle distance among them as the border distance corresponding to that border.
The example in step S502 is continued here. If d011 < d021 < d031 and d012 > d032 > d022, then the server selects d011 from {d011, d021, d031} as the border distance corresponding to the left border, and selects d022 from {d012, d022, d032} as the border distance corresponding to the right border.
In step S504, a ratio between each frame distance and a length of a video frame of the video to be processed in a corresponding preset direction is calculated to obtain a subtitle proportion value in each preset direction.
If the preset direction is the left direction and the right direction of the video frame of the video to be processed, the length of the video frame of the video to be processed in the preset direction is the width of the video frame of the video to be processed. If the preset direction is the upper direction and the lower direction of the video frame of the video to be processed, the length of the video frame of the video to be processed in the preset direction is the height of the video frame of the video to be processed. And aiming at each determined frame distance, the server calculates the ratio of the frame distance to the length of the video frame of the video to be processed in the corresponding preset direction to obtain a caption proportion value in each preset direction.
The example in step S503 is continued here. The server has obtained the border distance d011 corresponding to the left border and the border distance d022 corresponding to the right border. Suppose the width of a video frame of the video to be processed is D0. The server can then calculate the subtitle proportion value corresponding to d011, i.e., the subtitle proportion value in the left direction δ1 = d011 / D0, and the subtitle proportion value corresponding to d022, i.e., the subtitle proportion value in the right direction δ2 = d022 / D0.
In step S505, the multiple subtitle proportion values are sent to the client, so that when the multiple subtitle proportion values are all greater than or equal to the preset proportion threshold, the client performs scaling processing on the video frame of the video to be processed, and determines a target video frame, where lengths of the target video frame in multiple preset directions are the same as lengths of a screen of the client in multiple preset directions. Step S505 is the same as step S403.
In the embodiment of the present disclosure, for each preset direction, the server selects the minimum subtitle distance from the plurality of subtitle distances corresponding to the border in that direction and uses it as the border distance corresponding to that border; that is, one border distance is determined per preset direction. One subtitle proportion value per preset direction is then calculated based on the corresponding border distance and sent to the client. The client performs unified scaling processing on the video frames of the video to be processed according to the one subtitle proportion value in each preset direction, which further improves the viewing effect of the video.
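Steps S502 to S504 can be sketched as follows; this is only an illustrative reading, with assumed names, where the distances per border have already been gathered across all frames:

```python
# Sketch of steps S502-S504: for each border, take the minimum subtitle
# distance over all frames (the border distance, step S503), then divide
# by the frame length in that direction (step S504) to obtain one
# subtitle proportion value per preset direction.

def subtitle_ratios(distances_per_border, length_per_border):
    """distances_per_border: e.g. {"left": [d011, d021, d031], ...}
    length_per_border: frame width for left/right, height for up/down."""
    ratios = {}
    for border, distances in distances_per_border.items():
        border_distance = min(distances)                   # step S503
        ratios[border] = border_distance / length_per_border[border]  # S504
    return ratios
```

For a 400-pixel-wide frame with left distances of 30, 40 and 50 pixels across three frames, the left subtitle proportion value would be 30 / 400 = 0.075.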
In the case that each video frame of the video to be processed includes a plurality of subtitle information, based on the embodiment of the video scaling method shown in fig. 5, the embodiment of the present disclosure further provides a video scaling method. Referring to fig. 6, fig. 6 is a flow chart illustrating another video scaling method according to an example embodiment. The method is applied to the server and can comprise the following steps.
In step S601, a video to be processed is obtained, and a video frame of the video to be processed includes subtitle information. Step S601 is the same as step S501.
In step S602, for each subtitle information included in each video frame of the video to be processed, distances between the subtitle information and borders in multiple preset directions are determined according to position coordinates and lengths of a text box in which the subtitle information is located in the multiple preset directions, so as to obtain multiple subtitle distances.
Each video frame includes a plurality of subtitle information. And for each subtitle information, the server determines the distance between the subtitle information and the borders in the plurality of preset directions respectively according to the position coordinates and the lengths of the text box in which the subtitle information is located in the plurality of preset directions, so as to obtain a plurality of subtitle distances.
In the embodiment of the present disclosure, the caption distance between the caption information and the frame in the preset direction is: and in a preset direction, the minimum value of the distance between each pixel point included in the text box where the caption information is located and the frame.
In an optional embodiment, the origin of the video frame of the video to be processed is a pixel point at the upper left corner of the video frame of the video to be processed. The position coordinates of the text box where each subtitle information is located in the preset direction comprise target coordinates of pixel points at the upper left corner of the text box where the subtitle information is located in the preset direction. The length of each text box in which the subtitle information is located in the preset direction comprises the target length of the text box in which the subtitle information is located in the preset direction. The target coordinates of the pixel point at the upper left corner of the text box where the caption information is located in the preset direction are as follows: and the distance between the pixel point at the upper left corner of the text box where the caption information is located and the pixel point at the upper left corner of the video frame of the video to be processed in the preset direction.
For each piece of subtitle information included in each video frame, the server determines the target coordinate of the text box in which the subtitle information is located in each preset direction as the subtitle distance between the subtitle information and the border in that direction. Alternatively, for each piece of subtitle information, the server determines the difference between the length of the video frame in each preset direction and a target sum value as the subtitle distance between the subtitle information and the border in that direction, where the target sum value is the sum of the target coordinate and the target length of the text box in that preset direction.
The description will be given by taking the video frame shown in fig. 7 as an example. The video frame shown in fig. 7 includes subtitle information 1, subtitle information 2, and subtitle information 3. The text box in which the subtitle information 1 is located is a text box 1, the text box in which the subtitle information 2 is located is a text box 2, and the text box in which the subtitle information 3 is located is a text box 3. The preset directions include left and right directions of the video frame.
In fig. 7, the black dot p1 marks the position of text box 1: its coordinate in the left-right direction is x1. Likewise, the black dot p2 gives the coordinate x2 of text box 2, and the black dot p3 gives the coordinate x3 of text box 3. The lengths of text box 1, text box 2 and text box 3 in the left-right direction are w1, w2 and w3, respectively. The width of the video frame is Dw and its height is Dh.
The server determines the coordinate x1 of text box 1, in which subtitle information 1 is located, as the subtitle distance 11 between subtitle information 1 and the left border 11; determines the coordinate x2 of text box 2 as the subtitle distance 12 between subtitle information 2 and the left border 11; and determines the coordinate x3 of text box 3 as the subtitle distance 13 between subtitle information 3 and the left border 11. At this point, the server has obtained a plurality of subtitle distances corresponding to the border 11.
In addition, the server calculates the sum (x1 + w1) of the coordinate x1 of text box 1 and its length w1 in the left-right direction, the sum (x2 + w2) for text box 2, and the sum (x3 + w3) for text box 3. The server then determines the subtitle distance 21 between subtitle information 1 and the right border 12 as Dw − (x1 + w1), the subtitle distance 22 between subtitle information 2 and the right border 12 as Dw − (x2 + w2), and the subtitle distance 23 between subtitle information 3 and the right border 12 as Dw − (x3 + w3). At this point, the server has obtained a plurality of subtitle distances corresponding to the border 12.
As another example, suppose the preset directions include the four directions of left, right, up and down of the video frame. In fig. 7, the black dot p1 gives the coordinate x1 of text box 1 in the left-right direction and its coordinate y1 in the up-down direction; the black dot p2 gives x2 and y2 for text box 2; and the black dot p3 gives x3 and y3 for text box 3. The lengths of text boxes 1, 2 and 3 in the left-right direction are w1, w2 and w3, and their lengths in the up-down direction are h1, h2 and h3, respectively.
After obtaining the subtitle distances corresponding to the border 11 (see above), the server may further determine the coordinate y1 of text box 1 as the subtitle distance 31 between subtitle information 1 and the upper border 13, the coordinate y2 of text box 2 as the subtitle distance 32 between subtitle information 2 and the upper border 13, and the coordinate y3 of text box 3 as the subtitle distance 33 between subtitle information 3 and the upper border 13. At this point, the server has obtained a plurality of subtitle distances corresponding to the border 13.
In addition, the server calculates the sum (y1 + h1) of the coordinate y1 of text box 1 and its length h1 in the up-down direction, the sum (y2 + h2) for text box 2, and the sum (y3 + h3) for text box 3. The server then determines the subtitle distance 41 between subtitle information 1 and the lower border 14 as Dh − (y1 + h1), the subtitle distance 42 between subtitle information 2 and the lower border 14 as Dh − (y2 + h2), and the subtitle distance 43 between subtitle information 3 and the lower border 14 as Dh − (y3 + h3). At this point, the server has obtained a plurality of subtitle distances corresponding to the border 14.
In this embodiment of the present disclosure, the origin of the video frame of the video to be processed may also be the pixel point at the upper right corner, lower left corner, or lower right corner of the video frame; this is not specifically limited, as long as the subtitle distance between the subtitle information and the border in a preset direction remains the minimum, in that direction, of the distances between the border and each pixel point included in the text box where the subtitle information is located. In the embodiment of the disclosure, the server may identify the subtitle information by using a preset recognition algorithm, and thereby determine the position coordinates and lengths of the text box where each piece of subtitle information is located in the plurality of preset directions. The preset recognition algorithm may include, but is not limited to, an OCR (Optical Character Recognition) algorithm, a convolutional neural network model, and the like.
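Under the top-left-origin convention described above, the four subtitle distances for one text box can be sketched as follows (names are illustrative; the box is assumed to come from the recognition algorithm as position and size in pixels):

```python
# Sketch of step S602 with the origin at the top-left pixel: for a text
# box at (x, y) with width w and height h, inside a frame of size
# (frame_w, frame_h), compute the distance to each of the four borders.

def box_border_distances(x, y, w, h, frame_w, frame_h):
    return {
        "left": x,                    # target coordinate in the width direction
        "right": frame_w - (x + w),   # frame length minus the target sum value
        "up": y,                      # target coordinate in the height direction
        "down": frame_h - (y + h),    # frame length minus the target sum value
    }
```

For example, a 200×30 box at (100, 50) inside a 640×360 frame lies 100 pixels from the left border, 340 from the right, 50 from the top and 280 from the bottom.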
In step S603, a minimum caption distance is selected from a plurality of caption distances corresponding to the borders in each preset direction, and the minimum caption distance is used as the border distance corresponding to the borders in the preset direction. Step S603 is the same as step S503.
The example in step S602 is continued here. The server has obtained the subtitle distances corresponding to the left border 11: {x1, x2, x3}; the subtitle distances corresponding to the right border 12: {Dw − (x1 + w1), Dw − (x2 + w2), Dw − (x3 + w3)}; the subtitle distances corresponding to the upper border 13: {y1, y2, y3}; and the subtitle distances corresponding to the lower border 14: {Dh − (y1 + h1), Dh − (y2 + h2), Dh − (y3 + h3)}.

Suppose x1 < x2 < x3; the server selects x1 from {x1, x2, x3} as the border distance corresponding to the left border 11. Suppose Dw − (x3 + w3) < Dw − (x1 + w1) < Dw − (x2 + w2); the server selects Dw − (x3 + w3) as the border distance corresponding to the right border 12. Suppose y3 < y2 < y1; the server selects y3 from {y1, y2, y3} as the border distance corresponding to the upper border 13. Suppose Dh − (y1 + h1) < Dh − (y2 + h2) < Dh − (y3 + h3); the server selects Dh − (y1 + h1) as the border distance corresponding to the lower border 14.
In step S604, a ratio between each frame distance and a length of a video frame of the video to be processed in a corresponding preset direction is calculated, so as to obtain a subtitle proportion value in each preset direction. Step S604 is the same as step S504.
In step S605, the multiple subtitle proportion values are sent to the client, so that when the multiple subtitle proportion values are all greater than or equal to the preset proportion threshold, the client performs scaling processing on the video frame of the video to be processed, and determines a target video frame, where lengths of the target video frame in multiple preset directions are the same as lengths of a screen of the client in multiple preset directions. Step S605 is the same as step S505.
By applying the embodiment shown in fig. 6, the distance between the subtitle information and the border of the video frame in each preset direction is accurately determined using the position coordinates and lengths of the text boxes in which the subtitle information is located, which enables the client to accurately fill the screen with the video frames of the video to be processed.
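The whole fig. 6 computation for one frame can be sketched end-to-end; this is an illustrative composition of the steps above, with assumed names, taking OCR boxes as (x, y, w, h) tuples:

```python
# End-to-end sketch of fig. 6: text boxes -> per-border subtitle distances
# -> minimum per border (border distance) -> one subtitle proportion value
# per preset direction.

def frame_ratios(boxes, frame_w, frame_h):
    per_border = {"left": [], "right": [], "up": [], "down": []}
    for x, y, w, h in boxes:
        per_border["left"].append(x)
        per_border["right"].append(frame_w - (x + w))
        per_border["up"].append(y)
        per_border["down"].append(frame_h - (y + h))
    # left/right ratios use the frame width; up/down ratios use the height
    length = {"left": frame_w, "right": frame_w, "up": frame_h, "down": frame_h}
    return {b: min(ds) / length[b] for b, ds in per_border.items()}
```

The client would then compare each of the four returned values against the preset proportion threshold before filling the screen.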
Based on the embodiments of the video scaling method shown in fig. 4-6, the embodiments of the present disclosure further provide a video scaling method. Referring to fig. 8, fig. 8 is a flow chart illustrating another video scaling method according to an example embodiment. The method is applied to the client and comprises the following steps.
In step S801, a plurality of subtitle proportion values of the subtitle information of the video to be processed in a plurality of preset directions, sent by the server, are received.
In one example, the caption proportion value may be a ratio of a frame distance to a length of a video frame of the video to be processed in a preset direction, where the frame distance is a minimum value of distances between caption information included in the video frame of the video to be processed and the frame in the preset direction.
In the embodiment of the present disclosure, the preset direction is a direction in which subtitle information may be cut off when the video frame fills the screen of the client. The preset directions can be set according to actual requirements and may include the left, right, up and/or down directions of the video frames of the video to be processed. The process by which the server calculates the plurality of subtitle proportion values may refer to the descriptions of fig. 4 to fig. 6 and is not repeated here.
In step S802, it is determined whether each of the caption scale values is greater than or equal to a preset scale threshold, and if yes, step S803 is executed.
The preset proportion threshold is set according to actual requirements. If a subtitle proportion value is greater than or equal to the preset proportion threshold, it can be determined that the subtitle information is far enough from the edge of the video frame in the preset direction corresponding to that subtitle proportion value, and the subtitle information will not be cut off when the video frame fills the screen of the client. If a subtitle proportion value is smaller than the preset proportion threshold, it can be determined that the subtitle information is too close to the edge of the video frame in the corresponding preset direction, and the subtitle information may be cut off when the video frame fills the screen of the client.
After receiving the plurality of subtitle proportion values, the client judges whether subtitle information would be cut off in any preset direction when the video frame fills its screen, that is, whether the plurality of subtitle proportion values are all greater than or equal to the preset proportion threshold.
In step S803, a video frame of the video to be processed is scaled to determine a first target video frame, where lengths of the first target video frame in multiple preset directions are the same as lengths of a screen of the client in multiple preset directions.
If the plurality of subtitle proportion values are all greater than or equal to the preset proportion threshold, the client determines that the subtitle information will not be cut off in the preset directions when the video frame fills its screen, performs scaling processing on the video frames of the video to be processed, and determines target video frames that fill the screen, the length of each target video frame in each preset direction being the same as the length of the screen in that direction. This avoids cutting off the subtitle information in the preset directions, and solves the problem that the subtitle information included in a video frame is completely or partially cut off when the video frame fills the screen of the device, resulting in a poor video viewing effect.
Here, the client performs scaling processing on each video frame of the video to be processed to obtain a first target video frame corresponding to each video frame. The target video is composed of these first target video frames.
Based on the embodiment of the video scaling method shown in fig. 8, the embodiment of the present disclosure further provides a video scaling method. Referring to fig. 9, fig. 9 is a flow chart illustrating another video scaling method according to an example embodiment. The method is applied to the client and can comprise the following steps.
In step S901, a plurality of subtitle proportion values of the subtitle information of the video to be processed in a plurality of preset directions, sent by the server, are received. In one example, a subtitle proportion value may be the ratio of a border distance to the length of a video frame of the video to be processed in the preset direction, where the border distance is the minimum of the distances between the subtitle information included in the video frames of the video to be processed and the border in that preset direction. Step S901 is the same as step S801.
In step S902, it is determined whether each of the caption scale values is greater than or equal to a preset scale threshold, and if so, step S903 is executed. Step S902 is the same as step S802.
In step S903, it is determined whether the aspect ratio of the video frame of the video to be processed is larger than the aspect ratio of the screen. If yes, go to step S904. If not, go to step S905.
Here, the aspect ratio is the ratio of height to width, so a larger aspect ratio corresponds to a relatively taller frame.
When the subtitle proportion values are all greater than or equal to the preset proportion threshold, the client can determine that the subtitle information will not be cut off in the preset directions when the video frame fills the screen of the client. To fill the screen while cutting away as little of the video frame as possible, the client judges whether the aspect ratio of the video frame of the video to be processed is greater than the aspect ratio of the screen.
In step S904, the video frame of the video to be processed is scaled proportionally to obtain a first scaled video frame whose width is the same as the width of the screen; the first scaled video frame is then cropped in its height direction to obtain a first sub-target video frame whose height is the same as the height of the screen.
When the aspect ratio of the video frame of the video to be processed is greater than the aspect ratio of the screen, the client scales the video frame proportionally to obtain a first scaled video frame whose width is the same as the width of the screen. At this time, the height of the first scaled video frame is greater than the height of the screen. The client crops away the parts of the first scaled video frame that extend beyond the screen above and below, that is, it crops the first scaled video frame in its height direction to obtain a first sub-target video frame whose height is the same as the height of the screen and whose width is the same as the width of the screen.
For example, take the video frame a shown in fig. 10a and the screen M1 of the client shown in fig. 10b, where the aspect ratio of the video frame a is greater than that of the screen M1. When the subtitle proportion values are all greater than or equal to the preset proportion threshold, the client scales the video frame a proportionally to obtain a scaled video frame a0 (that is, a first scaled video frame) whose width is the same as that of the screen M1. At this time, the height of the scaled video frame a0 is greater than the height of the screen M1. The client crops the portion of the scaled video frame a0 that extends beyond the screen M1 in the height direction (the cropped region is shaded in fig. 10c), obtaining a sub-target video frame a1 (that is, a first sub-target video frame) whose height is the same as the height of the screen M1.
In step S905, the video frame of the video to be processed is scaled proportionally to obtain a second scaled video frame whose height is the same as the height of the screen; the second scaled video frame is then cropped in its width direction to obtain a second sub-target video frame whose width is the same as the width of the screen.
When the aspect ratio of the video frame of the video to be processed is less than or equal to the aspect ratio of the screen, the client scales the video frame proportionally to obtain a second scaled video frame whose height is the same as the height of the screen. At this time, the width of the second scaled video frame is greater than or equal to the width of the screen. The client crops away the parts of the second scaled video frame that extend beyond the screen on the left and right, that is, it crops the second scaled video frame in its width direction to obtain a second sub-target video frame whose width is the same as the width of the screen and whose height is the same as the height of the screen.
For example, take the video frame a shown in fig. 10a and the screen M2 of the client shown in fig. 10d, where the aspect ratio of the video frame a is less than or equal to that of the screen M2. When the subtitle proportion values are all greater than or equal to the preset proportion threshold, the client scales the video frame a proportionally to obtain a scaled video frame a2 (that is, a second scaled video frame) whose height is the same as the height of the screen M2. At this time, the width of the scaled video frame a2 is greater than or equal to the width of the screen M2. The client crops the portion of the scaled video frame a2 that extends beyond the screen M2 in the width direction (the cropped region is shaded in fig. 10e), obtaining a sub-target video frame a3 (that is, a second sub-target video frame) whose width is the same as the width of the screen M2.
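The fill-and-crop logic of steps S903 to S905 can be sketched as follows. This is an illustrative sketch, not the patent's implementation; the function name, the return convention, and the use of integer pixel sizes are assumptions. The aspect ratio is taken as height over width, matching the definition in this embodiment.

```python
def fill_screen_crop(frame_w, frame_h, screen_w, screen_h):
    """Scale a frame so it fills the screen, then crop the overflow.

    Returns (scaled_w, scaled_h, crop_w, crop_h): the size of the
    proportionally scaled frame and the size after cropping the parts
    that extend beyond the screen. Aspect ratio = height / width.
    """
    if frame_h / frame_w > screen_h / screen_w:
        # Frame is relatively taller than the screen: match widths and
        # crop the excess height (step S904).
        scale = screen_w / frame_w
    else:
        # Frame is relatively wider (or equal): match heights and crop
        # the excess width (step S905).
        scale = screen_h / frame_h
    scaled_w = round(frame_w * scale)
    scaled_h = round(frame_h * scale)
    # Cropping keeps at most the screen size in each direction.
    return scaled_w, scaled_h, min(scaled_w, screen_w), min(scaled_h, screen_h)
```

For instance, a 1920x1080 landscape frame shown on a 1080x2340 portrait screen is scaled to match heights and then cropped on the left and right, so the whole screen is covered without letterboxing.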
Applying the embodiment shown in fig. 9, the client can fill its screen with the video frames while cutting away as little of the video frame area as possible.
Based on the embodiment of the video scaling method shown in fig. 8, the embodiment of the present disclosure further provides a video scaling method. Referring to fig. 11, fig. 11 is a flowchart illustrating another video scaling method according to an example embodiment. The method is applied to the client and can comprise the following steps.
In step S1101, a plurality of subtitle proportion values, in a plurality of preset directions, of the subtitle information of the video to be processed are received from the service end. In one example, a subtitle proportion value may be the ratio of a frame distance to the length of a video frame of the video to be processed in the corresponding preset direction, where the frame distance is the minimum of the distances, in that preset direction, between the subtitle information included in the video frames of the video to be processed and the border. Step S1101 is the same as step S801.
In step S1102, it is determined whether each of the plurality of subtitle scale values is equal to or greater than a preset scale threshold. If yes, go to step S1103. If not, step S1104 is executed. Step S1102 is the same as step S802.
In step S1103, a video frame of the video to be processed is scaled, and a first target video frame is determined, where lengths of the first target video frame in multiple preset directions are the same as lengths of a screen of the client in multiple preset directions. Step S1103 is the same as step S803.
In step S1104, video frames of the video to be processed are scaled proportionally, and a target video frame is determined, wherein the lengths of the target video frame in the preset directions are the same as the lengths of the screen in the preset directions.
In the embodiment of the present disclosure, if the preset directions include the width direction of a video frame of the video to be processed, that is, the left and right directions, the length in the preset direction is the width. If the preset directions include the height direction of a video frame of the video to be processed, that is, the up and down directions, the length in the preset direction is the height.
In the case where the preset directions include the width direction of the video frame of the video to be processed, if the subtitle proportion values are not all greater than or equal to the preset proportion threshold, that is, at least one of the subtitle proportion values is smaller than the preset proportion threshold, the client can determine that subtitle information may be cut off in the width direction when the video frame fills the screen of the client. The client therefore scales the video frame of the video to be processed proportionally and determines a second target video frame whose width is the same as the width of the screen; that is, the proportionally scaled video frame fills the screen of the client in the width direction.
In an optional embodiment, in order to improve the viewing effect of the video, in the case where the preset directions include the width direction of the video frame of the video to be processed, if the subtitle proportion values are not all greater than or equal to the preset proportion threshold, the client scales the video frame proportionally to obtain a third scaled video frame whose width is the same as the width of the screen. At this time, the third scaled video frame fills the screen in the width direction. If the third scaled video frame does not fill the screen in the height direction, the client stitches a first comment box to the third scaled video frame in its height direction to obtain a second target video frame whose height is the same as the height of the screen. In the height direction, the first comment box may be stitched above or below the third scaled video frame. The width of the first comment box is the same as the width of the screen, and the user may enter comment information in it.
For example, take the video frame a shown in fig. 10a and the screen M3 of the client shown in fig. 12a. In the case where the preset directions include the width direction of the video frame of the video to be processed, when the subtitle proportion values are not all greater than or equal to the preset proportion threshold, the client scales the video frame a proportionally to obtain a scaled video frame c0 (that is, a third scaled video frame) whose width is the same as the width of the screen M3. At this time, the height of the scaled video frame c0 is less than the height of the screen M3. In the height direction, the client stitches comment box 1 (that is, the first comment box) to the scaled video frame c0, obtaining a target video frame c1 (that is, the second target video frame), as shown in fig. 12b. At this time, the target video frame c1 fills the entire screen M3.
If the height of the third scaled video frame is greater than the height of the screen, the client can crop the third scaled video frame so that the height of the resulting video frame equals the height of the screen.
In the case where the preset directions include the height direction of the video frame of the video to be processed, if the subtitle proportion values are not all greater than or equal to the preset proportion threshold, that is, at least one of the subtitle proportion values is smaller than the preset proportion threshold, the client can determine that subtitle information may be cut off in the height direction when the video frame fills the screen of the client. The client therefore scales the video frame of the video to be processed proportionally and determines a third target video frame whose height is the same as the height of the screen; that is, the proportionally scaled video frame fills the screen of the client in the height direction.
In another optional embodiment, in order to improve the viewing effect of the video, in the case where the preset directions include the height direction of the video frame of the video to be processed, if the subtitle proportion values are not all greater than or equal to the preset proportion threshold, the client scales the video frame proportionally to obtain a fourth scaled video frame whose height is the same as the height of the screen. At this time, the fourth scaled video frame fills the screen in the height direction. If the fourth scaled video frame does not fill the screen in the width direction, the client stitches a second comment box to the fourth scaled video frame in its width direction to obtain a third target video frame whose width is the same as the width of the screen. In the width direction, the second comment box may be stitched to the left or to the right of the fourth scaled video frame. The height of the second comment box is the same as the height of the screen, and the user may enter comment information in it.
For example, take the video frame a shown in fig. 10a and the screen M4 of the client shown in fig. 12c. In the case where the preset directions include the height direction of the video frame of the video to be processed, when the subtitle proportion values are not all greater than or equal to the preset proportion threshold, the client scales the video frame a proportionally to obtain a scaled video frame d0 (that is, a fourth scaled video frame) whose height is the same as the height of the screen M4. At this time, the width of the scaled video frame d0 is less than the width of the screen M4. In the width direction, the client stitches comment box 2 (that is, the second comment box) to the scaled video frame d0, obtaining a target video frame d1 (that is, the third target video frame), as shown in fig. 12d. At this time, the target video frame d1 fills the entire screen M4.
If the width of the fourth scaled video frame is greater than the width of the screen, the client can crop the fourth scaled video frame so that the width of the resulting video frame equals the width of the screen.
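The fit-and-stitch alternative above can be sketched in the same style. This is an illustrative sketch under assumed names; it only computes the geometry — the size of the proportionally scaled frame and of the leftover strip where the comment box would be stitched — not the actual pixel stitching.

```python
def fit_and_pad(frame_w, frame_h, screen_w, screen_h, pad_along):
    """Scale a frame to exactly fill one screen axis, then report the
    leftover strip for the comment box.

    pad_along='height': match widths (third scaled video frame, first
    comment box stitched above or below). pad_along='width': match
    heights (fourth scaled video frame, second comment box stitched to
    the left or right). Returns (scaled_w, scaled_h, box_w, box_h).
    """
    if pad_along == "height":
        scale = screen_w / frame_w            # fill the width direction
        scaled_w, scaled_h = screen_w, round(frame_h * scale)
        # Comment box spans the full screen width and the leftover height.
        return scaled_w, scaled_h, screen_w, max(screen_h - scaled_h, 0)
    scale = screen_h / frame_h                # fill the height direction
    scaled_w, scaled_h = round(frame_w * scale), screen_h
    # Comment box spans the full screen height and the leftover width.
    return scaled_w, scaled_h, max(screen_w - scaled_w, 0), screen_h
```

For a 1280x720 frame on a 720x1280 portrait screen, matching widths leaves a 720x875 strip below the video where the first comment box can be stitched, so no frame content is cropped at all.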
The following describes the video scaling method provided by the embodiment of the present disclosure in detail with reference to the video scaling system shown in fig. 3 and the signaling diagram shown in fig. 13.
In step S1, the server 301 obtains a video to be processed, where a video frame of the video to be processed includes subtitle information.
Step S2, for each video frame of the video to be processed, the server 301 determines distances between the caption information included in the video frame and the borders in the multiple preset directions, respectively, to obtain multiple caption distances; selecting the smallest caption distance from a plurality of caption distances corresponding to the frame in each preset direction as the frame distance corresponding to the frame in the preset direction; and calculating the ratio of the distance of each frame to the length of the video frame of the video to be processed in the corresponding preset direction to obtain a plurality of caption proportional values.
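The min-then-ratio computation of step S2 can be sketched as follows, assuming the per-frame subtitle distances have already been measured. The function name and the dictionary-based inputs are illustrative assumptions, not the patent's implementation.

```python
def caption_proportion_values(per_frame_distances, frame_lengths):
    """Compute one subtitle proportion value per preset direction.

    per_frame_distances: {direction: [subtitle distance in that direction
    for each video frame of the video to be processed]}.
    frame_lengths: {direction: frame length along that direction}, i.e.
    the width for left/right and the height for up/down.

    For each preset direction, the frame distance is the minimum subtitle
    distance over all video frames, and the proportion value is its ratio
    to the frame length in that direction.
    """
    return {
        direction: min(distances) / frame_lengths[direction]
        for direction, distances in per_frame_distances.items()
    }
```

The client's threshold check in step S4 is then simply whether every returned value is at least the preset proportion threshold, e.g. `all(v >= threshold for v in values.values())`.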
In step S3, the server 301 sends the plurality of caption scale values to the client 302.
In step S4, the client 302 determines whether each of the plurality of subtitle scale values is greater than or equal to a preset scale threshold. If yes, go to step S5. If not, step S8 is executed if the preset direction includes the width direction of the video frame of the video to be processed, and step S10 is executed if the preset direction includes the height direction of the video frame of the video to be processed.
In step S5, the client 302 determines whether the aspect ratio of the video frame of the video to be processed is greater than the aspect ratio of the screen. If yes, go to step S6. If not, step S7 is executed.
Step S6, the client 302 scales the video frame of the video to be processed in equal proportion to obtain a first scaled video frame, and the width of the first scaled video frame is the same as that of the screen; and cutting the first zoom video frame in the height direction of the first zoom video frame to obtain a first sub-target video frame, wherein the height of the first sub-target video frame is the same as the height of the screen.
Step S7, the client 302 zooms the video frame of the video to be processed in equal proportion to obtain a second zoomed video frame, and the height of the second zoomed video frame is the same as the height of the screen; and cutting the second zoom video frame in the width direction of the second zoom video frame to obtain a second sub-target video frame, wherein the width of the second sub-target video frame is the same as the width of the screen.
In step S8, the client 302 scales the video frames of the video to be processed in an equal ratio to obtain a third scaled video frame, where the width of the third scaled video frame is the same as the width of the screen.
In step S9, the client 302 splices the first comment frame for the third zoomed video frame in the height direction of the third zoomed video frame to obtain a second target video frame, where the height of the second target video frame is the same as the height of the screen, and the width of the second target video frame is the same as the width of the screen.
In step S10, the client 302 scales the video frames of the video to be processed in an equal ratio to obtain a fourth scaled video frame, where the height of the fourth scaled video frame is the same as the height of the screen.
In step S11, the client 302 stitches a second comment box to the fourth scaled video frame in the width direction of the fourth scaled video frame to obtain a third target video frame, where the width of the third target video frame is the same as the width of the screen, and the height of the third target video frame is the same as the height of the screen.
The description of the above steps S1-S11 is relatively brief; for details, refer to the description of figs. 3-11.
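The client-side branching of steps S4 to S11 can be summarized in one dispatcher. This is a sketch under assumed names and labels; the aspect ratio is height over width as defined in this embodiment, and `preset_axis` states whether the preset directions are left/right ('width') or up/down ('height').

```python
def choose_strategy(proportion_values, threshold, preset_axis,
                    frame_w, frame_h, screen_w, screen_h):
    """Select the client-side branch of steps S4-S11 (illustrative).

    Returns a short label naming the branch the client would take.
    """
    if all(v >= threshold for v in proportion_values):
        # No subtitle is near a border: fill the screen and crop (S5-S7).
        if frame_h / frame_w > screen_h / screen_w:
            return "fill_width_crop_height"   # step S6
        return "fill_height_crop_width"       # step S7
    # Some subtitle is too close to a border along the preset axis: fit
    # that axis exactly and stitch a comment box into the leftover strip.
    if preset_axis == "width":
        return "fit_width_pad_height"         # steps S8-S9
    return "fit_height_pad_width"             # steps S10-S11
```

The labels map one-to-one onto the signaling diagram: the first two correspond to the crop branches of fig. 13, the last two to the comment-box branches.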
Based on the embodiments of the video scaling method shown in fig. 4 to 13, the embodiments of the present disclosure provide a video scaling apparatus. Referring to fig. 14, fig. 14 is a block diagram of a video scaling apparatus according to an exemplary embodiment, applied to a server, the apparatus including: an acquisition module 1401, a calculation module 1402, and a transmission module 1403.
An obtaining module 1401 configured to perform obtaining a video to be processed, where a video frame of the video to be processed includes subtitle information.
The calculating module 1402 is configured to perform calculating caption scale values of the caption information of the video to be processed in a plurality of preset directions.
A sending module 1403 configured to execute sending the multiple subtitle scale values to the client, so that when the multiple subtitle scale values are all greater than or equal to a preset scale threshold, the client performs scaling processing on a video frame of the video to be processed, and determines a target video frame, where lengths of the target video frame in multiple preset directions are the same as lengths of a screen of the client in multiple preset directions.
In an alternative embodiment, the computing module 1402 is configured to specifically perform:
determining distances between caption information included in each video frame of the video to be processed and borders in a plurality of preset directions respectively to obtain a plurality of caption distances;
selecting the smallest caption distance from a plurality of caption distances corresponding to the frame in each preset direction as the frame distance corresponding to the frame in the preset direction;
and calculating the ratio of each frame distance to the length of the video frame of the video to be processed in the corresponding preset direction respectively to obtain a caption proportion value in each preset direction.
In an alternative embodiment, each video frame of the video to be processed includes a plurality of subtitle information, and the calculation module 1402 is configured to specifically perform:
and, for each piece of subtitle information of each video frame of the video to be processed, determining the distances between that subtitle information and the border in the plurality of preset directions according to the position coordinates and lengths, in the preset directions, of the text box in which the subtitle information is located, to obtain a plurality of subtitle distances.
In an optional embodiment, the origin of the video frame of the video to be processed is a pixel point at the top left corner of the video frame of the video to be processed, the position coordinate of each text box where the subtitle information is located in the preset direction includes a target coordinate of the pixel point at the top left corner of the text box where the subtitle information is located in the preset direction, and the length of each text box where the subtitle information is located in the preset direction includes a target length of the text box where the subtitle information is located in the preset direction.
A calculation module 1402 configured to specifically perform:
determining the target coordinates of the text box in which the subtitle information is located in each preset direction as the subtitle distance between the subtitle information and the frame in the preset direction; or
determining, as the subtitle distance between the subtitle information and the border in the preset direction, the difference between the length of a video frame of the video to be processed in the preset direction and a target sum value, where the target sum value is the sum of the target coordinate and the target length of the text box in which the subtitle information is located in that preset direction.
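The two distance formulas above can be sketched directly, assuming the origin at the frame's top-left pixel with coordinates growing rightward and downward. The function name and the direction labels are illustrative assumptions.

```python
def caption_distance(direction, box_coord, box_len, frame_len):
    """Subtitle distance from a text box to the frame border (sketch).

    box_coord: target coordinate of the text box's top-left corner along
    the axis of `direction`; box_len: target length of the box along that
    axis; frame_len: frame length along that axis. With the origin at the
    top-left pixel, the 'up'/'left' distance is the target coordinate
    itself, and the 'down'/'right' distance is the frame length minus the
    target sum value (coordinate + length).
    """
    if direction in ("up", "left"):
        return box_coord
    return frame_len - (box_coord + box_len)   # 'down' or 'right'
```

For a subtitle box whose top edge sits at y = 1700 with height 80 in a 1920-pixel-tall frame, the distance to the top border is 1700 and to the bottom border is 140.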
In the embodiment of the disclosure, the server is provided with preset directions, a preset direction being a direction in which subtitle information may be cut off when video frames fill the screen of the client. The server calculates the subtitle proportion values of the subtitle information of the video to be processed in the plurality of preset directions and sends them to the client. After obtaining the subtitle proportion values, if the client determines that they are all greater than or equal to the preset proportion threshold, it can conclude that the subtitle information is far enough from the edges of the video frame in the preset directions that, when the video frame fills the screen of the client and is cropped, the subtitle information included in the video frame will not be cut off in those directions. The client then scales the video frame of the video to be processed and determines the target video frame, whose length in each preset direction is the same as the length of the screen of the client in that direction. This avoids cutting off the subtitle information in the preset directions, and solves the problem that, when a video frame fills the screen of a device, the subtitle information included in the video frame is completely or partially cut off, degrading the video viewing experience.
Based on the embodiments of the video scaling method shown in fig. 4 to 13, the embodiments of the present disclosure further provide a video scaling apparatus. Referring to fig. 15, fig. 15 is a block diagram illustrating another video scaling apparatus applied to a client according to an exemplary embodiment, the apparatus including: a receiving module 1501, a determining module 1502, and a scaling module 1503.
The receiving module 1501 is configured to receive a plurality of subtitle proportion values, in a plurality of preset directions, of the subtitle information of the video to be processed sent by the server, where a subtitle proportion value is the ratio of a frame distance to the length of a video frame of the video to be processed in the corresponding preset direction, and the frame distance is the minimum of the distances, in that preset direction, between the subtitle information included in the video frames of the video to be processed and the border.
The determining module 1502 is configured to perform the determination whether each of the plurality of caption scale values is greater than or equal to a preset scale threshold.
A scaling module 1503, configured to perform scaling processing on the video frame of the video to be processed to determine a first target video frame if the determination result of the determining module 1502 is yes, where lengths of the first target video frame in the multiple preset directions are the same as lengths of the screen of the client in the multiple preset directions.
In an optional embodiment, the subtitle proportion value is a ratio of a frame distance to a length of a video frame of the video to be processed in the preset direction, and the frame distance is a minimum value of distances between subtitle information included in the video frame of the video to be processed and the frame in the preset direction.
In an alternative embodiment, the scaling module 1503 is configured to specifically execute:
judging whether the aspect ratio of a video frame of a video to be processed is larger than that of a screen or not;
if so, scaling the video frame of the video to be processed in an equal ratio to obtain a first scaled video frame, wherein the width of the first scaled video frame is the same as that of the screen; cutting the first zoom video frame in the height direction of the first zoom video frame to obtain a first sub-target video frame, wherein the height of the first sub-target video frame is the same as the height of the screen;
if not, scaling the video frame of the video to be processed proportionally to obtain a second scaled video frame, where the height of the second scaled video frame is the same as the height of the screen; and cropping the second scaled video frame in the width direction of the second scaled video frame to obtain a second sub-target video frame, where the width of the second sub-target video frame is the same as the width of the screen.
In an optional embodiment, the scaling module 1503 is configured to further perform:
the preset direction comprises the width direction of a video frame of the video to be processed, if a subtitle proportion value smaller than a preset proportion threshold exists in the plurality of subtitle proportion values, the video frame of the video to be processed is scaled in an equal ratio mode, a second target video frame is determined, and the width of the second target video frame is the same as the width of a screen;
and the preset direction comprises the height direction of the video frame of the video to be processed, if the caption proportion values smaller than the preset proportion threshold exist in the plurality of caption proportion values, the video frame of the video to be processed is scaled in an equal ratio, a third target video frame is determined, and the height of the third target video frame is the same as the height of the screen.
In an alternative embodiment, the scaling module 1503 is configured to specifically execute:
the preset direction comprises the width direction of a video frame of the video to be processed, and the video frame of the video to be processed is scaled in an equal ratio to obtain a third scaled video frame, wherein the width of the third scaled video frame is the same as the width of the screen; splicing the first comment frame for the third zoomed video frame in the height direction of the third zoomed video frame to obtain a second target video frame, wherein the height of the second target video frame is the same as the height of the screen;
the preset direction comprises the height direction of a video frame of the video to be processed, and the video frame of the video to be processed is scaled in an equal ratio to obtain a fourth scaled video frame, wherein the height of the fourth scaled video frame is the same as the height of the screen; and splicing a second comment frame for the fourth zoomed video frame in the width direction of the fourth zoomed video frame to obtain a third target video frame, wherein the width of the third target video frame is the same as the width of the screen.
In the embodiment of the disclosure, the server is provided with preset directions, a preset direction being a direction in which subtitle information may be cut off when video frames fill the screen of the client. The server calculates the subtitle proportion values of the subtitle information of the video to be processed in the plurality of preset directions and sends them to the client. After obtaining the subtitle proportion values, if the client determines that they are all greater than or equal to the preset proportion threshold, it can conclude that the subtitle information is far enough from the edges of the video frame in the preset directions that, when the video frame fills the screen of the client and is cropped, the subtitle information included in the video frame will not be cut off in those directions. The client then scales the video frame of the video to be processed and determines the target video frame, whose length in each preset direction is the same as the length of the screen of the client in that direction. This avoids cutting off the subtitle information in the preset directions, and solves the problem that, when a video frame fills the screen of a device, the subtitle information included in the video frame is completely or partially cut off, degrading the video viewing experience.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Based on the embodiments of the video scaling methods shown in fig. 4 to fig. 7, the embodiments of the present disclosure further provide a server. Fig. 16 is a block diagram illustrating a server in accordance with an example embodiment. The server side comprises: a processor 1601 and a memory 1602 for storing processor-executable instructions. Wherein the processor 1601 is configured to execute instructions to implement the video scaling method illustrated in fig. 4-7 described above.
Based on the embodiments of the video scaling method shown in fig. 8 to 12d, the embodiments of the present disclosure further provide a client. Fig. 17 is a block diagram illustrating a client in accordance with an exemplary embodiment. The client comprises: a processor 1701 and a memory 1702 for storing processor-executable instructions. The processor 1701 is configured to execute instructions to implement the video scaling method described above and illustrated in fig. 8-12 d, among other things.
Based on the video scaling method embodiments shown in figs. 4-7, in an exemplary embodiment, the present disclosure also provides a storage medium including instructions, for example, a memory 1602 including instructions, which are executable by a processor 1601 of a server to perform the video scaling method shown in figs. 4-7.
based on the video scaling method embodiments shown in fig. 8-12 d, in an exemplary embodiment, the disclosed embodiments further provide a storage medium comprising instructions, such as the memory 1702 comprising instructions, which are executable by the processor 1701 of the client to perform the video scaling method shown in fig. 8-12 d.
Based on the embodiments of the video scaling method shown in fig. 4 to fig. 7, in an exemplary embodiment, the embodiments of the present disclosure further provide a computer program product, for example, instructions included in the memory 1602. The computer program product, when executed by the processor 1601 of the server, causes the server to implement the video scaling method shown in fig. 4 to fig. 7 described above.
Based on the embodiments of the video scaling method shown in fig. 8 to fig. 12d, in an exemplary embodiment, the embodiments of the present disclosure further provide a computer program product, for example, instructions included in the memory 1702. The computer program product, when executed by the processor 1701 of the client, causes the client to implement the video scaling method shown in fig. 8 to fig. 12d described above.
Alternatively, the storage medium may be a non-transitory computer readable storage medium, for example, a ROM (Read Only Memory), a RAM (Random Access Memory), a CD-ROM (Compact Disc ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
Alternatively, the processor may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (18)

1. A video scaling method, applied to a server, the method comprising:
acquiring a video to be processed, wherein a video frame of the video to be processed comprises subtitle information;
calculating caption proportion values of the caption information of the video to be processed in a plurality of preset directions; the preset directions are directions set based on the arrangement direction of the subtitle information, and the preset directions are pre-estimated directions in which the subtitle information can be cut off when the video frames fill the screen of the client;
sending the plurality of subtitle proportion values to the client, so that when the plurality of subtitle proportion values are all larger than or equal to a preset proportion threshold value, the client performs scaling processing on a video frame of the video to be processed, and determines a target video frame, wherein the lengths of the target video frame in the plurality of preset directions are the same as the lengths of a screen of the client in the plurality of preset directions;
wherein the calculating of the caption proportion values of the caption information of the video to be processed in the plurality of preset directions comprises:
determining distances between caption information included in each video frame of the video to be processed and borders in a plurality of preset directions respectively to obtain a plurality of caption distances;
selecting, for each preset direction, the smallest caption distance from the plurality of caption distances corresponding to the preset direction as the frame distance corresponding to the preset direction;
and calculating the ratio of each frame distance to the length of the video frame of the video to be processed in the corresponding preset direction respectively to obtain a subtitle proportion value in each preset direction.
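The computation recited in claim 1 can be sketched as follows. This is an illustrative reading of the claim, not the claimed implementation, and the names (`caption_distances`, `frame_length`) are hypothetical:

```python
def subtitle_proportion(caption_distances, frame_length):
    """For one preset direction: take the smallest caption distance among
    all caption distances measured in that direction (the frame distance),
    then divide it by the video frame's length in that direction."""
    frame_distance = min(caption_distances)
    return frame_distance / frame_length

# Three subtitle boxes 72, 120 and 96 px from the bottom border of a
# 720-px-high frame -> frame distance 72, proportion value 0.1.
print(subtitle_proportion([72, 120, 96], 720))  # 0.1
```

The client then compares each direction's proportion value against the preset proportion threshold before deciding whether to crop.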
2. The method according to claim 1, wherein each video frame of the video to be processed includes a plurality of caption information, and the step of determining distances between the caption information included in the video frame and the borders in the plurality of preset directions to obtain a plurality of caption distances comprises:
and determining, for each piece of subtitle information, the distance between the subtitle information and the border in each of the plurality of preset directions according to the position coordinates and the length, in the plurality of preset directions, of the text box in which the subtitle information is located, so as to obtain a plurality of subtitle distances.
3. The method according to claim 2, wherein the origin of the video frame of the video to be processed is a pixel point at the top left corner of the video frame of the video to be processed, the position coordinate of the text box in the preset direction of each subtitle information includes a target coordinate of the pixel point at the top left corner of the text box in the preset direction of the subtitle information, and the length of the text box in the preset direction of each subtitle information includes a target length of the text box in the preset direction of the subtitle information;
the step of determining distances between the caption information and each frame in a plurality of preset directions according to the position coordinates and the lengths of the text box in which the caption information is located in the plurality of preset directions to obtain a plurality of caption distances includes:
determining the target coordinate, in each preset direction, of the text box in which the subtitle information is located as the subtitle distance between the subtitle information and the border in the preset direction; or
determining the difference between the length of the video frame of the video to be processed in each preset direction and a target sum as the subtitle distance between the subtitle information and the border in the preset direction, wherein the target sum is the sum of the target coordinate and the target length, in the preset direction, of the text box in which the subtitle information is located.
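Under the coordinate convention of claim 3 (origin at the top-left pixel of the video frame), the two alternative distances can be sketched along one axis as below; the function and parameter names are hypothetical:

```python
def caption_distances_on_axis(target_coord, target_length, frame_length):
    """Distances from a subtitle text box to the two borders along one axis,
    with the origin at the top-left corner of the video frame:
    - near border (left or top): the target coordinate itself;
    - far border (right or bottom): frame length minus the target sum,
      where target sum = target coordinate + target length of the box."""
    near = target_coord
    far = frame_length - (target_coord + target_length)
    return near, far

# A 200-px-wide text box starting at x = 100 in a 1280-px-wide frame:
print(caption_distances_on_axis(100, 200, 1280))  # (100, 980)
```

Which of the two values is used depends on which border lies in the preset direction being evaluated.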
4. A video scaling method is applied to a client and comprises the following steps:
receiving caption proportion values of caption information of a video to be processed in a plurality of preset directions, which are sent by a server side; the preset directions are directions set based on the arrangement direction of the subtitle information, and the preset directions are pre-estimated directions in which the subtitle information can be cut off when the video frames of the video to be processed are filled in a screen of a client;
judging whether the caption proportion values are all larger than or equal to a preset proportion threshold value or not;
if yes, performing scaling processing on the video frame of the video to be processed, and determining a first target video frame, wherein the lengths of the first target video frame in a plurality of preset directions are the same as the lengths of the screen of the client in the plurality of preset directions;
the caption proportion value is the ratio of the frame distance to the length of the video frame of the video to be processed in the preset direction, and the frame distance is the minimum value of the distance between the caption information included in the video frame of the video to be processed and the frame in the preset direction.
5. The method according to claim 4, wherein the step of scaling the video frame of the video to be processed and determining the first target video frame comprises:
judging whether the aspect ratio of the video frame of the video to be processed is larger than the aspect ratio of the screen;
if so, scaling the video frame of the video to be processed in an equal ratio to obtain a first scaled video frame, wherein the width of the first scaled video frame is the same as that of the screen; clipping the first zoomed video frame in the height direction of the first zoomed video frame to obtain a first sub-target video frame, wherein the height of the first sub-target video frame is the same as the height of the screen;
if not, scaling the video frame of the video to be processed in an equal ratio to obtain a second scaled video frame, wherein the height of the second scaled video frame is the same as that of the screen; and clipping the second zoomed video frame in the width direction of the second zoomed video frame to obtain a second sub-target video frame, wherein the width of the second sub-target video frame is the same as the width of the screen.
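A sketch of the scale-and-crop branch of claim 5 follows. Note one interpretive assumption, flagged here because the claim does not define it: the stated crops only fill the screen if "aspect ratio" is read as height-to-width, so that reading is used below.

```python
def scale_and_crop(frame_w, frame_h, screen_w, screen_h):
    """Scale the video frame proportionally, then crop it so it exactly
    fills the screen. The branch test follows claim 5, reading 'aspect
    ratio' as height/width (the only reading under which the stated crops
    fill the screen). Returns ((scaled_w, scaled_h), (target_w, target_h))."""
    if frame_h / frame_w > screen_h / screen_w:
        # Video relatively taller than the screen: match widths, crop height.
        scaled = (screen_w, round(frame_h * screen_w / frame_w))
    else:
        # Otherwise: match heights, crop the excess width.
        scaled = (round(frame_w * screen_h / frame_h), screen_h)
    return scaled, (screen_w, screen_h)  # the target frame fills the screen

# 1080x2400 portrait video on a 1080x1920 screen: widths match, height cropped.
print(scale_and_crop(1080, 2400, 1080, 1920))  # ((1080, 2400), (1080, 1920))
```

This is the "aspect fill" path taken only when the subtitle proportion check of claim 4 guarantees cropping cannot cut the subtitles.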
6. The method of claim 4, further comprising:
the preset direction comprises the width direction of the video frame of the video to be processed, and if a caption proportion value smaller than the preset proportion threshold exists among the plurality of caption proportion values, the video frame of the video to be processed is scaled in an equal ratio and a second target video frame is determined, wherein the width of the second target video frame is the same as the width of the screen;
and the preset direction comprises the height direction of the video frame of the video to be processed, and if a caption proportion value smaller than the preset proportion threshold exists among the plurality of caption proportion values, the video frame of the video to be processed is scaled in an equal ratio and a third target video frame is determined, wherein the height of the third target video frame is the same as the height of the screen.
7. The method according to claim 6, wherein the step of scaling the video frames of the video to be processed in an equal ratio to determine a second target video frame comprises:
scaling the video frame of the video to be processed in an equal ratio to obtain a third scaled video frame, wherein the width of the third scaled video frame is the same as that of the screen; splicing a first comment frame for the third zoomed video frame in the height direction of the third zoomed video frame to obtain a second target video frame, wherein the height of the second target video frame is the same as that of the screen;
the step of scaling the video frame of the video to be processed in an equal ratio and determining a third target video frame includes:
scaling the video frame of the video to be processed in an equal ratio to obtain a fourth scaled video frame, wherein the height of the fourth scaled video frame is the same as that of the screen; and splicing a second comment frame for the fourth zoomed video frame in the width direction of the fourth zoomed video frame to obtain a third target video frame, wherein the width of the third target video frame is the same as the width of the screen.
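Claims 6 and 7 describe the fallback when cropping could cut subtitles: scale proportionally so one dimension matches the screen, then splice a comment frame into the leftover space instead of cropping. A hypothetical sketch, with invented names and integer pixel arithmetic as an assumption:

```python
def fit_with_comment_frame(frame_w, frame_h, screen_w, screen_h, match="width"):
    """Scale the video frame proportionally so the matched dimension equals
    the screen's, then size a comment frame to fill the remaining screen
    area (spliced in the height direction when widths match, in the width
    direction when heights match). Returns (video size, comment frame size)."""
    if match == "width":
        video = (screen_w, frame_h * screen_w // frame_w)
        comment = (screen_w, screen_h - video[1])  # fills remaining height
    else:
        video = (frame_w * screen_h // frame_h, screen_h)
        comment = (screen_w - video[0], screen_h)  # fills remaining width
    return video, comment

# 1280x640 landscape video on a 1080x2340 portrait screen, widths matched:
print(fit_with_comment_frame(1280, 640, 1080, 2340))
# ((1080, 540), (1080, 1800))
```

The design choice here is that nothing of the video is discarded: the comment frame absorbs the screen area that the crop-and-fill path of claim 5 would otherwise have claimed from the video itself.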
8. A video scaling apparatus, applied to a server, comprising:
the acquisition module is configured to acquire a video to be processed, and a video frame of the video to be processed comprises subtitle information;
the calculation module is configured to calculate the caption proportion values of the caption information of the video to be processed in a plurality of preset directions; the preset directions are directions set based on the arrangement direction of the subtitle information, and the preset directions are pre-estimated directions from which the subtitle information can be cut out when the video frames fill the screen of the client;
a sending module configured to send the plurality of caption proportion values to the client, so that when the plurality of caption proportion values are all greater than or equal to a preset proportion threshold, the client performs scaling processing on a video frame of the video to be processed and determines a target video frame, wherein the length of the target video frame in the preset direction is the same as the length of a screen of the client in the preset direction;
wherein the computing module is configured to specifically perform:
determining distances between caption information included in each video frame of the video to be processed and borders in a plurality of preset directions respectively to obtain a plurality of caption distances;
selecting, for each preset direction, the smallest caption distance from the plurality of caption distances corresponding to the preset direction as the frame distance corresponding to the preset direction;
and calculating the ratio of each frame distance to the length of the video frame of the video to be processed in the corresponding preset direction respectively to obtain a caption proportion value in each preset direction.
9. The apparatus of claim 8, wherein each video frame of the video to be processed comprises a plurality of subtitle information, and wherein the computing module is configured to perform:
and determining, for each piece of subtitle information of the video frame, the distance between the subtitle information and the border in each of the plurality of preset directions according to the position coordinates and the length, in the plurality of preset directions, of the text box in which the subtitle information is located, so as to obtain a plurality of subtitle distances.
10. The apparatus according to claim 9, wherein the origin of the video frame of the video to be processed is a pixel point at the top left corner of the video frame of the video to be processed, the position coordinate of the text box in the preset direction of each subtitle information includes a target coordinate of the pixel point at the top left corner of the text box in the preset direction of the subtitle information, and the length of the text box in the preset direction of each subtitle information includes a target length of the text box in the preset direction of the subtitle information;
the computing module is configured to specifically perform:
determining the target coordinate, in each preset direction, of the text box in which the subtitle information is located as the subtitle distance between the subtitle information and the border in the preset direction; or
determining the difference between the length of the video frame of the video to be processed in each preset direction and a target sum as the subtitle distance between the subtitle information and the border in the preset direction, wherein the target sum is the sum of the target coordinate and the target length, in the preset direction, of the text box in which the subtitle information is located.
11. A video scaling apparatus, applied to a client, comprising:
a receiving module configured to receive caption proportion values, sent by a server, of caption information of a video to be processed in a plurality of preset directions; the preset directions are directions set based on the arrangement direction of the subtitle information, and the preset directions are pre-estimated directions in which the subtitle information may be cut off when the video frames of the video to be processed fill the screen of the client;
the judging module is configured to judge whether the caption proportion values are all larger than or equal to a preset proportion threshold value;
the scaling module is configured to perform scaling processing on the video frame of the video to be processed and determine a first target video frame when the judgment result of the judging module is yes, wherein the lengths of the first target video frame in the plurality of preset directions are the same as the lengths of the screen of the client in the plurality of preset directions;
the caption proportion value is the ratio of the frame distance to the length of the video frame of the video to be processed in the preset direction, and the frame distance is the minimum value of the distance between the caption information included in the video frame of the video to be processed and the frame in the preset direction.
12. The apparatus of claim 11, wherein the scaling module is configured to perform in particular:
judging whether the aspect ratio of the video frame of the video to be processed is larger than the aspect ratio of the screen;
if so, scaling the video frame of the video to be processed in an equal ratio to obtain a first scaled video frame, wherein the width of the first scaled video frame is the same as that of the screen; clipping the first zoomed video frame in the height direction of the first zoomed video frame to obtain a first sub-target video frame, wherein the height of the first sub-target video frame is the same as the height of the screen;
if not, scaling the video frame of the video to be processed in an equal ratio to obtain a second scaled video frame, wherein the height of the second scaled video frame is the same as that of the screen; and clipping the second zoomed video frame in the width direction of the second zoomed video frame to obtain a second sub-target video frame, wherein the width of the second sub-target video frame is the same as the width of the screen.
13. The apparatus of claim 11, wherein the scaling module is configured to further perform:
the preset direction comprises the width direction of the video frame of the video to be processed, and if a caption proportion value smaller than the preset proportion threshold exists among the plurality of caption proportion values, the video frame of the video to be processed is scaled in an equal ratio and a second target video frame is determined, wherein the width of the second target video frame is the same as the width of the screen;
and the preset direction comprises the height direction of the video frame of the video to be processed, and if a caption proportion value smaller than the preset proportion threshold exists among the plurality of caption proportion values, the video frame of the video to be processed is scaled in an equal ratio and a third target video frame is determined, wherein the height of the third target video frame is the same as the height of the screen.
14. The apparatus according to claim 13, wherein the scaling module is configured to perform in particular:
the preset direction comprises the width direction of the video frame of the video to be processed, the video frame of the video to be processed is scaled in an equal ratio to obtain a third scaled video frame, and the width of the third scaled video frame is the same as that of the screen; splicing a first comment frame for the third zoomed video frame in the height direction of the third zoomed video frame to obtain a second target video frame, wherein the height of the second target video frame is the same as the height of the screen;
the preset direction comprises the height direction of the video frame of the video to be processed, and the video frame of the video to be processed is scaled in an equal ratio to obtain a fourth scaled video frame, wherein the height of the fourth scaled video frame is the same as the height of the screen; and splicing a second comment frame for the fourth zoomed video frame in the width direction of the fourth zoomed video frame to obtain a third target video frame, wherein the width of the third target video frame is the same as the width of the screen.
15. A server, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the video scaling method of any of claims 1 to 3.
16. A client, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the video scaling method of any of claims 4 to 7.
17. A storage medium having stored therein instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the video scaling method of any one of claims 1 to 3.
18. A storage medium having stored therein instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the video scaling method of any one of claims 4 to 7.
CN201910696677.6A 2019-07-30 2019-07-30 Video scaling method, device, server, client and storage medium Active CN110381353B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910696677.6A CN110381353B (en) 2019-07-30 2019-07-30 Video scaling method, device, server, client and storage medium

Publications (2)

Publication Number Publication Date
CN110381353A CN110381353A (en) 2019-10-25
CN110381353B true CN110381353B (en) 2022-08-23

Family

ID=68257150

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910696677.6A Active CN110381353B (en) 2019-07-30 2019-07-30 Video scaling method, device, server, client and storage medium

Country Status (1)

Country Link
CN (1) CN110381353B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112738629B (en) * 2020-12-29 2023-03-10 北京达佳互联信息技术有限公司 Video display method and device, electronic equipment and storage medium
CN114666649B (en) * 2022-03-31 2024-03-01 北京奇艺世纪科技有限公司 Identification method and device of subtitle cut video, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101193235A (en) * 2007-07-23 2008-06-04 南京Lg新港显示有限公司 Size adjusting method containing subtitle image and digital tv
CN107111865A (en) * 2015-01-15 2017-08-29 高通股份有限公司 Text based Image Adjusting size

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101193235A (en) * 2007-07-23 2008-06-04 南京Lg新港显示有限公司 Size adjusting method containing subtitle image and digital tv
KR20090010494A (en) * 2007-07-23 2009-01-30 엘지전자 주식회사 Method for arrangement screen size having subtitle and digital tv thereof
CN107111865A (en) * 2015-01-15 2017-08-29 高通股份有限公司 Text based Image Adjusting size

Similar Documents

Publication Publication Date Title
US9398289B2 (en) Method and apparatus for converting an overlay area into a 3D image
JP6016061B2 (en) Image generation apparatus, image display apparatus, image generation method, and image generation program
US9299152B2 (en) Systems and methods for image depth map generation
CN107750370B (en) Method and apparatus for determining a depth map for an image
CN110381353B (en) Video scaling method, device, server, client and storage medium
CN110366001B (en) Method and device for determining video definition, storage medium and electronic device
CN107636728B (en) Method and apparatus for determining a depth map for an image
CN106446223B (en) Map data processing method and device
CN107402757B (en) Page rendering method and device
US20170127039A1 (en) Ultrasonic proximity detection system
CN113516666A (en) Image cropping method and device, computer equipment and storage medium
Yang et al. Dynamic 3D scene depth reconstruction via optical flow field rectification
CN102307308B (en) Method and equipment for generating three-dimensional image on touch screen
CN110719453A (en) Three-dimensional video clipping method
CN110728129A (en) Method, device, medium and equipment for typesetting text content in picture
KR101262164B1 (en) Method for generating high resolution depth image from low resolution depth image, and medium recording the same
JP2016224930A (en) Method and device for bounding object in video
EP2927872A1 (en) Method and device for processing a video sequence
Chamaret et al. Video retargeting for stereoscopic content under 3D viewing constraints
US11915292B2 (en) Apparatus, system and method for providing customized clothing recommendation service
CN115019138A (en) Video subtitle erasing, model training and interaction method, device and storage medium
JP2009169644A (en) Image generating device, information terminal, image generating method, and program
US9679505B2 (en) Method and device for calibrating a projector
KR100855348B1 (en) System and apparatus for chasing object-contour line in movie frame and method for the same
CN116527956B (en) Virtual object live broadcast method, device and system based on target event triggering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant