CN114422794A - Dynamic video definition processing method based on front camera - Google Patents

Dynamic video definition processing method based on front camera

Info

Publication number
CN114422794A
CN114422794A (application CN202111597316.XA)
Authority
CN
China
Prior art keywords
video
information
screen
head
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111597316.XA
Other languages
Chinese (zh)
Inventor
唐勇
陆林
王凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xuancai Interactive Network Science And Technology Co ltd
Original Assignee
Xuancai Interactive Network Science And Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xuancai Interactive Network Science And Technology Co ltd filed Critical Xuancai Interactive Network Science And Technology Co ltd
Priority to CN202111597316.XA priority Critical patent/CN114422794A/en
Publication of CN114422794A publication Critical patent/CN114422794A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/91Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/478Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4781Games

Abstract

The invention relates to the technical field of image data processing, and in particular to a dynamic video definition processing method based on a front camera. The method comprises: capturing a picture of the user's head through a front camera on the screen to generate head information; calculating the distance between the user and the screen based on the head information to obtain distance information, and obtaining a user instruction; decoding the user instruction to obtain decoded data; encoding the historical video and adjusting its number of pictures based on the distance information and the decoded data to obtain video and audio data; and decrypting the video and audio data and displaying them on the screen. The number of pictures can thus be adjusted according to the distance between the user and the screen, solving the problems of screen artifacts, stuttering, and response delay caused by unbalanced resource allocation in existing video encoding methods.

Description

Dynamic video definition processing method based on front camera
Technical Field
The invention relates to the technical field of image data processing, in particular to a dynamic video definition processing method based on a front camera.
Background
Cloud games provide users with a high-quality game experience in a brand-new way. To save hardware cost, cloud game service providers use virtualization technology to run several games simultaneously on one GPU.
However, in existing video encoding methods, the asynchronous and non-preemptive nature of the GPU's image processing leads to unbalanced resource allocation, so the game service-level agreement is not guaranteed: some games get a very high frame count (number of pictures) while others get a very low one. In a wireless network environment the client also loses some frames during transmission, so games with a low frame count suffer severe screen artifacts and stuttering. This seriously degrades the quality of service and can cause users to abandon cloud games.
Disclosure of Invention
The invention aims to provide a dynamic video definition processing method based on a front camera, so as to solve the problems of screen artifacts, stuttering, and response delay caused by unbalanced resource allocation in existing video encoding methods.
To achieve the above object, the present invention provides a dynamic video definition processing method based on a front camera, comprising the following steps:
acquiring a user head picture through a front camera of a screen to generate head information;
calculating the distance between the user and the screen based on the head information to obtain distance information and obtain a user instruction;
decoding the user instruction to obtain decoded data;
adjusting the number of pictures after the historical video is coded based on the distance information and the decoding data to obtain video and audio data;
and displaying the video and audio data through the screen after decrypting the video and audio data.
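The five steps above can be sketched end to end as follows. This is a minimal illustration and not the patented implementation; every function name, the pinhole-distance model, and all numeric constants are assumptions introduced here for clarity.

```python
# Illustrative sketch of the five-step method. All names and constants
# are hypothetical; the patent specifies the steps, not these bodies.

def capture_head_info(frame):
    # Step 1: stand-in for head capture; keeps only the eye pixel gap.
    return {"eye_px_gap": frame["eye_px_gap"]}

def estimate_distance_cm(head_info, focal_px=500.0, eye_gap_cm=6.3):
    # Step 2: assumed pinhole-camera model (similar triangles).
    return focal_px * eye_gap_cm / head_info["eye_px_gap"]

def decode_instruction(raw):
    # Step 3: stand-in for decoding the user's input command.
    return raw.strip().lower()

def adjust_frame_count(base_fps, distance_cm, near_cm=40.0, far_cm=80.0):
    # Step 4: reduce the number of pictures when the user is too close
    # (eye fatigue) or too far (reduced perception of detail).
    if distance_cm < near_cm or distance_cm > far_cm:
        return int(base_fps * 0.75)
    return base_fps

def run_pipeline(frame, raw_instruction, base_fps=60):
    head = capture_head_info(frame)
    dist = estimate_distance_cm(head)
    cmd = decode_instruction(raw_instruction)
    fps = adjust_frame_count(base_fps, dist)
    # Step 5 (display after decryption) is omitted from this sketch.
    return {"command": cmd, "fps": fps, "distance_cm": round(dist, 1)}

result = run_pipeline({"eye_px_gap": 70.0}, "  MOVE_LEFT ")
print(result)
```

At an eye gap of 70 pixels the assumed model yields 45 cm, inside the illustrative comfort band, so the frame count is kept.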
Specifically, capturing the user head picture through the front camera of the screen to generate the head information comprises:
acquiring a user head picture through a front camera of a screen to obtain picture information;
and extracting the head features of the user based on the picture information to obtain head information.
Specifically, calculating the distance between the user and the screen based on the head information to obtain the distance information comprises:
scanning the head information to obtain scanning information;
removing the facial shadow of the head in the scanning information to obtain preprocessing information;
capturing the positions of the head and the eyes in the preprocessing information to obtain position points;
and calculating the distance between the position point and the screen to obtain distance information.
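The steps above can be made concrete with one common geometric approach: take the midpoint of the two detected eyes as the position point, then estimate the distance with a pinhole-camera model. The model, the focal length in pixels, and the assumed interpupillary distance are illustrative choices, not values from the patent.

```python
# Hypothetical realization of the distance steps: position point as the
# midpoint of the two eyes, distance via similar triangles. The focal
# length (pixels) and real eye gap (cm) are assumed calibration values.

def position_point(left_eye, right_eye):
    # Midpoint of the line connecting the two eyes (pixel coordinates).
    return ((left_eye[0] + right_eye[0]) / 2.0,
            (left_eye[1] + right_eye[1]) / 2.0)

def distance_to_screen_cm(left_eye, right_eye,
                          focal_length_px=600.0,
                          interpupillary_cm=6.3):
    # Similar triangles: real_gap / distance = pixel_gap / focal_length.
    pixel_gap = ((right_eye[0] - left_eye[0]) ** 2 +
                 (right_eye[1] - left_eye[1]) ** 2) ** 0.5
    return focal_length_px * interpupillary_cm / pixel_gap

p = position_point((300, 240), (360, 240))
d = distance_to_screen_cm((300, 240), (360, 240))
print(p, round(d, 1))  # (330.0, 240.0) 63.0
```

The closer the user sits, the larger the pixel gap between the eyes and the smaller the estimated distance, which is what the threshold comparison later relies on.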
Specifically, adjusting the number of pictures after encoding the historical video, based on the distance information and the decoded data, to obtain the video and audio data comprises:
comparing the position information with a distance threshold value to generate a coding instruction;
encoding the historical video based on the decoding data to obtain a video code stream;
and adjusting the number of pictures of the video code stream based on the coding instruction to obtain video and audio data.
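The comparison step above can be sketched as a small decision function. The threshold value, the fatigue time, and the instruction names are assumptions for illustration; the patent only states that the position distance is compared with a threshold to generate an encoding instruction.

```python
# Hypothetical encoding-instruction generator: compare the measured
# position distance m with a distance threshold M, and account for
# playing time (eye fatigue). All constants are illustrative.

def encoding_instruction(m_cm, M_cm=50.0, playing_min=0.0, fatigue_min=30.0):
    if m_cm < M_cm and playing_min > fatigue_min:
        # Close to the screen and playing long enough to be fatigued:
        # attention to the picture drops, so reduce the picture count.
        return "reduce_pictures"
    if m_cm > M_cm:
        # Far from the screen: perception of picture detail drops.
        return "reduce_pictures"
    return "keep_pictures"

print(encoding_instruction(30.0, playing_min=45.0))  # reduce_pictures
print(encoding_instruction(80.0))                    # reduce_pictures
print(encoding_instruction(45.0, playing_min=10.0))  # keep_pictures
```

Note that both branches reduce the picture count, mirroring the description: a high-quality stream is wasted both on a fatigued nearby viewer and on a distant one.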
Specifically, encoding the historical video based on the decoded data to obtain the video code stream comprises:
dividing the historical video into a plurality of square pixel blocks based on the decoding data, and respectively and sequentially performing spatial prediction on each pixel block to obtain a prediction residual error;
transforming and quantizing the prediction residual error to obtain a transformation coefficient;
and entropy coding is carried out on the transformation coefficient to obtain a video code stream.
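The entropy-coding step above can be illustrated with unsigned Exp-Golomb codes, the variable-length code that H.264/AVC uses for many syntax elements. This is one concrete instance of entropy coding, offered as a sketch; the patent does not mandate this particular coder.

```python
# Unsigned Exp-Golomb coding (as in H.264/AVC): a nonnegative integer
# n is coded as (leading zeros) followed by the binary form of n + 1.

def exp_golomb_encode(n):
    bits = bin(n + 1)[2:]                 # binary of n + 1
    return "0" * (len(bits) - 1) + bits   # prefix of zeros, then bits

def exp_golomb_decode(code):
    zeros = len(code) - len(code.lstrip("0"))
    return int(code[zeros:], 2) - 1       # invert: read value, subtract 1

for n in (0, 1, 2, 3, 7):
    print(n, exp_golomb_encode(n))
```

Small (frequent) values get short codewords, which is what makes the code effective on quantized transform coefficients.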
In the dynamic video definition processing method based on a front camera of the invention, a picture of the user's head is captured through the front camera of the screen to generate head information; the distance between the user and the screen is calculated based on the head information to obtain distance information, and a user instruction is obtained; the user instruction is decoded to obtain decoded data; the number of pictures is adjusted after the historical video is encoded, based on the distance information and the decoded data, to obtain video and audio data; and the video and audio data are decrypted and then displayed on the screen. The number of pictures can thus be adjusted according to the distance between the user and the screen, solving the problems of screen artifacts, stuttering, and response delay caused by unbalanced resource allocation in existing video encoding methods.
Drawings
To describe the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for that description are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the dynamic video definition processing method based on a front camera according to the present invention.
Fig. 2 is a flow chart of capturing a picture of the user's head by a front camera of the screen, generating head information.
Fig. 3 is a flowchart for calculating a distance between a user and the screen based on the head information, resulting in distance information.
Fig. 4 is a flowchart of adjusting the number of pictures after historical video encoding based on the distance information and the decoded data to obtain audio and video data.
Fig. 5 is a flow chart of encoding a historical video based on the decoded data, resulting in a pre-processed video.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
Referring to fig. 1 to 5, the present invention provides a dynamic video definition processing method based on a front camera, comprising the following steps:
s1, acquiring a user head picture through a front camera of the screen to generate head information;
the concrete mode is as follows: s11, acquiring a picture of the head of a user through a front camera of a screen to obtain picture information;
specifically, the screen and the front camera are cloud game clients, if the front camera is not configured on the screen, the front camera is configured on the screen, a user instruction can be input through a keyboard input outside the screen, and the front camera needs to be ensured to be in a working state before use.
S12, extracting the head feature of the user based on the picture information to obtain the head information.
Specifically, the cloud game service end extracts the head features of the user based on the picture information to obtain the head information, only extracts the head information, and filters the rest redundant and miscellaneous information, so that the processing accuracy of the head information after the influence of the redundant and miscellaneous information is avoided.
S2, calculating the distance between the user and the screen based on the head information to obtain distance information and obtain a user instruction;
specifically, the distance between the user and the screen is calculated through the cloud game service terminal based on the head information, so that distance information is obtained, and a user instruction is obtained.
Calculating the distance between the user and the screen based on the head information, wherein the specific way of obtaining the distance information is as follows: s21, scanning the head information to obtain scanning information;
specifically, the head information is scanned through the cloud game service end to obtain scanning information.
S22, removing the facial shadow of the head in the scanning information to obtain preprocessing information;
Specifically, the facial shadow is removed so that it cannot interfere with locating the eyes.
S23, capturing the positions of the head and the eyes in the preprocessing information to obtain position points;
specifically, the midpoint of the line connecting the two eyes is taken as a position point.
S24, calculating the distance between the position point and the screen to obtain distance information.
Specifically, by calculating the distance between the eyes and the screen, the accuracy of the judgment of the eye fatigue of the user can be increased.
S3, decoding the user instruction to obtain decoded data;
specifically, the user instruction is decoded by the cloud game server to obtain decoded data.
S4, adjusting the number of pictures after the historical video is encoded, based on the distance information and the decoded data, to obtain video and audio data;
Specifically, the cloud game server adjusts the number of pictures after encoding the historical video, based on the distance information and the decoded data, to obtain the video and audio data.
The concrete mode is as follows: S41, comparing the position information with a distance threshold to generate an encoding instruction;
Specifically, viewing a screen has the following characteristics: when the user is too close to the screen, the eyes tire easily, and fatigued eyes pay less attention to the picture, so the demand for picture quality drops; when the user is too far from the screen, perception of the picture also drops. In either case, a high-quality stream pushed by the cloud game server is wasted on the user. Therefore, the position distance m is compared with the distance threshold M. If m < M, the user is close to the screen; once the user has been playing for a time T (obtained by subtracting the time at which the user started playing from the current time), the eyes become fatigued, and the cloud game server appropriately reduces the number of recorded game pictures. If m > M, the user is far from the screen and perceives the picture less, and the cloud game server likewise appropriately reduces the number of recorded game pictures.
S42, encoding the historical video based on the decoded data to obtain a video code stream;
The concrete mode is as follows: S421, dividing the historical video into a plurality of square pixel blocks based on the decoded data, and sequentially performing spatial prediction on each pixel block to obtain a prediction residual;
Specifically, the historical video is divided into a plurality of square pixel blocks; for example, H.264/AVC (a standard developed jointly by ITU-T and ISO/IEC and positioned to cover the whole video application field) uses 16x16-pixel blocks as its basic units.
S422, transforming and quantizing the prediction residual to obtain transform coefficients;
Specifically, the prediction residual is subjected to forward transform, quantization, inverse quantization, and inverse transform to obtain the transform coefficients.
S423, entropy coding the transform coefficients to obtain a video code stream.
Specifically, the transform coefficients are loop-filtered and then entropy coded to obtain the video code stream.
S43, adjusting the number of pictures of the video code stream based on the coding instruction to obtain video and audio data.
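Sub-steps S421 and S422 above can be sketched together: divide a frame into square pixel blocks, form a prediction residual against a simple spatial prediction, then quantize it. The block size follows the 16x16 H.264/AVC example given in the description; the flat (DC) predictor, frame representation, and quantization step are assumptions introduced here, not values fixed by the patent.

```python
# Illustrative sketch of S421-S422. The frame is a list of pixel rows;
# the DC predictor and step size are hypothetical simplifications.

def split_into_blocks(frame, size=16):
    # S421: divide the frame into size x size square pixel blocks.
    blocks = []
    for y in range(0, len(frame), size):
        for x in range(0, len(frame[0]), size):
            blocks.append([row[x:x + size] for row in frame[y:y + size]])
    return blocks

def dc_prediction_residual(block):
    # Toy spatial prediction: predict every pixel with the block mean,
    # so the residual is each pixel minus that mean.
    flat = [p for row in block for p in row]
    dc = sum(flat) // len(flat)
    return [p - dc for p in flat]

def quantize(residual, step=8):
    # S422 (simplified): uniform quantization of the residual.
    return [round(v / step) for v in residual]

frame = [[(x + y) % 256 for x in range(32)] for y in range(32)]
blocks = split_into_blocks(frame)
levels = quantize(dc_prediction_residual(blocks[0]))
print(len(blocks), len(levels))  # 4 256
```

A real encoder would apply an integer transform between prediction and quantization, but the data flow (blocks, residual, quantized levels) is the one the sub-steps describe.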
And S5, decrypting the video and audio data and displaying the decrypted video and audio data through the screen.
Specifically, the video and audio data are decrypted by the cloud game server and then displayed on the screen of the cloud game client.
Although the foregoing discloses only a preferred embodiment of the present invention, the scope of the invention is not limited thereto. Those skilled in the art will understand that implementations of all or part of the processes of the foregoing embodiment, and equivalent variations made according to the claims of the present invention, still fall within the scope of the invention.

Claims (5)

1. A dynamic video definition processing method based on a front camera is characterized by comprising the following steps:
acquiring a user head picture through a front camera of a screen to generate head information;
calculating the distance between the user and the screen based on the head information to obtain distance information and obtain a user instruction;
decoding the user instruction to obtain decoded data;
adjusting the number of pictures after the historical video is coded based on the distance information and the decoding data to obtain video and audio data;
and displaying the video and audio data through the screen after decrypting the video and audio data.
2. The dynamic video definition processing method based on a front camera of claim 1,
wherein capturing the user head picture through the front camera of the screen to generate the head information comprises:
acquiring a user head picture through a front camera of a screen to obtain picture information;
and extracting the head features of the user based on the picture information to obtain head information.
3. The dynamic video definition processing method based on a front camera of claim 1,
wherein calculating the distance between the user and the screen based on the head information to obtain the distance information comprises:
scanning the head information to obtain scanning information;
removing the facial shadow of the head in the scanning information to obtain preprocessing information;
capturing the positions of the head and the eyes in the preprocessing information to obtain position points;
and calculating the distance between the position point and the screen to obtain distance information.
4. The dynamic video definition processing method based on a front camera of claim 1,
wherein adjusting the number of pictures after encoding the historical video, based on the distance information and the decoded data, to obtain the video and audio data comprises:
comparing the position information with a distance threshold value to generate a coding instruction;
encoding the historical video based on the decoding data to obtain a video code stream;
and adjusting the number of pictures of the video code stream based on the coding instruction to obtain video and audio data.
5. The dynamic video definition processing method based on a front camera of claim 4,
wherein encoding the historical video based on the decoded data to obtain the video code stream comprises:
dividing the historical video into a plurality of square pixel blocks based on the decoding data, and respectively and sequentially performing spatial prediction on each pixel block to obtain a prediction residual error;
transforming and quantizing the prediction residual error to obtain a transformation coefficient;
and entropy coding is carried out on the transformation coefficient to obtain a video code stream.
CN202111597316.XA, priority and filing date 2021-12-24: Dynamic video definition processing method based on front camera (CN114422794A, pending)

Priority Applications (1)

Application Number: CN202111597316.XA
Priority Date / Filing Date: 2021-12-24
Title: Dynamic video definition processing method based on front camera


Publications (1)

Publication Number: CN114422794A
Publication Date: 2022-04-29

Family

ID=81266697

Family Applications (1)

CN202111597316.XA (pending), priority and filing date 2021-12-24, published as CN114422794A

Country Status (1)

Country Link
CN (1) CN114422794A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination