CN113422995B - Video processing method based on AI model and portable electronic device

Info

Publication number
CN113422995B
CN113422995B (application CN202110818681.2A)
Authority
CN
China
Prior art keywords
target video, data, processing strategy, video segment, environment
Legal status
Active
Application number
CN202110818681.2A
Other languages
Chinese (zh)
Other versions
CN113422995A (en)
Inventor
刘夏聪
刘盼
郜超军
杨明珊
Current Assignee
Zhengzhou University
Zhuhai Geehy Semiconductor Co Ltd
Original Assignee
Zhengzhou University
Zhuhai Geehy Semiconductor Co Ltd
Application filed by Zhengzhou University and Zhuhai Geehy Semiconductor Co Ltd
Publication of CN113422995A
Application granted
Publication of CN113422995B

Classifications

    • H04N 21/42202: Input-only peripherals connected to specially adapted client devices, e.g. environmental sensors for detecting temperature, luminosity, pressure, earthquakes
    • H04N 21/44012: Processing of video elementary streams involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
    • H04N 21/4436: Power management, e.g. shutting down unused components of the receiver
    • H04N 21/4666: Learning process for intelligent management, e.g. learning user preferences for recommending movies, characterized by learning algorithms using neural networks
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a video processing method based on an AI model and a portable electronic device. The method comprises: acquiring at least one type of environmental data, the environmental data comprising light data, and/or sound data, and/or air temperature data, and/or air pressure data; acquiring at least one type of data of a target video clip, comprising sound data, and/or image data, and/or subtitle data, the target video clip being a video clip to be played by the electronic device; inputting the environmental data and the data of the target video clip into a first model to obtain a first processing strategy for the target video clip, the first model being used to analyze processing strategies for video clips; and playing the target video clip according to the first processing strategy. The method and the device can automatically adjust the processing strategy during video playback, reduce the power the electronic device consumes to play video, and thereby save power.

Description

Video processing method based on AI model and portable electronic device
Technical Field
The application relates to the technical field of intelligent terminals, in particular to a video processing method based on an AI model and a portable electronic device.
Background
Currently, if a user moves from indoors to outdoors while using an electronic device, the electronic device may increase the brightness of its display screen so that the user can still see the displayed content clearly, which increases the device's power consumption. If the electronic device is playing video when its environment changes from indoor to outdoor, the increase in power consumption is even more severe, making the power consumption of the electronic device excessive.
Disclosure of Invention
The application provides a video processing method based on an AI model and a portable electronic device, which make the power consumed by the electronic device for playing video more reasonable.
In a first aspect, an embodiment of the present application provides a video processing method based on an AI model, including:
acquiring at least one environmental data; the at least one environmental data includes: light data, and/or sound data, and/or air temperature data, and/or air pressure data;
obtaining at least one data of a target video clip, the at least one data of the target video clip comprising: sound data, and/or image data, and/or subtitle data; the target video clip is a video clip to be played by the electronic device;
Inputting the at least one environmental data and the at least one data of the target video segment into a first model, and obtaining a first processing strategy of the target video segment, wherein the first model is used for analyzing the processing strategy of the video segment;
and playing the target video clip according to the first processing strategy.
In one possible implementation, the first processing policy includes: a processing strategy of the image processor for the target video segment, and/or a processing strategy of the data processor for the target video segment, and/or a processing strategy of the display driver for the target video segment.
In one possible implementation, the first processing policy includes a processing strategy of the data processor for the target video segment, and playing the target video clip according to the first processing policy includes:
decoding, for the target video segment, the code stream of the target video segment according to the data processor's processing strategy for the target video segment in the first processing strategy, to obtain decoded data.
In one possible implementation, the first processing policy includes a processing strategy of the image processor for the target video segment, and playing the target video clip according to the first processing policy includes:
rendering, for each video frame in the target video segment, the video frame according to the image processor's processing strategy for the target video segment in the first processing strategy.
In one possible implementation, the first processing policy includes a processing strategy of the display driver for the target video segment, and playing the target video clip according to the first processing policy includes:
displaying, for each video frame in the target video segment, the video frame according to the display driver's processing strategy for the target video segment in the first processing strategy.
In one possible implementation, the processing strategy of the image processor for the target video segment includes: image rendering resolution, and/or the on-off state of an image sharpening algorithm, and/or the on-off state of a contrast enhancement algorithm;
and/or,
the processing strategy of the data processor for the target video segment includes: decoding accuracy, and/or frame skipping, and/or the frame rate of the target video clip, and/or the video encapsulation format of the target video clip, and/or the code stream rate of the target video clip, and/or the resolution of the target video clip;
and/or,
the processing strategy of the display driver for the target video segment includes: screen refresh frequency, and/or screen resolution.
In one possible implementation manner, the first model is obtained through pre-training, and the training method includes:
acquiring training samples labeled with processing strategies; each training sample comprises: a sample of each of the at least one type of environmental data and a sample of each of the at least one type of data of the video clip;
and inputting the training sample into a preset model for training to obtain the first model.
In one possible implementation, the first model is an AI-aware neural network composed of an AI-aware neural network accelerator and a recurrent neural network.
In a second aspect, an embodiment of the present application provides a video processing method, applied to an electronic device, including:
acquiring at least one environmental data; the at least one environmental data includes: light data, and/or sound data, and/or air temperature data, and/or air pressure data;
determining a first environment type from the at least one environment data;
determining a first processing strategy of the target video clip according to the first environment type; the target video clip is a video clip to be played by the electronic device;
And playing the target video clip according to the first processing strategy.
In one possible implementation manner, if one type of environment data is acquired, the determining the first environment type according to the at least one environment data includes:
for the environmental data, determining a second environment type corresponding to the acquired environmental data as the first environment type according to a corresponding relation between a data interval preset for the environmental data and the second environment type;
or,
if at least two environmental data are acquired, the determining a first environmental type according to the at least one environmental data includes:
for each type of environment data, determining a second environment type corresponding to the acquired environment data according to a corresponding relation between a data interval preset for the environment data and the second environment type;
and determining the first environment type according to the second environment type corresponding to each acquired environment data.
In one possible implementation manner, the determining the first environment type according to the second environment type corresponding to each acquired environment data includes:
calculating a first value according to the preset weight of each acquired environmental data and the value corresponding to the second environmental type;
And determining a third environment type corresponding to the first value according to the corresponding relation between the preset value interval and the third environment type, and obtaining the first environment type.
In another possible implementation, the first environment type may also be determined by inputting the acquired environment data into a pre-trained second model. In this case, the determining includes:
and inputting the at least one environmental data into a preset second model to obtain the environmental type output by the second model, and obtaining the first environmental type.
Alternatively, the second model may be an AI-aware neural network composed of an AI-aware neural network accelerator, and a Recurrent Neural Network (RNN).
The specific training method of the second model may include: obtaining training samples, each training sample comprising: samples of each of the at least two environmental data, and an environmental type of the training sample; and inputting the training sample into a preset model for training to obtain a second model.
In one possible implementation manner, the determining the first processing policy of the target video according to the first environment type includes:
determining, according to a preset correspondence between environment types and processing strategies, the processing strategy corresponding to the first environment type as the first processing strategy; the first processing strategy comprises: a processing strategy of the image processor for the target video segment, and/or a processing strategy of the data processor for the target video segment, and/or a processing strategy of the display driver for the target video segment.
In one possible implementation, the first processing policy includes a processing strategy of the data processor for the target video segment, and playing the target video clip according to the first processing policy includes:
decoding, for the target video segment, the code stream of the target video segment according to the data processor's processing strategy for the target video segment in the first processing strategy, to obtain decoded data;
and/or,
the first processing strategy includes a processing strategy of the image processor for the target video segment, and playing the target video clip according to the first processing policy includes:
rendering, for each video frame in the target video segment, the video frame according to the image processor's processing strategy for the target video segment in the first processing strategy;
and/or,
the first processing strategy includes a processing strategy of the display driver for the target video segment, and playing the target video segment according to the first processing strategy includes:
displaying, for each video frame in the target video segment, the video frame according to the display driver's processing strategy for the target video segment in the first processing strategy.
In one possible implementation, the processing strategy of the image processor for the target video segment includes: image rendering resolution, and/or the on-off state of an image sharpening algorithm, and/or the on-off state of a contrast enhancement algorithm;
and/or,
the processing strategy of the data processor for the target video segment includes: decoding accuracy, and/or frame skipping, and/or the frame rate of the target video clip, and/or the video encapsulation format of the target video clip, and/or the code stream rate of the target video clip, and/or the resolution of the target video clip;
and/or,
the processing strategy of the display driver for the target video segment includes: screen refresh frequency, and/or screen resolution.
In a third aspect, embodiments of the present application provide a portable electronic device, including:
a first acquisition unit configured to acquire at least one environmental data; the at least one environmental data includes: light data, and/or sound data, and/or air temperature data, and/or air pressure data;
a second obtaining unit, configured to obtain at least one data of a target video segment, where the at least one data of the target video segment includes: sound data, and/or image data, and/or subtitle data; the target video clip is a video clip to be played by the electronic device;
the strategy decision unit is used for inputting the at least one environmental data and the at least one data of the target video segment into a first model to obtain a first processing strategy of the target video segment, and the first model is used for analyzing the processing strategy of the video segment;
and the playing unit is used for playing the target video clip according to the first processing strategy.
In a fourth aspect, embodiments of the present application provide a portable electronic device, including:
An acquisition unit configured to acquire at least one environmental data; the at least one environmental data includes: light data, and/or sound data, and/or air temperature data, and/or air pressure data;
a first determining unit configured to determine a first environment type according to the at least one environment data;
the second determining unit is used for determining a first processing strategy of the target video clip according to the first environment type; the target video clip is a video clip to be played by the electronic device;
and the playing unit is used for playing the target video clip according to the first processing strategy.
In a fifth aspect, embodiments of the present application provide an electronic device, including:
a display screen; one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory, the one or more computer programs comprising instructions, which when executed by the device, cause the device to perform the method of any of the first or second aspects.
In a sixth aspect, embodiments of the present application provide a computer readable storage medium having a computer program stored therein, which when run on a computer, causes the computer to perform the method of any one of the first or second aspects.
In a seventh aspect, the present application provides a computer program for performing the method of the first aspect when the computer program is executed by a computer.
In one possible design, the program in the seventh aspect may be stored in whole or in part on a storage medium packaged with the processor, or in part or in whole on a memory not packaged with the processor.
In the video processing method described above, at least one type of environmental data is acquired (light data, and/or sound data, and/or air temperature data, and/or air pressure data), and at least one type of data of a target video clip is obtained (sound data, and/or image data, and/or subtitle data), the target video clip being the video clip to be played by the electronic device. The environmental data and the data of the target video clip are input into a first model to obtain a first processing strategy for the target video clip, the first model being used to analyze processing strategies for video clips, and the target video clip is played according to the first processing strategy. The processing strategy can thus be adjusted automatically during video playback according to the environmental data and the data of the target video clip, so that the power consumed by the electronic device for playing video is more reasonable and power is saved.
Drawings
FIG. 1 is a schematic structural diagram of one embodiment of an electronic device of the present application;
FIG. 2 is a flowchart of one embodiment of a video playback method of the present application;
FIG. 3 is a flowchart of another embodiment of a video playing method according to the present application;
FIG. 4A is a flowchart of a video playing method according to another embodiment of the present application;
FIG. 4B is a schematic diagram of a training method of the first model of the present application;
FIG. 5 is a flowchart of a video playing method according to another embodiment of the present application;
FIG. 6 is a schematic structural diagram of an embodiment of a video playing device according to the present application;
fig. 7 is a schematic structural diagram of another embodiment of a video playing device according to the present application;
FIG. 8 is a schematic diagram of a prior art video playback flow;
fig. 9 is a schematic diagram of a video playing flow according to an embodiment of the present application.
Detailed Description
The terminology used in the description section of the present application is for the purpose of describing particular embodiments of the present application only and is not intended to be limiting of the present application.
First, terms involved in the embodiments of the present application will be described by way of example, but not limitation:
artificial intelligence (Artificial Intelligence, AI) is a new technical science to study, develop theories, methods, techniques and application systems for simulating, extending and expanding human intelligence.
A System on Chip (SoC), also referred to as a System on Chip, means that it is a product, which is an integrated circuit with a dedicated target, containing the entire System and having embedded software. It is also a technique to achieve the whole process from determining the system functions, to software/hardware partitioning, and to complete the design.
A digital signal processor (Digital Signal Processor, DSP), a microprocessor particularly suited for performing digital signal processing operations, has a major application in rapidly implementing various digital signal processing algorithms in real time.
AI cultivation: AI cultivation in the field of artificial intelligence utilizes a large number of accelerators such as graphics processors (Graphics Processing Unit, GPU) or central processing units (Central Processing Unit, CPU) to provide computing power, find a suitable neural network architecture and calculate to obtain the optimal structural parameters of the neural network, so that the network can complete specific work. In popular terms, a machine is "fed" with a large amount of data that it learns to identify and distinguish objects.
The video playing method of the embodiments of the application can be applied to electronic equipment such as mobile phones, tablet computers (Pads), personal computers (PCs), and the like. The method can serve as a function in a video playing application (App) on the electronic device, or as a video playback control function provided by the operating system of the electronic device. The user can choose whether to enable this function; once enabled, the video playing method can be triggered and executed, so that the electronic device consumes power reasonably during video playback and thereby saves power.
First, a possible implementation structure of the electronic device of the present application will be described. As shown in fig. 1, the electronic device 100 may include: processor 110, memory 120, display 130. The electronic device 100 may further include: a light sensor 140, a microphone 150, a temperature sensor 160, an air pressure sensor 170, etc. The above structures may communicate with each other via an internal connection path to transfer control and/or data signals, the memory 120 for storing a computer program, and the processor 110 for calling and running the computer program from the memory 120.
The processor 110 and the memory 120 may be combined into a single processing device or, more commonly, be separate components; the processor 110 executes the program code stored in the memory 120 to implement the functions described above. The memory 120 may also be integrated in the processor 110 or be separate from the processor 110.
It should be appreciated that the processor 110 in the electronic device 100 shown in fig. 1 may be a system-on-a-chip SoC, and the processor 110 may include an application processor (application processor, AP), a modem processor, a graphics processor (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), a controller, a video codec, a digital signal processor (digital signal processor, DSP), a baseband processor, and/or a neural network processor (neural-network processing unit, NPU), etc. Wherein the different processing units may be separate devices or may be integrated in one or more processors.
The electronic device 100 implements display functions through a GPU, a display screen 130, an application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display 130 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The digital signal processor is used for processing digital signals, and can process other digital signals besides digital image signals.
The light sensor 140 is used to sense the ambient light level.
The temperature sensor 160 is used to detect the ambient temperature.
The air pressure sensor 170 is used to measure air pressure.
The microphone 150 is used for collecting sound signals and converting the sound signals into electrical signals.
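By way of illustration only (not part of the original disclosure), the following Python sketch shows how the acquisition of the at least one environmental data in step 201 could be organized around the sensors of fig. 1. The read_* helpers and their return values are hypothetical stand-ins for platform sensor APIs, which vary by device.

```python
# Editorial sketch (assumed helper names): gathering the "at least one
# environmental data" from the sensors shown in fig. 1.

def read_light_sensor() -> float:      # light sensor 140
    return 820.0                       # placeholder reading

def read_microphone_db() -> float:     # microphone 150
    return 64.0

def read_temperature_c() -> float:     # temperature sensor 160
    return 31.5

def read_pressure_hpa() -> float:      # air pressure sensor 170
    return 1002.0

def acquire_environmental_data() -> dict:
    """Any subset of these four readings satisfies 'at least one'."""
    return {
        "light": read_light_sensor(),
        "sound": read_microphone_db(),
        "temperature": read_temperature_c(),
        "pressure": read_pressure_hpa(),
    }

print(acquire_environmental_data())
```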
It should be understood that the electronic device 100 shown in fig. 1 is capable of implementing various processes of the methods provided by embodiments of the present application. The operations and/or functions of the respective modules in the electronic device 100 are respectively for implementing the respective flows in the above-described method embodiments. Reference may be made specifically to the descriptions in the embodiments of the methods of the present application, and detailed descriptions are omitted here as appropriate to avoid redundancy.
The video playing method of the present application is described in detail below in conjunction with the above electronic device structure.
A video clip in the embodiments of the present application may be one of a plurality of video clips obtained by dividing a video file by duration; if a video file is not divided into clips, the whole file may be regarded as a single video clip. The length of a video clip is not limited in this embodiment.
Fig. 2 is a flowchart of one embodiment of a video playing method of the present application. As shown in fig. 2, the method may include:
step 201: the electronic equipment acquires at least one environmental data; the at least one environmental data includes: light data, and/or sound data, and/or air temperature data, and/or air pressure data.
The light data can be detected by a light sensor in the electronic equipment; the sound data may be detected by a microphone in the electronic device; the air temperature data can be detected by a temperature sensor in the electronic equipment; the barometric pressure data may be detected by a barometric pressure sensor in the electronic device.
Step 202: the electronic device determines a first environment type based on the at least one environment data.
The first environment type is used to describe the type of environment in which the electronic device is located.
The implementation of this step is described below in two possible scenarios.
Scene one: in step 201, environmental data is obtained, and accordingly, in this step, a first environmental type is determined according to the environmental data, where this step may include:
and determining the environment type corresponding to the data interval where the environmental data is located as a first environment type according to the corresponding relation between the data interval preset for the environmental data and the environment type.
The manner of dividing the environment types is not limited in this embodiment. For example, the environment types may be divided into indoor and outdoor; alternatively, by degree of quietness, into a noisy environment and a quiet environment; or into a noisy environment, a quieter environment, and a quiet environment; and so on.
For example:
if the environmental data is light data, the environment type corresponding to a first light intensity interval (0, a) may be preset as indoor and that corresponding to a second light intensity interval (a, +∞) as outdoor, where the unit of light intensity may be candela (cd). If the light intensity acquired by the electronic device falls in the first light intensity interval, the first environment type is determined to be indoor; otherwise, it is determined to be outdoor.
If the environmental data is sound data, the environment type corresponding to a first volume interval (0, B) may be preset as indoor and that corresponding to a second volume interval (B, +∞) as outdoor, where the unit of volume may be decibels (dB). If the volume acquired by the electronic device falls in the first volume interval, the first environment type is determined to be indoor; otherwise, outdoor.
If the environmental data is air temperature data, a first temperature interval (-∞, c) may be preset as outdoor and a second temperature interval (c, +∞) as indoor, where the unit of temperature may be degrees Celsius (°C). If the temperature acquired by the electronic device falls in the first temperature interval, the first environment type is determined to be outdoor; otherwise, indoor.
If the environmental data is air pressure data, a first air pressure interval (-∞, c) may be preset as outdoor and a second air pressure interval (c, +∞) as indoor, where the unit of air pressure may be hectopascals (hPa). If the air pressure acquired by the electronic device falls in the first air pressure interval, the first environment type is determined to be outdoor; otherwise, indoor.
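As a loose illustration of scene one (not part of the original disclosure), the following Python sketch maps a single sensor reading to an environment type through preset intervals. The threshold constants and the indoor/outdoor labels are assumptions for the example.

```python
# Editorial sketch of the scene-one interval lookup; thresholds hypothetical.

LIGHT_THRESHOLD_A = 500.0   # assumed boundary "a" between indoor and outdoor

def classify_by_light(light_intensity: float) -> str:
    """Interval (0, a) -> indoor; interval (a, +inf) -> outdoor."""
    return "indoor" if light_intensity <= LIGHT_THRESHOLD_A else "outdoor"

SOUND_THRESHOLD_B = 60.0    # assumed boundary "B" in decibels

def classify_by_sound(volume_db: float) -> str:
    """Interval (0, B) -> indoor; interval (B, +inf) -> outdoor."""
    return "indoor" if volume_db <= SOUND_THRESHOLD_B else "outdoor"

print(classify_by_light(120.0))   # -> indoor
print(classify_by_sound(75.0))    # -> outdoor
```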
Scene II: in step 201, at least two environmental data are acquired, and accordingly, in this step, a first environmental type is determined according to the at least two environmental data, where this step may include:
for each type of environment data, determining a second environment type corresponding to the environment data according to a corresponding relation between a data interval preset for the environment data and the second environment type;
and determining the first environment type according to the second environment type corresponding to the at least two environment data.
Determining the second environment type corresponding to each type of environment data in this scene may refer to the corresponding description in scene one and is not repeated here. The manner of dividing the second environment type may be the same or different for each type of environment data. An example where the divisions are the same: assuming the two types of environment data are light data and sound data, the second environment type for the light data is divided into indoor and outdoor, and the second environment type for the sound data is also divided into indoor and outdoor. An example where the divisions differ: assuming the two types of environment data are light data and sound data, the second environment type for the light data is divided into indoor and outdoor, while the second environment type for the sound data is divided into noisy and quiet.
The determining the first environment type according to the second environment type corresponding to the at least two environment data may include:
calculating a first numerical value according to preset weight of each environmental data and a second environmental type corresponding to the environmental data;
and determining a third environment type corresponding to the first numerical value according to the corresponding relation between the preset numerical value interval and the third environment type, so as to obtain the first environment type.
For example, different values may be preset for different second environment types, and a weight may be preset for each type of environment data. If environmental data 1 and environmental data 2 are acquired in step 201, the first value A may be calculated by the formula A = a1·B1 + a2·B2, where a1 is the weight of environmental data 1, B1 is the value corresponding to the second environment type of environmental data 1, a2 is the weight of environmental data 2, and B2 is the value corresponding to the second environment type of environmental data 2. The values corresponding to the second environment types may be the same or different, and neither the specific values nor the specific weights are limited in the embodiments of the present application. After the first value A is calculated, the preset correspondence between value intervals and third environment types is searched to obtain the third environment type corresponding to the interval in which A falls, and this third environment type is taken as the first environment type. The division of third environment types may be the same as or different from that of the second environment types; the embodiments of the present application are not limited in this respect.
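A minimal Python sketch of this weighted fusion follows (not part of the original disclosure); the weights, type values, and value intervals are all assumptions for the example.

```python
# Editorial sketch of the fusion A = a1*B1 + a2*B2 and the lookup of the
# third environment type.

TYPE_VALUES = {"indoor": 0.0, "outdoor": 1.0}   # value per second environment type
WEIGHTS = {"light": 0.6, "sound": 0.4}          # preset weight per environmental data

# Preset correspondence between value intervals and third environment types.
VALUE_INTERVALS = [((0.0, 0.5), "indoor"), ((0.5, 1.0), "outdoor")]

def fuse_environment_types(second_types: dict) -> str:
    """second_types maps data name -> second environment type."""
    a = sum(WEIGHTS[name] * TYPE_VALUES[t] for name, t in second_types.items())
    for (low, high), third_type in VALUE_INTERVALS:
        if low <= a <= high:
            return third_type
    return "indoor"   # fallback for out-of-range values

print(fuse_environment_types({"light": "outdoor", "sound": "indoor"}))  # -> outdoor
```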
In another possible implementation, the step may further determine the first environment type by inputting the at least one environmental data into a pre-trained second model. At this time, the step may include:
and inputting the at least one environmental data into a preset second model to obtain the environmental type output by the second model, and obtaining the first environmental type.
The second model is used for analyzing the environment type corresponding to the at least two types of environment data. The second model can be obtained through AI cultivation (AI model training). Specifically, a large number of training samples, for example more than several thousand, can be fed to an AI machine, which computes the optimal network architecture and neural network structure parameters for calculating the environment type from the relational features among the sample data, thereby obtaining the second model.
Alternatively, the second model may be an AI-aware neural network composed of an AI-aware neural network accelerator, and a Recurrent Neural Network (RNN).
The specific training method of the second model may include: obtaining training samples, each training sample comprising: samples of each of the at least two environmental data, and an environmental type of the training sample; and inputting the training sample into a preset model for training to obtain a second model.
For example, assuming that the electronic device acquires only one type of environment data and that data is sound data, the following AI basic model may be used as the initial model of the second model: P(X|V) = γ1·X1, where X1 is the sound data in the environment, γ1 is a weighting parameter to be calculated during AI cultivation, and P(X|V) is the environment type; if the environmental data is light data, X1 in the initial model represents the light data in the environment instead. Similarly, the initial model may be extended to the case where the electronic device acquires at least two types of environmental data. Taking two types, sound data and light data, as an example, the following AI basic model may be used as the initial model of the second model: P(X|V) = γ1·X1 + γ2·X2, where X1 is the sound data in the environment, X2 is the light data in the environment, γ1 and γ2 are weighting parameters to be calculated during AI cultivation, and P(X|V) is the environment type.
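As an editorial illustration of fitting the weighting parameters γ1 and γ2 (standing in for the AI cultivation step, which the text describes only at a high level), the following self-contained Python sketch uses plain gradient descent on made-up labeled samples.

```python
# Editorial stand-in for learning gamma1 and gamma2 in
# P(X|V) = gamma1*X1 + gamma2*X2 (labels: 0 = indoor, 1 = outdoor).

samples = [
    ((0.2, 0.1), 0.0), ((0.9, 0.8), 1.0),
    ((0.3, 0.2), 0.0), ((0.7, 0.9), 1.0),
]
g1, g2, lr = 0.0, 0.0, 0.1

for _ in range(200):
    for (x1, x2), label in samples:
        err = g1 * x1 + g2 * x2 - label   # prediction error
        g1 -= lr * err * x1               # gradient step on squared error
        g2 -= lr * err * x2

print(f"learned weights: gamma1={g1:.2f}, gamma2={g2:.2f}")
```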
Step 203: the electronic equipment determines a first processing strategy of the target video clip according to the first environment type; the target video clip is a video clip to be played by the electronic device.
The target video clip and the video clip currently being played by the electronic device may be video clips of the same video file or video clips of different video files. For example, if the video file 1 is currently being played, the video file 1 includes the video clip 11 and the video clip 12, and the next preset video file, such as the video file 2, is automatically played after the video file 1 is played, and the video file 2 includes the video clip 21 and the video clip 22, then the target video clip may be the video clip 12 if the video clip currently being played is the video clip 11, and the target video clip may be the video clip 21 if the video clip currently being played is the video clip 12. It should be noted that, the foregoing examples take the example that the target video clip is the next video clip of the video clip currently being played as an example, and are not intended to limit the positional relationship between the target video clip and the video clip currently being played, for example, the target video clip may also be the second video clip after the video clip currently being played, and the like, which is not limited in this application.
The correspondence between environment types and processing policies can be preset. In this step, that correspondence is searched according to the first environment type to obtain the processing policy corresponding to the first environment type, which is used as the first processing policy.
When setting the processing policy corresponding to an environment type, the setting may be based on whether the environment is suitable for watching video and on the user's degree of attention to the video being played in that environment. For example, if the environment types include indoor and outdoor: outdoors, the light is usually too strong and the user's attention to the video being played on the electronic device is generally low, so the processing policy corresponding to outdoor may lean toward saving the power of the electronic device; indoors, the light is softer and the user's attention to the video is generally higher, so the processing policy corresponding to indoor may lean toward giving the played video a better visual effect for the user.
Wherein the first processing policy may include: a processing strategy of the image processor for the target video segment, and/or a processing strategy of the data processor for the target video segment, and/or a processing strategy of the display driver for the target video segment (a data-structure sketch of these sub-strategies follows the lists below).
The processing strategies of the image processor may include, but are not limited to: image rendering resolution, and/or the on-off state of an image sharpening algorithm, and/or the on-off state of a contrast enhancement algorithm;
the processing strategies of the data processor may include, but are not limited to: decoding accuracy, and/or frame skipping, and/or the frame rate of the target video clip, and/or the video encapsulation format of the target video clip, and/or the code stream rate of the target video clip, and/or the resolution of the target video clip;
the processing strategies of the display driver may include, but are not limited to: screen refresh frequency, and/or screen resolution.
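The sub-strategies listed above can be pictured as a simple data structure. The following Python sketch is an editorial illustration; the class and field names are assumptions, not patent terminology.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class ImageProcessorStrategy:
    render_resolution: Tuple[int, int]   # image rendering resolution (w, h)
    sharpening_on: bool                  # on-off state of the image sharpening algorithm
    contrast_enhancement_on: bool        # on-off state of the contrast enhancement algorithm

@dataclass
class DataProcessorStrategy:
    decoding_accuracy: str               # e.g. "high" / "low"
    skip_frames: int                     # frame-skipping parameter (0 = no skipping)
    frame_rate: int
    encapsulation_format: str            # e.g. "H.264", "MPEG2"
    resolution: Tuple[int, int]

@dataclass
class DisplayDriverStrategy:
    refresh_hz: int                      # screen refresh frequency
    screen_resolution: Tuple[int, int]

@dataclass
class ProcessingStrategy:                # the "first processing strategy"
    image: Optional[ImageProcessorStrategy] = None
    data: Optional[DataProcessorStrategy] = None
    display: Optional[DisplayDriverStrategy] = None
```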
The image processor may be the GPU described in fig. 1 or an accelerated processor (Accelerated Processing Units, APU); the data processor may be the DSP described in fig. 1. The high-power APU or GPU is in a deep sleep state most of the time, while the low-power SoC with AI processing capability stays in a listening or monitoring state. When the AI accelerator running on the low-power SoC detects a wake-up element, the SoC wakes up the APU, and the APU then performs the more complex speech recognition algorithms and the corresponding operations, such as playing music or making a voice call. In such a system, the intelligent SoC and the APU each have their own responsibilities, and since the APU is in a deep sleep state most of the time, the overall power consumption can be kept at a low level.
Wherein the image rendering resolution refers to the resolution of the video images output by the image processor. In general, the higher the image rendering resolution, the more power the electronic device consumes to play the target video clip; conversely, the lower the image rendering resolution, the less power the electronic device consumes.
An image sharpening algorithm is an image processing method used in the image processor to make the edges of a video image sharper. Generally, when the image sharpening algorithm is on, the electronic device consumes more power to play the target video clip; when it is off, less power is consumed.
The contrast enhancement algorithm is an image processing method in the image processor for adjusting the gray scale of pixels in a video image, which can improve the visual effect of the video image. In general, when the contrast enhancement algorithm is on, the electronic device consumes more power to play the target video clip; when it is off, less power is consumed.
The decoding accuracy may be a decoding accuracy of the target video segment decoded by the data processor. Generally, the higher the decoding accuracy, the larger the amount of power consumed by the electronic device to play the target video clip, and the lower the decoding accuracy, the smaller the amount of power consumed by the electronic device to play the target video clip.
Frame skipping is a way to reduce the number of video frames displayed per second; the number of frames skipped each time can be identified by a set value. For example, a frame-skipping parameter value of 1 indicates that 1 consecutive frame is skipped each time, and a value of 2 indicates that 2 consecutive frames are skipped. The larger the number of skipped frames, the less power the electronic device consumes to play the target video clip; the smaller the number of skipped frames, the more power is consumed.
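A tiny Python sketch of this frame-skipping rule (editorial illustration; the helper name is hypothetical):

```python
# With parameter k, one frame is kept and the next k consecutive frames
# are skipped, repeatedly.

def apply_frame_skipping(frames, k: int):
    return [f for i, f in enumerate(frames) if i % (k + 1) == 0]

print(apply_frame_skipping(list(range(10)), 1))   # -> [0, 2, 4, 6, 8]
print(apply_frame_skipping(list(range(10)), 2))   # -> [0, 3, 6, 9]
```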
The screen refresh frequency refers to the number of times the screen is refreshed in a unit time, which may be, for example, 1 second. In general, the higher the screen refresh frequency, the larger the amount of power consumed by the electronic device to play the target video clip, and the lower the screen refresh frequency, the smaller the amount of power consumed by the electronic device to play the target video clip.
The screen resolution refers to the number of pixels displayed in the vertical and horizontal directions of the screen, in px. The higher the screen resolution setting, the clearer the displayed video image; otherwise, the more blurred it is. In general, the higher the screen resolution, the more power the electronic device consumes to play the target video clip; the lower the screen resolution, the less power is consumed.
The frame rate of the video clip may be set to different values according to different levels of the decoding code rate. Continuing the foregoing example, a relatively high frame rate, with relatively high power consumption, may be set for the environment type "indoor", and a relatively low frame rate, with relatively low power consumption, may be set for the environment type "outdoor".
Similar to the frame rate, the video encapsulation format may be set to different values according to different levels of the decoding code rate. Continuing the foregoing example, a video encapsulation format with a relatively good visual effect may be set for the environment type "indoor" and one with a relatively poor visual effect for the environment type "outdoor". Taking the three currently mainstream encapsulation formats MPEG2, VC-1, and H.264 as examples, the general visual effect is ordered as H.264 > VC-1 > MPEG2; the video encapsulation format may be set to H.264 for "indoor", with relatively high power consumption, and to MPEG2 for "outdoor", with relatively low power consumption.
Similar to the frame rate and the video encapsulation format, the code stream rate can be set to different values according to different levels of the decoding code rate: the higher the code stream rate, the higher the power consumption of the electronic device; the lower the rate, the lower the consumption.
Likewise, the resolution may be set to different values according to different levels of the decoding code rate: the higher the resolution, the higher the decoding code rate; the lower the resolution, the lower the code rate. Continuing the foregoing example, a relatively high resolution, such as 3840 x 2048, with relatively high power consumption, may be set for the environment type "indoor", and a relatively low resolution, such as 1280 x 720, with relatively low power consumption, for the environment type "outdoor".
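Putting the indoor/outdoor examples above together, the preset correspondence of step 203 can be sketched as a lookup table. This editorial illustration reuses the dataclass sketch given earlier; the concrete numbers follow the examples in the text where given (3840 x 2048 vs. 1280 x 720) and are otherwise assumptions.

```python
STRATEGY_TABLE = {
    "indoor": ProcessingStrategy(
        image=ImageProcessorStrategy((3840, 2048), True, True),
        data=DataProcessorStrategy("high", 0, 60, "H.264", (3840, 2048)),
        display=DisplayDriverStrategy(120, (3840, 2048)),
    ),
    "outdoor": ProcessingStrategy(
        image=ImageProcessorStrategy((1280, 720), False, False),
        data=DataProcessorStrategy("low", 1, 24, "MPEG2", (1280, 720)),
        display=DisplayDriverStrategy(60, (1280, 720)),
    ),
}

def first_processing_strategy(first_environment_type: str) -> ProcessingStrategy:
    return STRATEGY_TABLE[first_environment_type]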
Step 204: and playing the target video clip according to the first processing strategy.
The step may include:
and receiving a playing instruction of the target video segment, decoding a code stream of the target video segment, rendering each frame of video frame of the target video segment according to the decoded data, and displaying each frame of video frame of the target video segment on a screen in turn.
If the first processing strategy comprises a processing strategy of the data processor for the target video segment, the electronic device can decode the code stream of the target video segment through the data processor according to that strategy.
If the first processing strategy comprises a processing strategy of the image processor for the target video segment, the electronic device can render each video frame in the target video segment through the image processor according to that strategy.
If the first processing strategy comprises a processing strategy of the display driver for the target video segment, the electronic device can display each video frame in the target video segment through the display driver according to that strategy.
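The three branches above amount to applying each sub-strategy at its own pipeline stage. The following Python sketch (editorial illustration, reusing the ProcessingStrategy and STRATEGY_TABLE sketches above) shows the decode, render, and display order; the stage functions are hypothetical stand-ins for the data processor, image processor, and display driver.

```python
def decode_stream(clip_stream, data_strategy):
    """Stand-in decoder: yields frames while honoring the skip parameter."""
    for i, frame in enumerate(clip_stream):
        if i % (data_strategy.skip_frames + 1) == 0:
            yield frame

def render_frame(frame, image_strategy):
    """Stand-in renderer: tags the frame with the image rendering resolution."""
    return (frame, image_strategy.render_resolution)

def display_frame(rendered, display_strategy):
    """Stand-in display driver: would push the frame at the set refresh rate."""
    print(f"show {rendered} at {display_strategy.refresh_hz} Hz")

def play_target_clip(clip_stream, strategy: ProcessingStrategy):
    for frame in decode_stream(clip_stream, strategy.data):   # data processor decodes
        rendered = render_frame(frame, strategy.image)        # image processor renders
        display_frame(rendered, strategy.display)             # display driver shows

play_target_clip(range(5), STRATEGY_TABLE["outdoor"])
```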
In the method shown in fig. 2, the processing strategy for the target video clip is adjusted according to the environmental data. A processing strategy with a relatively better visual effect can thus be provided in environments where the user pays close attention to the video on the electronic device, and a strategy that saves more power in environments where the user's attention is low. By dynamically adjusting the processing strategy of video clips, the electronic device can meet the user's viewing needs while using its power more reasonably, thereby saving power.
Optionally, referring to fig. 3, before step 201, the method may further include:
step 301: the electronic equipment receives a playing instruction of the target video file, divides the target video file into video clips and determines the target video clips.
The user can instruct to play the video file by selecting one video file and selecting a play control for the video file, and accordingly, the electronic device can receive a play instruction for the target video file, namely, the video file selected by the user.
When the target video file is divided, the lengths of the video segments obtained by dividing may be the same or different, and the number of the video segments divided by the target video file is not limited in the embodiment of the present application, and may be any natural number. It should be noted that, in a general target video file, the smallest unit may be a video frame, and the smallest video clip may be one video frame.
In step 301, the video clips obtained by dividing the target video file may be used as target video clips in turn according to the playing order. Optionally, for timeliness of playback, a certain number of leading video clips of the target video file can be played directly according to a preset processing strategy, and the clips after them are used as target video clips in turn. The specific value of this number is not limited in the embodiments of the present application; it relates to the length of the video clips, the processing speed of the electronic device, and so on, and may be set so that the electronic device can provide a smooth playing experience for the user. In addition, the preset processing strategy used for these leading clips may be one that trades visual effect for power saving, or one that favors a better viewing effect for the user; the embodiments of the present application are not limited in this respect.
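A short Python sketch of this division and of playing the leading clips with a preset strategy (editorial illustration; the clip duration and the count of preset-strategy clips are assumptions):

```python
def split_into_clips(total_duration_s: float, clip_len_s: float = 10.0):
    """Return (start, end) second ranges; the tail clip may be shorter."""
    clips, start = [], 0.0
    while start < total_duration_s:
        end = min(start + clip_len_s, total_duration_s)
        clips.append((start, end))
        start = end
    return clips

PRESET_CLIP_COUNT = 2   # leading clips played with the preset strategy

clips = split_into_clips(47.0)
preset = clips[:PRESET_CLIP_COUNT]    # played immediately, preset strategy
adaptive = clips[PRESET_CLIP_COUNT:]  # each becomes the target video clip in turn
print(preset, adaptive)
```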
Fig. 4A is a flowchart of another embodiment of a video playing method of the present application, as shown in fig. 4A, the method may include:
step 401: the electronic equipment acquires at least one environmental data; the at least one environmental data includes: light data, and/or sound data, and/or air temperature data, and/or air pressure data.
The implementation of this step may refer to the description in step 201, and is not described here in detail.
Step 402: the electronic device obtains at least one data of a target video clip, the at least one data of the target video clip comprising: sound data, and/or image data, and/or subtitle data; the target video clip is a video clip to be played by the electronic device.
The order of execution between step 401 and step 402 is not limited.
The implementation of the target video clip in this step may refer to the corresponding description in step 203, which is not repeated here.
Step 403: the electronic device inputs at least one environmental data and at least one data of the target video segment into a first model, a first processing strategy of the target video segment is obtained, and the first model is used for outputting the processing strategy of the video segment.
Alternatively, the first model may be obtained through AI cultivation (AI model training). Specifically, a large number of training samples, for example more than several thousand, can be fed to an AI machine, which computes the optimal network architecture and neural network structure parameters for calculating a processing strategy for the target video segment from the relational features among the sample data, thereby obtaining the first model.
Alternatively, the first model may be an AI-aware neural network composed of an AI-aware neural network accelerator, and a Recurrent Neural Network (RNN).
The specific training method of the first model may include: acquiring training samples labeled with processing strategies, each training sample comprising a sample of each of the at least one type of environmental data and a sample of each of the at least one type of data of the video clip; and inputting the training samples into a preset model for training to obtain the first model. Fig. 4B shows a schematic diagram of this training method, taking an AI basic model as the preset model.
For example, assume the electronic device acquires only one type of environment data, namely sound data, and acquires image data and sound data of the target video clip. The following AI basic model may be used as the initial model of the first model: P(X|V) = α1·X1 + α2·Y1 + α3·Y2 = β1·Z1 + β2·Z2, where X1 is the sound data in the environment, Y1 is the sound data of the target video clip, Y2 is the image data of the target video clip, Z1 represents the processing strategy of the image processor, Z2 represents the processing strategy of the display driver, α1, α2, α3, β1, and β2 are weighting parameters to be calculated during AI cultivation, and P(X|V) is the first processing strategy of the target video clip. If the environmental data is light data, X1 in the initial model represents the light data in the environment instead. Similarly, the initial model may be extended to other combinations of environmental data and target video clip data, which are not listed here one by one.
Further, the sound data in the initial model may be subdivided into volume, timbre, frequency, spatial distribution of sound, and the like; weighting parameters may then be set for each kind of subdivided sound data by analogy with the foregoing example and learned through training, making the first model more accurate in distinguishing environments. Similarly, if the initial model includes light data, the light data may be subdivided into light intensity, light band, illuminance change, and the like, with a weighting parameter set for each kind of subdivided light data and learned through sample training, again making the first model more accurate in distinguishing environments.
When marking a processing strategy for a training sample, the degree of attention a user would pay to the video being played can be estimated from whether the scene corresponding to the sample is suitable for watching video. For example, suppose the environment types include indoor and outdoor and the data of the video clip is image data. Outdoors in overly strong light, conditions are poor for watching video; if the clip shows a landscape, the user's attention to the playing video is relatively low, so the processing strategy marked for that sample may lean toward saving the electric quantity of the electronic device. Indoors in soft light, conditions are good for watching video; if the clip shows a character fight scene, the user's attention is relatively high, and the processing strategy marked for that sample may lean toward a better visual effect for the user.
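As a hedged sketch of how such labeled samples might be used, the example below fits the weighting parameters of the linear initial model by least squares; the feature layout, the sample values, and the choice of least squares are illustrative assumptions rather than the training procedure prescribed by the embodiments:

```python
# A sketch of fitting the weighting parameters from labeled training samples;
# the feature layout, sample values, and least-squares fit are assumptions.
import numpy as np

# Each row is one training sample: [X1 env sound, Y1 clip sound, Y2 clip image].
samples = np.array([
    [0.9, 0.2, 0.1],   # strong outdoor noise, landscape clip -> low attention
    [0.1, 0.8, 0.9],   # quiet indoor scene, fight-scene clip -> high attention
])
# Marked strategy scores P: small leans to power saving, large to visual quality.
labels = np.array([0.2, 0.9])

# Fit α1..α3 by ordinary least squares (the AI training step, greatly reduced).
alphas, *_ = np.linalg.lstsq(samples, labels, rcond=None)
print("learned weighting parameters:", alphas)
```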
Step 404: and the electronic equipment plays the target video clip according to the first processing strategy.
The implementation of this step may refer to the description in step 201, and is not described here in detail.
Optionally, referring to fig. 5, before step 401, the method may further include:
Step 501: the electronic device receives a playing instruction for the target video file, divides the target video file into video segments, and determines the target video segment.
The implementation of this step may refer to the description in step 301, and is not described here in detail.
The methods shown in fig. 4A and fig. 5 dynamically adjust the processing strategy of the video according to the environmental data and the data contained in the target video clip, so that the strategy is adjusted automatically during playback, the electric quantity the electronic device consumes for playing video is spent more reasonably, and the purpose of saving power is achieved.
It is to be understood that some or all of the steps or operations in the above embodiments are merely examples; embodiments of the present application may also perform other operations or variations of the operations. Furthermore, the steps may be performed in an order different from that presented in the above embodiments, and possibly not all of the operations in the above embodiments need to be performed.
Fig. 6 is a block diagram of an embodiment of a video playing device of the present application. As shown in fig. 6, the device 600 may include:
a first acquisition unit 610, configured to acquire at least one environmental data; the at least one environmental data includes: light data, and/or sound data, and/or air temperature data, and/or air pressure data;
a second obtaining unit 620, configured to obtain at least one data of a target video segment, where the at least one data of the target video segment includes: sound data, and/or image data, and/or subtitle data; the target video clip is a video clip to be played by the electronic device;
a policy decision unit 630, configured to input the at least one environmental data and the at least one data of the target video segment into a first model, to obtain a first processing policy of the target video segment, where the first model is used to analyze the processing policy of the video segment;
and a playing unit 640, configured to play the target video clip according to the first processing policy.
Optionally, the first processing policy includes: a processing policy of an image processor for the target video segment, and/or a processing policy of a data processor for the target video segment, and/or a processing policy of a display driver for the target video segment.
Optionally, the first processing policy includes the processing policy of the data processor for the target video segment, and the playing unit 640 may specifically be configured to: for the target video segment, decode the code stream of the target video segment according to the processing policy of the data processor for the target video segment in the first processing policy, to obtain decoded data.
Optionally, the first processing policy includes the processing policy of the image processor for the target video segment, and the playing unit 640 may specifically be configured to: for each video frame in the target video segment, render the video frame according to the processing policy of the image processor for the target video segment in the first processing policy.
Optionally, the first processing policy includes the processing policy of the display driver for the target video segment, and the playing unit 640 may specifically be configured to: for each video frame in the target video segment, display the video frame according to the processing policy of the display driver for the target video segment in the first processing policy.
Optionally, the processing strategy of the image processor for the target video segment includes: image rendering resolution, and/or on-off state of an image sharpening algorithm, and/or on-off state of a contrast enhancement algorithm;
and/or,
the processing strategy of the data processor for the target video segment includes: decoding accuracy, and/or frame skipping, and/or the frame rate of the target video segment, and/or the video encapsulation format of the target video segment, and/or the bit rate of the code stream of the target video segment, and/or the resolution of the target video segment;
and/or,
the processing strategy of the display driver for the target video segment comprises: screen refresh frequency, and/or screen resolution.
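To make the shape of such a composite strategy concrete, the sketch below shows one possible data layout; the field names and types are illustrative assumptions mirroring the options enumerated above, not a structure defined by the present application:

```python
# One possible data layout for the first processing strategy; the field names
# are assumptions mirroring the options enumerated above.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ImageProcessorStrategy:
    render_resolution: Optional[str] = None        # e.g. "1280x720"
    sharpening_on: Optional[bool] = None           # image sharpening algorithm state
    contrast_enhancement_on: Optional[bool] = None

@dataclass
class DataProcessorStrategy:
    decoding_accuracy: Optional[str] = None
    skip_frames: Optional[bool] = None
    frame_rate: Optional[int] = None               # frames per second
    encapsulation_format: Optional[str] = None     # e.g. "MPEG2", "H.264"
    code_stream_bit_rate: Optional[int] = None     # bits per second
    resolution: Optional[str] = None

@dataclass
class DisplayDriverStrategy:
    refresh_rate: Optional[int] = None             # Hz
    screen_resolution: Optional[str] = None

@dataclass
class FirstProcessingStrategy:
    image: Optional[ImageProcessorStrategy] = None    # image processor part
    data: Optional[DataProcessorStrategy] = None      # data processor part
    display: Optional[DisplayDriverStrategy] = None   # display driver part
```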
Optionally, the apparatus further includes: a model training unit, configured to acquire training samples marked with processing strategies, where each training sample includes a sample of each of the at least one environmental data and a sample of each of the at least one data; and to input the training samples into a preset model for training to obtain the first model.
Optionally, the first model is an AI perception neural network composed of an AI perception neural network accelerator and a recurrent neural network.
Fig. 7 is a block diagram of another embodiment of a video playing device of the present application. As shown in fig. 7, the device 700 may include:
an acquisition unit 710 for acquiring at least one environmental data; the at least one environmental data includes: light data, and/or sound data, and/or air temperature data, and/or air pressure data;
a first determining unit 720, configured to determine a first environment type according to the at least one environment data;
a second determining unit 730, configured to determine a first processing policy of the target video segment according to the first environment type; the target video clip is a video clip to be played by the electronic device;
and a playing unit 740, configured to play the target video clip according to the first processing policy.
Alternatively, if one type of environmental data is acquired, the first determining unit 720 may specifically be configured to: determine, for that environmental data, the second environment type corresponding to the acquired environmental data as the first environment type, according to the correspondence between the data intervals preset for the environmental data and second environment types.
Alternatively, if at least two types of environmental data are acquired, the first determining unit 720 may specifically be configured to: for each type of environmental data, determine the second environment type corresponding to the acquired environmental data according to the correspondence between the data intervals preset for that environmental data and second environment types; and determine the first environment type according to the second environment types corresponding to all the acquired environmental data.
Alternatively, the first determining unit 720 may specifically be configured to: calculate a first value according to the preset weight of each type of acquired environmental data and the value corresponding to its second environment type; and determine the third environment type corresponding to the first value according to the correspondence between preset value intervals and third environment types, thereby obtaining the first environment type.
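As an illustration of this two-stage determination, the sketch below maps raw light and sound readings to second environment types through preset intervals and then fuses them by weight into a first value and a third environment type; all intervals, weights, type values, and the 0.5 threshold are assumptions for illustration:

```python
# A sketch of the two-stage environment-type decision; the preset intervals,
# weights, type values, and the 0.5 threshold are illustrative assumptions.
LIGHT_INTERVALS = [((0, 200), "indoor"), ((200, 100000), "outdoor")]   # lux
SOUND_INTERVALS = [((0, 50), "quiet"), ((50, 150), "noisy")]           # dB

TYPE_VALUES = {"indoor": 0.0, "quiet": 0.0, "outdoor": 1.0, "noisy": 1.0}
WEIGHTS = {"light": 0.7, "sound": 0.3}  # preset weight per environmental data

def second_type(reading, intervals):
    # Map a raw reading to its second environment type via preset intervals.
    for (lo, hi), env_type in intervals:
        if lo <= reading < hi:
            return env_type
    raise ValueError("reading outside all preset intervals")

def first_environment_type(light_lux: float, sound_db: float) -> str:
    lt = second_type(light_lux, LIGHT_INTERVALS)
    st = second_type(sound_db, SOUND_INTERVALS)
    # Weighted fusion into a first value, then a third environment type.
    first_value = (WEIGHTS["light"] * TYPE_VALUES[lt]
                   + WEIGHTS["sound"] * TYPE_VALUES[st])
    return "outdoor" if first_value >= 0.5 else "indoor"
```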
Alternatively, the second determining unit 730 may specifically be configured to: determine, according to a preset correspondence between environment types and processing strategies, the processing strategy corresponding to the first environment type as the first processing strategy; the first processing strategy includes: a processing strategy of the image processor for the target video segment, and/or a processing strategy of the data processor for the target video segment, and/or a processing strategy of the display driver for the target video segment.
Optionally, the first processing policy includes the processing policy of the data processor for the target video segment, and the playing unit 740 may specifically be configured to: for the target video segment, decode the code stream of the target video segment according to the processing policy of the data processor for the target video segment in the first processing policy, to obtain decoded data.
Optionally, the first processing policy includes the processing policy of the image processor for the target video segment, and the playing unit 740 may specifically be configured to: for each video frame in the target video segment, render the video frame according to the processing policy of the image processor for the target video segment in the first processing policy.
Optionally, the first processing policy includes the processing policy of the display driver for the target video segment, and the playing unit 740 may specifically be configured to: for each video frame in the target video segment, display the video frame according to the processing policy of the display driver for the target video segment in the first processing policy.
Optionally, the processing strategy of the image processor for the target video segment includes: image rendering resolution, and/or on-off state of an image sharpening algorithm, and/or on-off state of a contrast enhancement algorithm;
and/or,
the processing strategy of the data processor for the target video segment includes: decoding accuracy, and/or frame skipping, and/or the frame rate of the target video segment, and/or the video encapsulation format of the target video segment, and/or the bit rate of the code stream of the target video segment, and/or the resolution of the target video segment;
and/or,
the processing strategy of the display driver for the target video segment comprises: screen refresh frequency, and/or screen resolution.
The apparatuses provided by the embodiments shown in fig. 6 and fig. 7 may be used to implement the technical solutions of the method embodiments shown in fig. 2 to fig. 5 of the present application; for the implementation principles and technical effects, reference may be made to the related descriptions of those method embodiments.
It should be understood that the division of the apparatuses shown in fig. 6 and fig. 7 into units is merely a division by logical function; in practice the units may be fully or partially integrated into one physical entity or physically separated. These units may all be implemented as software invoked by a processing element, all be implemented as hardware, or partly as software invoked by a processing element and partly as hardware. For example, the playing unit may be a separately established processing element, or may be integrated into a chip of the electronic device; the other units are implemented similarly. Furthermore, all or some of these units may be integrated together or implemented independently. In implementation, each step of the above method, or each of the above units, may be completed by an integrated logic circuit of hardware in a processor element or by instructions in the form of software.
For example, the above units may be one or more integrated circuits configured to implement the above methods, such as: one or more application-specific integrated circuits (Application Specific Integrated Circuit; hereinafter ASIC), or one or more digital signal processors (Digital Signal Processor; hereinafter DSP), or one or more field programmable gate arrays (Field Programmable Gate Array; hereinafter FPGA), etc. For another example, the units may be integrated together and implemented in the form of a System-On-a-Chip (SOC).
As shown in fig. 8, the process from detection of a user operation to display of the video in the prior art includes the following stages:
the user executes an operation instruction for video playing in the UI interface; correspondingly, the electronic device detects the user's operation event;
the operating system of the electronic device performs scheduling and configuration based on the user's operation event;
the driver layer performs function conversion;
the SOC executes the corresponding assembly instructions;
the GPU/DSP performs the graphics operations corresponding to the video data; the graphics operations include decoding of the video data;
the display driver performs digital-to-analog conversion on the data obtained from the graphics operations, obtaining an analog signal for the display;
the display displays the video based on the analog signal.
In the embodiment of the present application, before the GPU/DSP performs graphics operations on the video data, the processing strategy of the video segment may be obtained based on the environmental data and the video data, so that the GPU/DSP and the display driver can process the video segment based on that strategy. After adding the data processing of the embodiment of the present application, as shown in fig. 9, the following parts are added to the flow from detection of the user operation to display of the video:
an environment data receiving interface and an AI perception operation module are added to the GPU/DSP, where the environment data receiving interface is used to receive the environment data output by the environment sensors, and the AI perception operation module can carry the first model, so that the at least one environmental data and the at least one data of the target video segment are input into the first model to obtain the first processing strategy of the target video segment;
the GPU/DSP then performs the subsequent graphics operations according to the first processing strategy, so that the graphics operations are optimized based on the first processing strategy.
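A minimal sketch of this modified flow is given below; read_environment_sensors, first_model, and render are hypothetical stand-ins for the environment data receiving interface, the AI perception operation module, and the subsequent graphics operations:

```python
# A hypothetical sketch of the added AI-perception stage in the GPU/DSP path;
# the three callables are assumed stand-ins, not APIs of any real driver.
def process_segment(segment_data, read_environment_sensors, first_model, render):
    env = read_environment_sensors()           # environment data receiving interface
    strategy = first_model(env, segment_data)  # AI perception operation module
    # Subsequent graphics operations run under the first processing strategy,
    # e.g. rendering at reduced resolution to save power.
    return render(segment_data, strategy)
```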
The application also provides an electronic device comprising: a display screen; one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory, the one or more computer programs comprising instructions, which when executed by the apparatus, cause the apparatus to perform the methods provided by the embodiments shown in fig. 2-5 of the present application.
The present application also provides an electronic device, including a storage medium and a central processing unit; the storage medium may be a nonvolatile storage medium storing a computer executable program, and the central processing unit is connected to the nonvolatile storage medium and executes the computer executable program to implement the methods provided by the embodiments shown in fig. 2 to fig. 5 of the present application.
Embodiments of the present application also provide a computer-readable storage medium having a computer program stored therein, which when run on a computer, causes the computer to perform the methods provided by the embodiments shown in fig. 2-5 of the present application.
Embodiments of the present application also provide a computer program product comprising a computer program which, when run on a computer, causes the computer to perform the methods provided by the embodiments shown in fig. 2-5 of the present application.
In the embodiments of the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and indicates three possible relationships; for example, "A and/or B" may mean that A exists alone, A and B both exist, or B exists alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following items" and similar expressions refer to any combination of these items, including a single item or any combination of plural items. For example, "at least one of a, b and c" may represent: a, b, c, a and b, a and c, b and c, or a, b and c, where a, b and c may each be single or multiple.
Those of ordinary skill in the art will appreciate that the units and algorithm steps described in the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementations should not be considered beyond the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, reference may be made to the corresponding processes in the foregoing method embodiments for the specific working processes of the systems, apparatuses and units described above; details are not repeated here.
In the several embodiments provided herein, any function, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash disk, a removable hard disk, a read-only memory (Read-Only Memory; hereinafter referred to as ROM), a random access memory (Random Access Memory; hereinafter referred to as RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The foregoing is merely specific embodiments of the present application; any person skilled in the art could readily conceive of changes or substitutions within the technical scope disclosed in the present application, which shall all be covered by the protection scope of the present application. The protection scope of the present application shall be subject to the protection scope of the claims.

Claims (16)

1. A video processing method based on an AI model, comprising:
acquiring at least one environmental data; the at least one environmental data includes: light data, and/or sound data, and/or air temperature data, and/or air pressure data;
obtaining at least one data of a target video clip, the at least one data of the target video clip comprising: sound data, and/or image data, and/or subtitle data; the target video clip is a video clip to be played by the electronic device;
inputting the at least one environmental data and the at least one data of the target video segment into a first model to obtain a first processing strategy of the target video segment, wherein the first model is used for analyzing the processing strategy of the video segment; the first processing strategy comprises a processing strategy of a data processor for the target video segment; the processing strategy of the data processor for the target video segment comprises a video encapsulation format of the target video segment; if the current environment is determined to be an outdoor environment according to the at least one environmental data and the image data of the target video segment is a landscape picture, the first processing strategy is to set the video encapsulation format to MPEG2, thereby saving the electric quantity of the electronic device; if the current environment is determined to be an indoor environment according to the at least one environmental data and the image data of the target video segment is a character fight scene, the first processing strategy is to set the video encapsulation format to H.264, thereby improving the visual effect of the target video segment;
and playing the target video segment according to the first processing strategy.
2. The method of claim 1, wherein, in addition to the processing strategy of the data processor for the target video segment, the first processing strategy further comprises: a processing strategy of an image processor for the target video segment, and/or a processing strategy of a display driver for the target video segment.
3. The method of claim 2, wherein the first processing strategy comprises the processing strategy of the data processor for the target video segment, and playing the target video segment according to the first processing strategy comprises:
and for the target video segment, decoding the code stream of the target video segment according to the processing strategy of the data processor for the target video segment in the first processing strategy to obtain decoded data.
4. The method of claim 2, wherein the first processing strategy comprises the processing strategy of the image processor for the target video segment, and playing the target video segment according to the first processing strategy comprises:
and for each video frame in the target video segment, rendering the video frame according to the processing strategy of the image processor for the target video segment in the first processing strategy.
5. The method of claim 2, wherein the first processing strategy comprises the processing strategy of the display driver for the target video segment, and playing the target video segment according to the first processing strategy comprises:
and for each video frame in the target video segment, displaying the video frame according to the processing strategy of the display driver for the target video segment in the first processing strategy.
6. The method of any of claims 2 to 5, wherein the processing strategy of the image processor for the target video segment comprises: image rendering resolution, and/or on-off state of an image sharpening algorithm, and/or on-off state of a contrast enhancement algorithm;
the processing strategy of the data processor for the target video segment includes, in addition to the video encapsulation format of the target video segment: decoding accuracy, and/or frame skipping, and/or the frame rate of the target video segment, and/or the bit rate of the code stream of the target video segment, and/or the resolution of the target video segment;
The processing strategy of the display driver for the target video segment comprises: screen refresh frequency, and/or screen resolution.
7. The method according to any one of claims 2 to 5, wherein the first model is pre-trained, the training method comprising:
acquiring a training sample marked with a processing strategy; each of the training samples comprises: samples of each of the at least one environmental data, samples of each of the at least one data;
and inputting the training sample into a preset model for training to obtain the first model.
8. The method of any one of claims 1 to 7, wherein the first model is an AI perception neural network composed of an artificial intelligence (AI) perception neural network accelerator and a recurrent neural network.
9. A video processing method applied to an electronic device, comprising:
acquiring at least one environmental data; the at least one environmental data includes: light data, and/or sound data, and/or air temperature data, and/or air pressure data;
determining a first environment type from the at least one environment data;
determining a first processing strategy of the target video segment according to the first environment type; the target video segment is a video segment to be played by the electronic device, and the first processing strategy comprises a processing strategy of a data processor for the target video segment; the processing strategy of the data processor for the target video segment comprises a video encapsulation format of the target video segment; if the first environment type is an outdoor environment and the image data of the target video segment is a landscape picture, the first processing strategy is to set the video encapsulation format to MPEG2, thereby saving the electric quantity of the electronic device; if the first environment type is an indoor environment and the image data of the target video segment is a character fight scene, the first processing strategy is to set the video encapsulation format to H.264, thereby improving the visual effect of the target video segment;
and playing the target video segment according to the first processing strategy.
10. The method of claim 9, wherein, if one type of environmental data is acquired, said determining a first environment type according to the at least one environmental data comprises:
For the environmental data, determining a second environment type corresponding to the acquired environmental data as the first environment type according to a corresponding relation between a data interval preset for the environmental data and the second environment type;
or,
if at least two types of environmental data are acquired, said determining a first environment type according to the at least one environmental data comprises:
for each type of environment data, determining a second environment type corresponding to the acquired environment data according to a corresponding relation between a data interval preset for the environment data and the second environment type;
and determining the first environment type according to the second environment type corresponding to each acquired environment data.
11. The method of claim 10, wherein the determining the first environment type according to the second environment type corresponding to each of the acquired environment data comprises:
calculating a first value according to the preset weight of each acquired environmental data and the value corresponding to the second environmental type;
and determining a third environment type corresponding to the first value according to the corresponding relation between the preset value interval and the third environment type, and obtaining the first environment type.
12. The method according to any one of claims 9 to 11, wherein said determining a first processing strategy of a target video segment according to said first environment type comprises:
determining, according to a preset correspondence between environment types and processing strategies, the processing strategy corresponding to the first environment type as the first processing strategy; in addition to the processing strategy of the data processor for the target video segment, the first processing strategy further comprises: a processing strategy of an image processor for the target video segment, and/or a processing strategy of a display driver for the target video segment.
13. The method of claim 12, wherein the first processing strategy comprises the processing strategy of the data processor for the target video segment, and playing the target video segment according to the first processing strategy comprises:
for the target video segment, decoding the code stream of the target video segment according to the processing strategy of the data processor for the target video segment in the first processing strategy, to obtain decoded data;
the first processing strategy comprises the processing strategy of the image processor for the target video segment, and playing the target video segment according to the first processing strategy comprises:
for each video frame in the target video segment, rendering the video frame according to the processing strategy of the image processor for the target video segment in the first processing strategy;
the first processing strategy comprises the processing strategy of the display driver for the target video segment, and playing the target video segment according to the first processing strategy comprises:
and for each video frame in the target video segment, displaying the video frame according to the processing strategy of the display driver for the target video segment in the first processing strategy.
14. The method of claim 13, wherein the processing strategy for the target video segment by the image processor comprises: image rendering resolution, and/or on-off state of an image sharpening algorithm, and/or on-off state of a contrast enhancement algorithm;
the processing strategy of the data processor for the target video segment includes, in addition to the video encapsulation format of the target video segment: decoding accuracy, and/or frame skipping, and/or the frame rate of the target video segment, and/or the bit rate of the code stream of the target video segment, and/or the resolution of the target video segment;
The processing strategy of the display driver for the target video segment comprises: screen refresh frequency, and/or screen resolution.
15. A portable electronic device, comprising:
a first acquisition unit configured to acquire at least one environmental data; the at least one environmental data includes: light data, and/or sound data, and/or air temperature data, and/or air pressure data;
a second obtaining unit, configured to obtain at least one data of a target video segment, where the at least one data of the target video segment includes: sound data, and/or image data, and/or subtitle data; the target video clip is a video clip to be played by the electronic device;
a strategy decision unit, configured to input the at least one environmental data and the at least one data of the target video segment into a first model to obtain a first processing strategy of the target video segment, wherein the first model is used for analyzing the processing strategy of the video segment; the first processing strategy comprises a processing strategy of a data processor for the target video segment; the processing strategy of the data processor for the target video segment comprises a video encapsulation format of the target video segment; if the current environment is determined to be an outdoor environment according to the at least one environmental data and the image data of the target video segment is a landscape picture, the first processing strategy is to set the video encapsulation format to MPEG2, thereby saving the electric quantity of the electronic device; if the current environment is determined to be an indoor environment according to the at least one environmental data and the image data of the target video segment is a character fight scene, the first processing strategy is to set the video encapsulation format to H.264, thereby improving the visual effect of the target video segment;
and a playing unit, configured to play the target video segment according to the first processing strategy.
16. A portable electronic device, comprising:
an acquisition unit configured to acquire at least one environmental data; the at least one environmental data includes: light data, and/or sound data, and/or air temperature data, and/or air pressure data;
a first determining unit configured to determine a first environment type according to the at least one environment data;
a second determining unit, configured to determine a first processing strategy of the target video segment according to the first environment type; the target video segment is a video segment to be played by the electronic device; the first processing strategy comprises a processing strategy of a data processor for the target video segment; the processing strategy of the data processor for the target video segment comprises a video encapsulation format of the target video segment; if the first environment type is an outdoor environment and the image data of the target video segment is a landscape picture, the first processing strategy is to set the video encapsulation format to MPEG2, thereby saving the electric quantity of the electronic device; if the first environment type is an indoor environment and the image data of the target video segment is a character fight scene, the first processing strategy is to set the video encapsulation format to H.264, thereby improving the visual effect of the target video segment;
and a playing unit, configured to play the target video segment according to the first processing strategy.
CN202110818681.2A 2021-02-04 2021-07-20 Video processing method based on AI model and portable electronic device Active CN113422995B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2021101577567 2021-02-04
CN202110157756 2021-02-04

Publications (2)

Publication Number Publication Date
CN113422995A CN113422995A (en) 2021-09-21
CN113422995B true CN113422995B (en) 2023-06-23

Family

ID=77721531

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110818681.2A Active CN113422995B (en) 2021-02-04 2021-07-20 Video processing method based on AI model and portable electronic device

Country Status (1)

Country Link
CN (1) CN113422995B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113395551A (en) * 2021-07-20 2021-09-14 珠海极海半导体有限公司 Processor, NPU chip and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106412716A (en) * 2016-09-30 2017-02-15 乐视控股(北京)有限公司 Method and apparatus for adjusting multimedia playing effect, and electronic device
CN106941625A (en) * 2017-03-10 2017-07-11 广东欧珀移动通信有限公司 A kind of control method for playing back of mobile terminal, device and mobile terminal

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106162316A (en) * 2015-04-15 2016-11-23 腾讯科技(深圳)有限公司 The data processing method of a kind of video flowing, device and system
CN105681728B (en) * 2016-02-19 2019-12-17 华为技术有限公司 Video processing method and device
CN111784615A (en) * 2016-03-25 2020-10-16 北京三星通信技术研究有限公司 Method and device for processing multimedia information
CN106941621B (en) * 2017-03-10 2020-07-21 Oppo广东移动通信有限公司 Mobile terminal control method and device and mobile terminal
CN108184169B (en) * 2017-12-28 2020-10-09 Oppo广东移动通信有限公司 Video playing method and device, storage medium and electronic equipment
CN109413480B (en) * 2018-09-30 2021-10-08 Oppo广东移动通信有限公司 Picture processing method, device, terminal and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106412716A (en) * 2016-09-30 2017-02-15 乐视控股(北京)有限公司 Method and apparatus for adjusting multimedia playing effect, and electronic device
CN106941625A (en) * 2017-03-10 2017-07-11 广东欧珀移动通信有限公司 A kind of control method for playing back of mobile terminal, device and mobile terminal

Also Published As

Publication number Publication date
CN113422995A (en) 2021-09-21

Similar Documents

Publication Publication Date Title
CN108304758B (en) Face characteristic point tracking method and device
US8872812B2 (en) Power saving in mobile devices by optimizing frame rate output
EP3561765B1 (en) Dynamic tone mapping method, mobile terminal, and computer readable storage medium
US11031005B2 (en) Continuous topic detection and adaption in audio environments
WO2017092332A1 (en) Method and device for image rendering processing
RU2481640C1 (en) Method and system of generation of animated art effects on static images
CN109389663B (en) Picture rendering method and device, terminal and storage medium
US20130257807A1 (en) System and method for enhancing touch input
US10783884B2 (en) Electronic device-awakening method and apparatus, device and computer-readable storage medium
US11288844B2 (en) Compute amortization heuristics for lighting estimation for augmented reality
WO2020151491A1 (en) Image deformation control method and device and hardware device
CN113422995B (en) Video processing method based on AI model and portable electronic device
CN110210045A (en) Number evaluation method, device and the storage medium of target area
CN114632329B (en) Terminal equipment performance adjusting method and related device
CN104376834A (en) Brightness adjusting method and electronic device
US20150189126A1 (en) Controlling content frame rate based on refresh rate of a display
AU2013222959A1 (en) Method and apparatus for processing information of image including a face
CN112235635B (en) Animation display method, animation display device, electronic equipment and storage medium
CN113395551A (en) Processor, NPU chip and electronic equipment
CN114341783A (en) Touch screen processing method and device, mobile terminal and storage medium
CN110727629A (en) Playing method of audio electronic book, electronic equipment and computer storage medium
CN112839256B (en) Video playing method and device and electronic equipment
US20220076482A1 (en) Ray-tracing for auto exposure
KR20230162062A (en) Neural network accompaniment extraction from songs
CN116173496A (en) Image frame rendering method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant