CN115914746A - Video processing method and device, electronic equipment and storage medium
- Publication number
- CN115914746A CN115914746A CN202211104664.3A CN202211104664A CN115914746A CN 115914746 A CN115914746 A CN 115914746A CN 202211104664 A CN202211104664 A CN 202211104664A CN 115914746 A CN115914746 A CN 115914746A
- Authority
- CN
- China
- Prior art keywords
- video
- time
- condition
- target video
- trigger
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Television Signal Processing For Recording (AREA)
Abstract
The disclosure provides a video processing method and apparatus, an electronic device and a storage medium, and relates to the technical field of multimedia information acquisition and processing, in particular to the technical field of video processing. The specific implementation scheme is: caching a video stream recorded in real time to obtain video cache data; generating a main video based on the video cache data, the main video being a video generated when recording is finished; and processing the video cache data to generate a target video when a specified trigger condition is met, the target video being a video generated during recording, with a generation time no later than that of the main video. Through this process, short videos corresponding to multiple highlight segments can be obtained, edited, stored and shared in real time while the user records the main video. Therefore, short-video processing efficiency and user experience can both be improved.
Description
Technical Field
The present disclosure relates to the field of multimedia information collection and processing technologies, and in particular, to a video processing method and apparatus, an electronic device, and a storage medium.
Background
A conventional video recording device cannot independently store, view, edit or share the already-recorded portion of a video before recording is finished. Once recording is finished, the video file occupies a large amount of storage, so processing it demands substantial computing power and is time- and labor-intensive.
Therefore, how to identify highlight segments during video recording and store or edit them in real time is a problem to be solved.
Disclosure of Invention
The disclosure provides a video processing method, a video processing device, an electronic device and a storage medium.
According to an aspect of the present disclosure, there is provided a video processing method, which may include:
caching the real-time recorded video stream to obtain video cache data;
generating a main video based on the video cache data; the main video is a video generated in the case where the recording is finished;
processing the video cache data to generate a target video under the condition that a specified trigger condition is met; the target video is a video generated during recording, and the generation time of the target video is not later than that of the main video.
According to another aspect of the present disclosure, there is provided a video processing apparatus, which may include:
the cache module is used for caching the video stream recorded in real time to obtain video cache data;
the main video generating module is used for generating a main video based on the video cache data; the main video is a video generated in the case where the recording is finished;
the target video generation module is used for processing the video cache data to generate a target video under the condition that a specified trigger condition is met; the target video is a video generated during recording, and the generation time of the target video is not later than that of the main video.
According to another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method according to any one of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform a method in any embodiment of the present disclosure.
According to the technical scheme, the video processing method is provided, and the short videos corresponding to the plurality of highlight segments can be obtained in real time in the process of recording the main video by the user. Therefore, the short video processing efficiency can be improved, and the user experience is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flow chart of a video processing method according to the present disclosure;
FIG. 2 is a flow chart of a manner of triggering condition determination according to the present disclosure;
FIG. 3 is a flow chart for generating a target video according to the present disclosure;
FIG. 4 is a flow chart for determining a target video start time and end time according to the present disclosure;
FIG. 5 is a flow chart for generating a master video according to the present disclosure;
FIG. 6 is an electronic device process flow diagram according to the present disclosure;
FIG. 7 is a block diagram of a video processing device according to the present disclosure;
FIG. 8 is a block diagram of an electronic device used to implement the video processing method of an embodiment of the disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
As shown in fig. 1, the present disclosure relates to a video processing method, which may include the steps of:
s101: caching the real-time recorded video stream to obtain video cache data;
s102: generating a main video based on the video cache data; the main video is a video generated in the case where the recording is finished;
s103: processing the video cache data to generate a target video under the condition that a specified trigger condition is met; the target video is a video generated during recording, and the generation time of the target video is not later than that of the main video.
This embodiment can be applied to electronic devices having a video recording function and a video processing function, including but not limited to desktop computers, notebook computers, mobile phones, unmanned aerial vehicles and pan-tilt devices. Video data are recorded using the video recording function, and the recorded video data are processed and stored using the video processing function. Processing may include editing, clipping, packaging, and the like of the video data.
Before step S101 is executed, the electronic device first obtains a video stream recorded in real time by using a video recording function. The video stream recorded in real time includes image data and audio data obtained by recording the target object. The image data can be acquired through at least one camera, and the audio data can be acquired through acquisition equipment such as a microphone. The recording object may be a scene, a person, or an object, and is not limited herein.
After the video stream is obtained, it is cached to obtain video cache data. Specifically, a video encoder may first encode the video stream to obtain video encoded data, and the video encoded data are then cached to obtain the video cache data.
The video encoder may sample images from the real-time recorded video stream at a fixed frequency and encode the resulting sequence of images to obtain the video encoded data. The encoding algorithm may follow the MPEG or H.264 standard, which is not limited here.
Caching the video encoded data includes storing them in a video buffer area in memory. The capacity of the video buffer may be preset, for example to half or one quarter of the memory capacity, and may be set as needed.
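As an illustration of such a bounded buffer, consider the following minimal Python sketch (all names are hypothetical; the disclosure does not prescribe any particular implementation). Encoded frames are appended with their timestamps, and the oldest frames are evicted once the preset capacity is exceeded:

```python
from collections import deque

class VideoBuffer:
    """Bounded in-memory cache of encoded video frames (illustrative only)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes      # preset, e.g. half the memory capacity
        self.size = 0
        self.frames = deque()               # entries of (timestamp_s, encoded_bytes)

    def push(self, timestamp, payload):
        self.frames.append((timestamp, payload))
        self.size += len(payload)
        # Evict the oldest frames once the preset capacity is exceeded.
        while self.size > self.capacity and self.frames:
            _, old = self.frames.popleft()
            self.size -= len(old)

    def slice(self, start_t, end_t):
        """Return the cached frames whose timestamps fall in [start_t, end_t]."""
        return [(t, p) for t, p in self.frames if start_t <= t <= end_t]
```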
After the video cache data are generated, whether the specified trigger condition is met is monitored in real time. If the specified trigger condition is not met, a main video is generated based on the video cache data, the main video being a video generated when recording is finished. The generation time may be the moment recording ends, or any moment within a preset period after recording ends; the preset period may be 5 s, 10 s, 15 s, and so on, which is not limited here.
Under the condition that the specified triggering condition is met, the video cache data can be processed to obtain the target video. The target video may be a short video corresponding to a highlight in a video stream recorded in real time, or may also be a short video corresponding to a time period set by a user, which is not limited herein.
The specified trigger condition may be a trigger condition automatically generated based on the content of the video stream, or may be a trigger condition generated after receiving a control instruction issued by a user, which is not limited herein.
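The overall control flow can be sketched as follows (illustrative only; `VideoBuffer` is the hypothetical class above, and the trigger detector and encapsulation routines are supplied by the caller):

```python
def record(stream, buffer, detect_trigger, make_target_video, make_main_video):
    """Cache the real-time stream, emit target videos whenever the specified
    trigger condition is met, and emit the main video once recording ends."""
    for frame in stream:                      # iteration ends when recording stops
        buffer.push(frame.timestamp, frame.encoded)
        trigger_time = detect_trigger(frame)  # content model or user instruction
        if trigger_time is not None:
            # The target video is generated during recording,
            # no later than the main video.
            make_target_video(buffer, trigger_time)
    make_main_video(buffer)                   # generated when recording is finished
```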
Through the above process, short videos corresponding to multiple highlight segments of the main video can be acquired automatically and in real time while the user records the main video, so that target videos are generated during recording of the main video. This improves the efficiency of producing short videos and improves the user experience.
As shown in fig. 2, in one embodiment, the determination of the specified trigger condition includes:
s201: identifying the video stream to obtain an identification result; the recognition result comprises at least one of an image recognition result and a voice recognition result;
s202: determining that a specified trigger condition is met under the condition that the recognition result meets a preset condition; the preset condition is used for representing the interest degree of the user in the video stream content.
This embodiment provides a trigger-condition determination mode based on artificial intelligence: video content is recognized using a pre-trained content recognition model, and whether the preset condition is met is then determined from the recognition result. The preset condition represents the user's degree of interest in the video stream content.
Specifically, before encoding and buffering a video stream recorded in real time, the embodiment extracts image features and sound features in the video stream to be recognized through a pre-trained content recognition model, and determines a recognition result based on the image features and the sound features.
The pre-trained content recognition model may be an artificial intelligence network model selected as needed, such as a convolutional neural network, which is not limited herein. The content recognition model may be trained using the annotated video samples, and in particular, at least one of image data samples and sound data samples may be used as input data for the content recognition model. The content recognition model can obtain a predicted value of the content recognition result according to the input data, and the predicted value can be represented in the form of probability. For example, the probability of the content recognition result being "cheering" is a%, and the probability of the content recognition result being "screaming" is b%. Parameters in the content recognition model are adjusted by using the error between the labeled recognition result (true value of the recognition result) and the predicted value of the recognition result. The above error can be embodied by a loss function, and the effect of the loss function can be understood as: when a predicted value obtained by forward propagation of the content recognition model to be trained is close to the true value, the loss function takes a smaller value; conversely, the value of the loss function increases. The loss function is a function having a parameter in the content recognition model as an argument.
All parameters in the content recognition model to be trained are adjusted using this error: the error is back-propagated through each layer of the model, and each layer's parameters are adjusted accordingly, until the output of the model converges or the expected effect is achieved.
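As a sketch only, one such training step could be written with PyTorch as below. The two-input model signature `model(images, sounds)` and the choice of cross-entropy as the loss function are assumptions for illustration; the disclosure fixes neither:

```python
import torch.nn.functional as F

def train_step(model, optimizer, images, sounds, labels):
    """One illustrative training step for the content recognition model."""
    logits = model(images, sounds)            # predicted scores, e.g. "cheering"
    loss = F.cross_entropy(logits, labels)    # small when prediction nears truth
    optimizer.zero_grad()
    loss.backward()                           # back-propagate the error per layer
    optimizer.step()                          # adjust the model parameters
    return loss.item()
```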
It is determined that the specified trigger condition is met when the recognition result meets the preset condition. The recognition result may include only the image recognition result; correspondingly, the preset condition may be a pre-stored image database. For example, in a scene where a user is about to record a dance video, when the image recognition result shows that the user has moved to the middle of the frame and is ready to dance, the specified trigger condition is considered met. Readiness to dance may be determined by the user making a specified gesture or posture.
The recognition result may include only the voice recognition result. Correspondingly, the preset condition may be a pre-stored sound database. For example, in a video surveillance scene, when a laughing or cheering sound occurs in the environment, it is considered that a preset trigger condition is satisfied. Alternatively, when a sound with instruction meaning, for example, "can start", "start recording", or the like, appears in the environment, it may be considered that the preset trigger condition is satisfied.
The recognition result may also include both an image recognition result and a sound recognition result, which together determine the current event corresponding to the video stream. Correspondingly, the preset condition may be a pre-stored event database: if the current event belongs to a preset event, it is determined that the specified trigger condition is met. For example, if the image recognition result is a "cake" image and the sound recognition result is a "happy birthday song melody", the event corresponding to the current video stream is determined to be "holding a birthday party". Whether this event belongs to the event database is then judged, and if so, the specified trigger condition is determined to be met. The event database may be configured as desired, for example with singing performances, dinner gatherings, parent-child games, etc., which are not exhaustive here.
The recognition result may also be a content score calculated by the content recognition model from the input video stream content. In this case, the preset condition may be a score threshold corresponding to the user's interest level: the specified trigger condition is determined to be met when the content score corresponding to the recognition result is greater than the score threshold. The content score may be, for example, 20, 40, 60 or 80 points, which is not limited here. The score threshold may likewise be set as needed, for example to 60, 65 or 70.
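A minimal sketch of this preset-condition check, reusing the example events from the text and an assumed score threshold of 60 (both values are configurable, not prescribed):

```python
EVENT_DATABASE = {"singing performance", "dinner gathering", "parent-child games"}
SCORE_THRESHOLD = 60                          # preset per the user's interest level

def trigger_satisfied(event=None, content_score=None):
    """The specified trigger condition is met when the recognized event belongs
    to the preset event database, or the content score exceeds the threshold."""
    if event is not None and event in EVENT_DATABASE:
        return True
    return content_score is not None and content_score > SCORE_THRESHOLD
```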
Through the above process, the trigger condition can be determined automatically from the video content while the user records the main video, and a target video containing the highlight segments can be generated automatically.
In one embodiment, the determination of the specified trigger condition includes: determining that the specified trigger condition is met when a trigger control instruction is received.
Specifically, the trigger control instruction may be a control instruction generated by a user by operating a trigger switch of the aforementioned electronic device. The trigger switch may be a physical key of the electronic device or a virtual key option located in a display screen of the electronic device, or may be a corresponding key on a wireless remote controller connected to the electronic device or a virtual option on an intelligent device such as a mobile phone or a tablet computer, which is not limited herein.
As shown in fig. 3, in an embodiment, processing the video cache data to obtain a target video includes:
s301: determining a starting time and a terminating time corresponding to the target video based on the triggering time meeting the specified triggering condition;
s302: and performing first packaging processing on the video cache data based on the starting time and the ending time corresponding to the target video to obtain the target video.
The trigger time may be the time at which the specified trigger condition is generated; specifically, it may be a trigger time determined based on the pre-trained content recognition model, or a trigger time generated by the user operating a trigger switch, which is not limited here.
The first encapsulation process may include generating video body data and video header data for the time period corresponding to the target video. The video body data may be generated by intercepting the video cache data between the determined start time and end time, and include image data and sound data. The video header data may be generated by setting relevant parameters as needed. The body data and the header data are then integrated and compressed to generate the target video.
The data format of the first encapsulation process may be set as needed, for example the AVI (Audio Video Interleaved) format or the FLV (Flash Video) format, which is not limited here.
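The following high-level sketch illustrates the first encapsulation, reusing the hypothetical `VideoBuffer` above; the real byte-level container format (AVI, FLV, ...) is elided, so the header written here is only a stand-in:

```python
def first_encapsulation(buffer, start_t, end_t, out_path="target.avi"):
    """Intercept the cached frames between the start and end times, prepend
    header data, and write the target video container (illustrative only)."""
    body = buffer.slice(start_t, end_t)       # video body data: image + sound
    header = {"start": start_t, "end": end_t, "frames": len(body)}
    with open(out_path, "wb") as f:
        f.write(repr(header).encode())        # stand-in for real header data
        for _, payload in body:
            f.write(payload)                  # compressed frame data
```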
As shown in fig. 4, in one embodiment, the determining manner of the start time includes:
s401: taking the trigger time as the starting time of the target video;
or,
s402: determining the start time of the target video based on the trigger time and a first preset duration; the first preset duration is a duration by which the start time is traced back before, or delayed after, the trigger time.
The trigger time may be used directly as the start time, or a time before or after the trigger time may be used as needed. Specifically, the start time may be obtained by tracing back or delaying the first preset duration from the trigger time. For example, if the trigger time is 10 minutes 20 seconds, 10 minutes 20 seconds may be used as the start time for encapsulating the target video; if the trigger time is 10 minutes 20 seconds and the first preset duration is 10 seconds, then 10 minutes 10 seconds or 10 minutes 30 seconds may be used as the start time, as needed.
The upper limit of the first preset duration depends on the memory capacity: the larger the memory capacity, the larger the upper limit. For example, the first preset duration may be 5 s, 10 s or 15 s, which is not exhaustive.
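Expressed as a single rule (an illustrative sketch; both the direction and the duration are configurable, not prescribed), the start time follows from the trigger time and the first preset duration:

```python
def start_time(trigger_t, first_preset=0.0, direction="back"):
    # E.g. a trigger at 620 s (10 min 20 s) with a 10 s preset gives
    # 610 s when traced back, or 630 s when delayed.
    return trigger_t - first_preset if direction == "back" else trigger_t + first_preset
```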
Through the above process, the time period corresponding to the target video can be set as needed. In particular, by tracing back the first preset duration, video from a period before the trigger time can be included in the target video, which helps completely record the cause, course and outcome of the highlight moment.
In one embodiment, the determining of the termination time includes:
determining the termination time of the target video based on the recognition result;
or,
and determining the termination time of the target video based on the starting time and the second preset time length.
The termination time may be determined based on the recognition result of the content recognition model trained in advance, or may be determined by adding the second preset time duration to the start time.
For example, in a scene where a user records a dance video, when the image recognition result shows that the user makes an end-of-dance gesture and leaves the middle of the frame, the corresponding time point may be taken as the termination time.
The value range of the second preset duration is the same as that of the first preset duration, and is not repeated here.
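An illustrative counterpart for the termination time, covering both determination modes (in this sketch the recognition-based end time, when available, takes precedence; the disclosure does not mandate a precedence):

```python
def end_time(start_t, second_preset=None, recognized_end_t=None):
    """Determine the termination time from the recognition result if present,
    otherwise by adding the second preset duration to the start time."""
    if recognized_end_t is not None:          # e.g. an end-of-dance gesture
        return recognized_end_t
    return start_t + second_preset            # second_preset must then be set
```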
Through the above process, the course of events before the trigger time can be traced back, so that the cause, course and outcome of a highlight segment are completely recorded, improving the completeness of the recorded target video. In addition, the user does not need to spend extra effort capturing these moments or performing post-editing, which makes for a better experience.
As shown in fig. 5, in one embodiment, the method further includes:
s501: generating video recording data based on the video cache data;
s502: and performing second packaging processing on the video recording data to generate a main video.
The video buffer data may be data stored in a video buffer area in the memory, and the video recording data may be data stored in an external storage space when recording is completed, for example, the external storage space may be an SD card, which is not limited herein.
The second encapsulation process and the first encapsulation process may share the same physical encapsulator, which then needs the processing capability to generate the main video and the target video simultaneously. For example, the video encapsulator can meet this requirement through a time-slice round-robin approach.
In addition, two independent physical encapsulators may also be disposed in the electronic device, and at this time, the first encapsulation process and the second encapsulation process may be executed in parallel, which is not described herein.
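An illustrative sketch of the two-encapsulator variant, running the first and second encapsulation processes in parallel on two workers (the routine names are the hypothetical ones from the sketches above; a single shared encapsulator would instead alternate between the two jobs in time slices):

```python
from concurrent.futures import ThreadPoolExecutor

def encapsulate_both(buffer, start_t, end_t, recording_data,
                     first_encapsulation, second_encapsulation):
    """Run the target-video and main-video encapsulation concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        target = pool.submit(first_encapsulation, buffer, start_t, end_t)
        main = pool.submit(second_encapsulation, recording_data)
        return target.result(), main.result()
```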
The main video format obtained by the second encapsulation process may be AVI, FLV, etc., which is not limited here.
The frame rate of the main video may be the same as or different from that of the target video, and may be specifically set as required. For example, 30 frames/second, 10 frames/second, etc., without limitation.
The obtained target video and the main video may be stored in a storage space inside the electronic device, or may be stored in an external storage space connected to the electronic device or a cloud storage device, which is not limited herein.
As shown in fig. 6, the present disclosure relates to a video processing method, including the steps of:
the video acquisition device acquires and acquires a video stream in real time.
The video stream enters the algorithm decider and the video encoder through the splitter respectively. The algorithm judger adopts a pre-trained content recognition model to extract the characteristics of the video stream, and determines the corresponding image recognition result and audio recognition result according to the characteristic information.
And the video stream entering the video encoder is encoded and cached to generate video cache data. And under the condition that the image recognition result and the audio recognition result meet the specified triggering condition or a hardware control instruction is sent by a user, recording the video stream in real time, and simultaneously performing first packaging processing on the video cache data to generate a target video containing the wonderful segments. Wherein, the specified trigger condition can be that a face image and a pet image appear in the video stream, or can be that cheering sound and screaming sound are given out.
In addition, after the recording is finished, second packaging processing is carried out on the video recording data to generate a main video.
As shown in fig. 7, the present disclosure relates to a video processing apparatus including:
a caching module 701, configured to cache a real-time recorded video stream to obtain video cache data;
a main video generating module 702, configured to generate a main video based on the video cache data; the main video is a video generated in the case where the recording is finished;
a target video generating module 703, configured to process the video cache data to generate a target video when a specified trigger condition is met; the target video is a video generated during recording, and the generation time of the target video is not later than that of the main video.
The present disclosure relates to a video processing apparatus, wherein the target video generating module 703 includes:
the identification submodule is used for identifying the video stream to obtain an identification result; the recognition result comprises at least one of an image recognition result and a voice recognition result;
the first trigger submodule is used for determining that the specified trigger condition is met under the condition that the identification result meets the preset condition; the preset condition is used for representing the interest degree of the user in the video stream content.
The present disclosure relates to a video processing apparatus, wherein the target video generating module 703 includes:
and the second trigger submodule is used for determining that the specified trigger condition is met under the condition that the trigger control instruction is received.
The present disclosure relates to a video processing apparatus, wherein the target video generating module 703 includes:
the trigger execution sub-module is used for determining the starting time and the ending time corresponding to the target video based on the triggering time meeting the specified triggering condition;
and the first packaging sub-module is used for carrying out first packaging processing on the video cache data based on the starting time and the ending time corresponding to the target video to obtain the target video.
The present disclosure relates to a video processing apparatus, wherein the trigger execution submodule includes a start time determination submodule configured to:
taking the trigger time as the starting time of the target video;
or,
determining the starting time of the target video based on the triggering time and the first preset time length; the first preset time duration is a time duration based on the trigger time to trace back or delay backwards.
The present disclosure relates to a video processing apparatus, wherein the trigger execution submodule includes a termination time determination submodule configured to:
determining the termination time of the target video based on the recognition result;
or,
and determining the termination time of the target video based on the starting time and the second preset time length.
The present disclosure relates to a video processing apparatus, wherein a main video generation module 702 includes:
the recording module is used for generating video recording data based on the video cache data;
and the second packaging submodule is used for carrying out second packaging processing on the video recording data to generate a main video.
In the technical scheme of the disclosure, the acquisition, storage, application and the like of the personal information of the related user all accord with the regulations of related laws and regulations, and do not violate the good customs of the public order.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or loaded from a storage unit 808 into a Random Access Memory (RAM) 803. The RAM 803 can also store various programs and data required for the operation of the device 800. The computing unit 801, the ROM 802 and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of various general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 801 executes the methods and processes described above, for example the video processing method. For example, in some embodiments, the video processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the video processing method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured in any other suitable way (e.g., by means of firmware) to perform the video processing method.
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chips (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/acts specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
Claims (16)
1. A video processing method, comprising:
caching the video stream recorded in real time to obtain video cache data;
generating a main video based on the video cache data; the main video is a video generated under the condition that recording is finished;
processing the video cache data to generate a target video under the condition that a specified trigger condition is met; the target video is a video generated during recording, and the generation time of the target video is not later than that of the main video.
2. The method of claim 1, wherein the determination of the specified trigger condition comprises:
identifying the video stream to obtain an identification result; the recognition result comprises at least one of an image recognition result and a voice recognition result;
determining that the specified trigger condition is met under the condition that the identification result meets a preset condition; the preset condition is used for representing the interest degree of the user in the video stream content.
3. The method of claim 1, wherein the specified trigger condition is determined in a manner comprising:
and determining that the specified trigger condition is met under the condition that a trigger control instruction is received.
4. The method according to claim 2 or 3, wherein the processing the video cache data to generate the target video comprises:
determining a starting time and a terminating time corresponding to the target video based on the triggering time meeting the specified triggering condition;
and performing first packaging processing on the video cache data based on the starting time and the ending time corresponding to the target video to obtain the target video.
5. The method of claim 4, wherein the starting time is determined in a manner that comprises:
taking the trigger time as the starting time of the target video;
or,
determining the starting time of the target video based on the triggering time and a first preset duration; the first preset duration is a duration by which the starting time is traced back before, or delayed after, the trigger time.
6. The method of claim 5, wherein the determination of the termination time comprises:
determining a termination time of the target video based on the identification result;
or,
and determining the termination time of the target video based on the starting time and the second preset time length.
7. The method of any one of claims 1-6, wherein the generating a main video based on the video cache data comprises:
generating video recording data based on the video cache data;
and performing second packaging processing on the video recording data to generate a main video.
8. A video processing apparatus comprising:
the cache module is used for caching the video stream recorded in real time to obtain video cache data;
the main video generation module is used for generating a main video based on the video cache data; the main video is a video generated under the condition that recording is finished;
the target video generation module is used for processing the video cache data to generate a target video under the condition that a specified trigger condition is met; the target video is a video generated during recording, and the generation time of the target video is not later than that of the main video.
9. The apparatus of claim 8, wherein the target video generation module comprises:
the identification submodule is used for identifying the video stream to obtain an identification result; the recognition result comprises at least one of an image recognition result and a voice recognition result;
the first trigger submodule is used for determining that the specified trigger condition is met under the condition that the identification result meets the preset condition; the preset condition is used for representing the interest degree of the user in the video stream content.
10. The apparatus of claim 8, wherein the target video generation module comprises:
and the second trigger submodule is used for determining that the specified trigger condition is met under the condition that a trigger control instruction is received.
11. The apparatus of claim 9, wherein the target video generation module comprises:
the trigger execution sub-module is used for determining the starting time and the ending time corresponding to the target video based on the trigger time meeting the specified trigger condition;
and the first packaging sub-module is used for performing first packaging processing on the video cache data based on the starting time and the ending time corresponding to the target video to obtain the target video.
12. The apparatus of claim 11, wherein the trigger execution submodule comprises a start time determination submodule configured to:
taking the trigger moment as the starting moment of the target video;
or,
determining the starting time of the target video based on the triggering time and a first preset duration; the first preset duration is a duration by which the starting time is traced back before, or delayed after, the trigger time.
13. The apparatus of claim 12, wherein the trigger execution submodule comprises a termination time determination submodule configured to:
determining the termination time of the target video based on the identification result;
or,
and determining the termination time of the target video based on the starting time and the second preset time length.
14. The apparatus of any one of claims 8-13, wherein the main video generation module comprises:
the recording module is used for generating video recording data based on the video cache data;
and the second packaging sub-module is used for carrying out second packaging processing on the video recording data to generate a main video.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
16. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211104664.3A CN115914746A (en) | 2022-09-09 | 2022-09-09 | Video processing method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211104664.3A CN115914746A (en) | 2022-09-09 | 2022-09-09 | Video processing method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115914746A | 2023-04-04
Family
ID=86484793
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211104664.3A | Video processing method and device, electronic equipment and storage medium | 2022-09-09 | 2022-09-09
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115914746A (en) |
- 2022-09-09: Application CN202211104664.3A filed (CN); publication CN115914746A, status Pending
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117676061A (en) * | 2023-12-07 | 2024-03-08 | 亿海蓝(北京)数据技术股份公司 | Image processing method, apparatus, readable storage medium, and computer program product |
CN117676061B (en) * | 2023-12-07 | 2024-06-28 | 亿海蓝(北京)数据技术股份公司 | Image processing method, apparatus, readable storage medium, and computer program product |
CN118317124A (en) * | 2024-03-29 | 2024-07-09 | 重庆赛力斯凤凰智创科技有限公司 | Video data storage method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |