CN110636337B

CN110636337B - Video image intercepting method, device and system

Info

Publication number: CN110636337B
Application number: CN201910770244.0A
Authority: CN
Inventors: 荣景湛
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-08-20
Filing date: 2019-08-20
Publication date: 2021-08-17
Anticipated expiration: 2039-08-20
Also published as: CN110636337A

Abstract

The application discloses a video image intercepting method, device and system. The method comprises the following steps: a first terminal receives a video image intercepting instruction that is sent by a server and comprises a timestamp of a target video image; intercepts, from a target video file, an encapsulated data packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the encapsulated data packet; decapsulates the encapsulated data packet to obtain a decapsulated video frame; and sends the decapsulated video frame and the decoder configuration information to the server, so that a second terminal generates the target video image based on the decapsulated video frame and the decoder configuration information on the server side. With the technical scheme provided by the application, the CPU consumption on the terminal side that records the video file, the processing time for intercepting the video image, and the data traffic consumption during transmission can be reduced, and the overall operation performance of the system is improved when the video image is intercepted by terminals with poor processing performance, such as a driving recorder.

Description

Video image intercepting method, device and system

Technical Field

The present application relates to the field of multimedia information technologies, and in particular, to a method, an apparatus, and a system for capturing a video image.

Background

In recent years, multimedia information technology has developed rapidly. Video is an important component of multimedia information and can effectively record various kinds of information, and video images in a video are often used as the basis for various transactions. How to capture video images from a video has therefore become an important research subject.

In the prior art, to capture a video image with a specified timestamp, a terminal that records a video file (e.g., a driving recorder) must first obtain the video frame in an encapsulated format from the video file and decapsulate it. The terminal then decodes the decapsulated video frame into YUV (the current mainstream coding format is H.264, a highly compressed digital video codec standard developed by the Joint Video Team (JVT), formed jointly by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG)), encodes the YUV data into a picture display format such as JPEG, and finally uploads the encoded picture to third-party cloud storage. The JPEG picture is downloaded and displayed when that frame is subsequently needed. In this prior art, the terminal that records the video file must perform decoding, encoding and other processing. For terminals with poor processing performance, such as a driving recorder, capturing frames in this way consumes a large amount of CPU resources, and capturing and encoding a single picture takes a long time. The heavy CPU consumption keeps the whole system busy and degrades the experience of other applications on the terminal. The JPEG picture is also large, which further increases data traffic consumption during transmission. Therefore, a more reliable or efficient solution is needed.

Disclosure of Invention

The application provides a video image intercepting method, a video image intercepting device and a video image intercepting system, which can greatly reduce the CPU consumption of a first terminal side recorded with a video file and the processing time of intercepting the video image, improve the overall operation performance of the system when the video image is intercepted by a first terminal with poor processing performance such as a driving recorder and the like, reduce the influence on other applications on the terminal, and simultaneously reduce the data flow consumption in the transmission process.

In one aspect, the present application provides a method for capturing a video image, where the method includes:

receiving a video image intercepting instruction of a target video file sent by a server, wherein the video image intercepting instruction comprises a timestamp of the target video image;

intercepting an encapsulated data packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the encapsulated data packet from the target video file;

decapsulating the encapsulated data packet to obtain a decapsulated video frame;

and sending the decapsulated video frame and the decoder configuration information to the server, so that the second terminal generates the target video image based on the decapsulated video frame and the decoder configuration information on the server side.

In another aspect, an apparatus for capturing a video image is provided, the apparatus comprising:

the video image intercepting instruction receiving module is used for receiving a video image intercepting instruction of a target video file sent by a server, and the video image intercepting instruction comprises a timestamp of the target video image;

the data intercepting module is used for intercepting an encapsulated data packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the encapsulated data packet from the target video file;

the decapsulation processing module is used for decapsulating the encapsulated data packet to obtain a decapsulated video frame;

and the first data sending module is used for sending the decapsulated video frame and the decoder configuration information to the server so that the second terminal generates the target video image based on the decapsulated video frame and the decoder configuration information on the server side.

Another aspect provides a video image capturing terminal, including a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or a set of instructions, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by the processor to implement the video image capturing method as described above.

Another aspect provides a computer readable storage medium having stored therein at least one instruction, at least one program, code set or set of instructions, which is loaded and executed by a processor to implement the method of intercepting a video image as described above.

In another aspect, another method for capturing a video image is provided, the method including:

sending a video image intercepting instruction of a target video file to a first terminal, wherein the video image intercepting instruction comprises a timestamp of the target video image, so that the first terminal intercepts an encapsulated data packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the encapsulated data packet from the target video file, and decapsulates the encapsulated data packet to obtain a decapsulated video frame;

receiving the decapsulated video frame and the decoder configuration information sent by the first terminal;

receiving an acquisition request of a target video image sent by a second terminal;

and sending the decapsulated video frame and the decoder configuration information corresponding to the target video image to a second terminal, so that the second terminal generates the target video image based on the decapsulated video frame and the decoder configuration information.

Another aspect provides another apparatus for capturing video images, the apparatus comprising:

a video image capture instruction sending module, configured to send a video image capture instruction for a target video file to a first terminal, where the video image capture instruction includes a timestamp of the target video image, so that the first terminal captures, from the target video file, an encapsulated data packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the encapsulated data packet, and decapsulates the encapsulated data packet to obtain a decapsulated video frame;

a first data receiving module, configured to receive the decapsulated video frame and the decoder configuration information sent by the first terminal;

the acquisition request receiving module is used for receiving an acquisition request of a target video image sent by a second terminal;

and the second data sending module is used for sending the decapsulated video frame and the decoder configuration information corresponding to the target video image to a second terminal, so that the second terminal generates the target video image based on the decapsulated video frame and the decoder configuration information.

In another aspect, another method for capturing a video image is provided, the method including:

sending an acquisition request of a target video image to a server;

receiving the de-encapsulated video frame and decoder configuration information corresponding to the target video image sent by the server;

generating the target video image based on the decapsulated video frame and the decoder configuration information;

the decapsulated video frame and the decoder configuration information of the server side are acquired by adopting the following modes:

the server sends a video image intercepting instruction of a target video file to a first terminal, wherein the video image intercepting instruction comprises a timestamp of the target video image;

the first terminal intercepts an encapsulated data packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the encapsulated data packet from the target video file;

the first terminal carries out decapsulation processing on the encapsulated data packet to obtain a decapsulated video frame;

and the first terminal sends the decapsulated video frame and the decoder configuration information to the server.

Another aspect provides another apparatus for capturing video images, the apparatus comprising:

the acquisition request sending module is used for sending an acquisition request of a target video image to the server;

the second data receiving module is used for receiving the de-encapsulated video frame and the decoder configuration information which are sent by the server and correspond to the target video image;

a target video image generation module, configured to generate the target video image based on the decapsulated video frame and the decoder configuration information;

In another aspect, a video image capturing system is provided, where the system includes: the system comprises a server, a first terminal and a second terminal;

the server is used for sending a video image intercepting instruction of a target video file to the first terminal, wherein the video image intercepting instruction comprises a timestamp of the target video image; and for sending the decapsulated video frame and the decoder configuration information corresponding to the target video image to the second terminal;

the first terminal is used for intercepting an encapsulated data packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the encapsulated data packet from the target video file; for decapsulating the encapsulated data packet to obtain a decapsulated video frame; and for sending the decapsulated video frame and the decoder configuration information to the server;

the second terminal is used for sending an acquisition request of the target video image to the server; and generating the target video image based on the de-encapsulated video frame corresponding to the target video image and the decoder configuration information sent by the server.

The video image interception method, the video image interception device and the video image interception system have the following technical effects:

In the video image intercepting process, after the first terminal that records the video file receives a video image intercepting instruction, it intercepts the video frame of the target video image in its encapsulated format together with the decoder configuration information of that video frame, and decapsulates the intercepted video frame. Because the first terminal only intercepts the encapsulated data packet and the decoder configuration information and decapsulates the packet, without decoding or encoding, the CPU consumption on the first terminal side and the processing time for intercepting the video image can be greatly reduced. This improves the overall operation performance of the system when the video image is intercepted by a first terminal with poor processing performance, such as a driving recorder, and reduces the influence on other applications on the terminal. The decapsulated video frame and the decoder configuration information are then sent to the server, which reduces data traffic consumption during transmission. When the second terminal needs to view the intercepted video image, the decapsulated video frame and the decoder configuration information can be sent directly to the second terminal, so that the second terminal can directly generate the intercepted video image.

Drawings

In order to more clearly illustrate the technical solutions and advantages of the embodiments of the present application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained by those skilled in the art without creative efforts.

FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the present application;

FIG. 2 is an application structure diagram of a first terminal according to an embodiment of the present application;

FIG. 3 is a schematic flowchart of a video image capturing method according to an embodiment of the present application;

FIG. 4 is a schematic flowchart of a video image capturing method according to an embodiment of the present application;

FIG. 5 is a schematic structural diagram of an apparatus for capturing a video image according to an embodiment of the present application;

FIG. 6 is a schematic flowchart of a video image capturing method according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of an apparatus for capturing a video image according to an embodiment of the present application;

FIG. 8 is a schematic flowchart of a video image capturing method according to an embodiment of the present application;

FIG. 9 is a schematic structural diagram of an apparatus for capturing a video image according to an embodiment of the present application;

FIG. 10 is a schematic structural diagram of a client according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Referring to fig. 1, fig. 1 is a schematic diagram of an application environment according to an embodiment of the present application, and as shown in fig. 1, the application environment at least includes a first terminal 100, a server 200, and a second terminal 300.

In this embodiment, the first terminal 100 may include a vehicle event data recorder, a smart phone, a desktop computer, a tablet computer, a notebook computer, a digital assistant, an Augmented Reality (AR)/Virtual Reality (VR) device, a smart wearable device, and other types of physical devices, and may also include software running in the physical devices. In a specific embodiment, as shown in fig. 1, the first terminal may be a mobile terminal disposed inside a vehicle. In the alternative arrangement shown in fig. 1, the first terminal 100 is implemented as a vehicle event data recorder and is fixed to a front window portion of the vehicle by a fixing device 400 (including a suction cup 401 and an arm 402); the height of the first terminal 100 may be adjusted through the arm 402 of the fixing device 400 so that a user can view the screen of the first terminal 100. Specifically, the first terminal 100 may be used to capture a video; accordingly, the first terminal 100 stores a video file of the captured video.

In this embodiment, the server 200 may include a server operating independently, or a distributed server, or a server cluster composed of a plurality of servers. The server 200 may comprise a network communication unit, a processor, a memory, etc. Specifically, the server 200 may be configured to send a video image capturing instruction of a video file to the first terminal, so as to trigger the first terminal to perform a video image capturing operation, and to store the decapsulated video frame of the video image captured by the first terminal and the decoder configuration information.

In this embodiment, the second terminal 300 may include a smart phone, a desktop computer, a tablet computer, a notebook computer, a digital assistant, an Augmented Reality (AR)/Virtual Reality (VR) device, a smart wearable device, and other types of physical devices, and may also include software running in the physical devices. Specifically, the second terminal 300 may be configured to generate a video image based on the server-side decapsulated video frame and the decoder configuration information.

Furthermore, it should be noted that, when the first terminal is a mobile terminal disposed inside a vehicle, as in the alternative arrangement shown in fig. 2, the first terminal 100 may be embedded in a front panel of the vehicle and form a streamlined whole with the internal structure of the vehicle 200, so as to save internal space of the vehicle 200.

The following describes a video image capturing method according to the present application. Fig. 3 is a flowchart of a video image capturing method according to an embodiment of the present application. This specification provides the method operation steps as described in the embodiment or the flowchart, but more or fewer operation steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is merely one of many possible execution orders and does not represent the only order of execution. In practice, a system or server product may execute the steps sequentially or in parallel (e.g., in a parallel processor or multi-threaded environment) according to the embodiments or the methods shown in the figures. Specifically, as shown in fig. 3, the method may include:

S301: The server sends a video image intercepting instruction of the target video file to the first terminal.

In this embodiment of the specification, the target video file may be a video file on the first terminal side. Specifically, in the scene shown in fig. 1, where the first terminal is a vehicle event data recorder, the target video file may be a video file of a video shot by the recorder while the vehicle in which it is installed is being driven.

In this embodiment of the present specification, the video image capturing instruction may include a timestamp of the target video image; wherein the target video image may comprise a video image corresponding to any timestamp in the target video file.

In practical application, the server may send a video image capturing instruction for the target video file to the first terminal according to a certain rule, and in a specific embodiment, the server may send the video image capturing instruction for the target video file to the first terminal every preset time period, so that intermittent video image monitoring may be performed on a shooting area corresponding to the target video file.

In another specific embodiment, the server may issue a video image capturing instruction to the first terminal according to a real-time track of the first terminal. Correspondingly, the method may further include: the first terminal sends a position point where the first terminal is located and a timestamp of a video image corresponding to the position point in the target video file to a server;

correspondingly, the sending, by the server, the video image capture instruction for the target video file to the first terminal may include: the server judges whether the position point is a target position point; when the judgment result is yes, the server executes the step of sending a video image intercepting instruction of the target video file to the first terminal;

In this embodiment of the present specification, the timestamp of the target video image is the timestamp in the target video file corresponding to the location point. The location point where the first terminal is located may include, but is not limited to, the geographic coordinates of the first terminal. Accordingly, the target location point may be designated location information. In practical application, the target location point may be set according to the actual application requirement. For example, when capturing a video image from a vehicle event data recorder, if the current requirement is to acquire images of locations where traffic accidents are likely to occur, the target location point may be such a location.
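The location-triggered embodiment can be illustrated with a minimal sketch. The specification does not say how the server judges whether a location point is a target location point; the sketch below assumes a great-circle distance test against a list of target coordinates. The names `LocationReport`, `haversine_m`, `build_capture_instruction`, the instruction field names, and the 50-metre radius are all hypothetical.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


@dataclass(frozen=True)
class LocationReport:
    """Hypothetical report uploaded by the first terminal."""
    lat: float         # latitude in degrees
    lon: float         # longitude in degrees
    timestamp_ms: int  # timestamp of the matching video image in the target video file


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def build_capture_instruction(report, target_points, radius_m=50.0):
    """Return a capture instruction if the report lies within radius_m of any
    target point (assumed matching rule); otherwise return None."""
    for lat, lon in target_points:
        if haversine_m(report.lat, report.lon, lat, lon) <= radius_m:
            # The instruction carries the timestamp reported for this location.
            return {"type": "capture", "timestamp_ms": report.timestamp_ms}
    return None
```

In this sketch the server issues an instruction only when the reported point matches a target point, and the instruction reuses the timestamp that the first terminal reported for that point.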

In addition, in the embodiment of the present specification, the implementation in practical application that the server sends the video image capture instruction for the target video file to the first terminal is not limited to the above two implementations, and in practical application, other implementations may also be included, for example, after the server receives the video image capture instruction of the third party, the server sends the video image capture instruction to the first terminal, and the like.

S303: The first terminal intercepts an encapsulated data packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the encapsulated data packet from the target video file.

In practical applications, a video file may include both audio stream information and video stream information. In this embodiment of this specification, the video image is information in video stream information, and correspondingly, in this embodiment of this specification, the intercepting, by the first terminal, an encapsulated packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the encapsulated packet from the target video file may include:

1) the first terminal extracts video stream information in the target video file;

2) the first terminal acquires the encapsulated data packet of the video image corresponding to the timestamp and the decoder configuration information corresponding to the encapsulated data packet from the video stream information.
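The two steps above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the embodiment: the `Packet` type, the `"video"`/`"audio"` stream labels, the millisecond timestamps, and the rule of choosing the last packet at or before the requested timestamp are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical encapsulated data packet read from a video file."""
    stream: str     # "video" or "audio" (assumed labels)
    pts_ms: int     # presentation timestamp in milliseconds
    payload: bytes  # encapsulated frame data


def extract_video_stream(packets):
    """Step 1: keep only the video stream packets, sorted by timestamp."""
    return sorted((p for p in packets if p.stream == "video"), key=lambda p: p.pts_ms)


def find_packet(video_packets, timestamp_ms):
    """Step 2: return the last video packet whose timestamp is <= timestamp_ms.

    `video_packets` must be sorted by pts_ms. Returns None when the requested
    timestamp precedes the first packet.
    """
    keys = [p.pts_ms for p in video_packets]
    idx = bisect_right(keys, timestamp_ms) - 1
    return video_packets[idx] if idx >= 0 else None
```

A binary search is used here only because the packets of one stream are ordered by timestamp; a linear scan would serve equally well for short files.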

In practical applications, the video stream information may include video image information in the form of video frames in a packed format (i.e., packed data packets) and decoder information corresponding to the video frames in the packed format.

In the embodiment of the present specification, the package format may include, but is not limited to, package formats such as AVI (Audio Video Interleaved), MP4(Moving Picture Experts Group 4), FLV (FLASH Video format), and the like.

In this embodiment, the decoder configuration information of a video frame may be used to create a decoder that decodes the video frame. The decoder configuration information of a given video frame may include: the decoder identifier, the decoder type (in this embodiment, a video decoder), the picture packaging format, original width and original height of the picture corresponding to that video frame, and other configuration information used in the decoder creation process.

S305: The first terminal decapsulates the encapsulated data packet to obtain a decapsulated video frame.
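As an illustration, the fields listed above could be carried in a small record that is serialized alongside the frame. The specification does not define a wire format; the `DecoderConfig` class, its field names, and the use of JSON are assumptions made for this sketch.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DecoderConfig:
    """Hypothetical container for the decoder configuration information."""
    decoder_id: str      # decoder identifier, e.g. "h264" (illustrative value)
    decoder_type: str    # "video" in this embodiment
    picture_format: str  # picture packaging format, e.g. "yuv420p" (illustrative value)
    width: int           # original width of the picture, in pixels
    height: int          # original height of the picture, in pixels


def to_json(cfg: DecoderConfig) -> str:
    """Serialize the configuration for transmission (JSON is an assumed choice)."""
    return json.dumps(asdict(cfg), sort_keys=True)


def from_json(text: str) -> DecoderConfig:
    """Rebuild the configuration on the receiving side."""
    return DecoderConfig(**json.loads(text))
```

Because the record holds only a few short fields, it adds little to the data transmitted with each decapsulated video frame.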

In this embodiment of the present specification, a video frame in a package format may be converted into video compression coded data by performing decapsulation processing on a package data packet, so as to extract video image information and obtain a decapsulated video frame.

In this embodiment of the present description, decapsulation of a video frame in an encapsulated format may be implemented with FFmpeg (Fast Forward Mpeg, a set of open source computer programs that can record and convert digital audio and video and turn them into streams). Specifically, when a video image capturing instruction corresponds to multiple timestamps, the video frames in the encapsulated format corresponding to each timestamp may be decapsulated in turn.

S307: The first terminal sends the decapsulated video frame and the decoder configuration information to the server.

In this embodiment of the present description, after obtaining the decapsulated video frame and the corresponding decoder configuration information, the first terminal may send them to the server. No decoding or similar processing is required on the first terminal side, which reduces both CPU consumption and the time taken to capture a video image; transmitting the decapsulated video frame and the decoder configuration information directly also reduces data traffic consumption during transmission.
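The embodiment relies on FFmpeg for decapsulation and does not describe the byte-level operation. As one concrete example of what decapsulation can involve, the sketch below converts an H.264 sample stored in the MP4 style (each NAL unit prefixed by a big-endian length field) into the Annex B byte-stream style (each NAL unit prefixed by a start code). That the embodiment produces Annex B output is an assumption; the function name `avcc_to_annexb` is hypothetical.

```python
START_CODE = b"\x00\x00\x00\x01"  # Annex B start code


def avcc_to_annexb(sample: bytes, length_size: int = 4) -> bytes:
    """Convert length-prefixed NAL units into start-code-delimited NAL units.

    `length_size` is the size of each length field in bytes (1, 2 or 4),
    which in an MP4 file is declared in the track's decoder configuration.
    """
    if length_size not in (1, 2, 4):
        raise ValueError("length_size must be 1, 2 or 4")
    out = bytearray()
    pos = 0
    while pos < len(sample):
        if pos + length_size > len(sample):
            raise ValueError("truncated NAL length field")
        nal_len = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if pos + nal_len > len(sample):
            raise ValueError("truncated NAL unit")
        out += START_CODE + sample[pos:pos + nal_len]
        pos += nal_len
    return bytes(out)
```

The conversion only rewrites framing bytes and copies the compressed payload unchanged, which is why decapsulation is far cheaper than decoding and re-encoding the picture.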

S309: The second terminal sends an acquisition request of the target video image to the server.

In this embodiment, when a user at the second terminal has a need to acquire a target video image, the second terminal may be triggered to send an acquisition request of the target video image to the server.

S311: The server sends the decapsulated video frame and the decoder configuration information corresponding to the target video image to the second terminal.

S313: The second terminal generates the target video image based on the decapsulated video frame and the decoder configuration information.

In this embodiment of the present specification, after acquiring the decapsulated video frame and the decoder configuration information, the second terminal may create a decoder of the target video image based on the decoder configuration information; then, decoding the decapsulated video frame by using the decoder to obtain a decoded video frame; and finally, coding the decoded video frame to obtain the target video image.

Specifically, in the process of creating a decoder of a target video image, the decoder of the target video image may be found based on the decoder identifier in the decoder configuration information, and then, the found decoder is subjected to setting of parameters such as the decoder type, the picture packaging format, the picture original data width, the picture original data height, the baseline time, and the like; accordingly, the decoder with the set parameters is the decoder of the target video image. Then, decoding the decapsulated video frame by using the decoder with the set parameters to obtain a decoded video frame; and finally, according to the picture display format of the actual requirement, coding the decoded video frame to obtain the target video image.

In this embodiment, the process of the encoding process may be determined by combining the picture format of the final target video image display in the actual application.
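The find, configure, decode and encode sequence on the second terminal can be sketched with stand-in components. Nothing below performs real video decoding: `StubDecoder`, `DECODER_REGISTRY`, `create_decoder` and `generate_target_image` are hypothetical names, and the "decoded frame" and "image" are placeholder dictionaries that only show how the configuration flows through the steps.

```python
from dataclasses import dataclass, field


@dataclass
class StubDecoder:
    """Stand-in for a real video decoder; it records parameters only."""
    decoder_id: str
    params: dict = field(default_factory=dict)

    def configure(self, **params):
        """Set decoder parameters taken from the decoder configuration."""
        self.params.update(params)

    def decode(self, frame: bytes) -> dict:
        """Placeholder decode: a real decoder would return raw pixel data."""
        return {
            "width": self.params["width"],
            "height": self.params["height"],
            "source_bytes": len(frame),
        }


# Hypothetical registry mapping decoder identifiers to decoder classes.
DECODER_REGISTRY = {"h264": StubDecoder}


def create_decoder(config: dict) -> StubDecoder:
    """Find the decoder by identifier, then set its parameters."""
    try:
        decoder_cls = DECODER_REGISTRY[config["decoder_id"]]
    except KeyError:
        raise LookupError(f"no decoder registered for {config['decoder_id']!r}")
    decoder = decoder_cls(config["decoder_id"])
    decoder.configure(
        decoder_type=config["decoder_type"],
        picture_format=config["picture_format"],
        width=config["width"],
        height=config["height"],
    )
    return decoder


def generate_target_image(frame: bytes, config: dict, display_format: str = "jpeg") -> dict:
    """Decode the decapsulated frame, then 'encode' it in the display format."""
    decoded = create_decoder(config).decode(frame)
    return {"format": display_format, **decoded}
```

The display format is a parameter of the last step because, as noted above, the encoding applied depends on the picture format required by the actual application.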

As can be seen from the technical solutions provided in the embodiments of the present specification, after receiving a video image capture instruction, the first terminal that records the video file captures the video frame of the target video image in its encapsulated format together with the decoder configuration information of that video frame, and decapsulates the captured video frame. Because the first terminal only captures the encapsulated data packet and the decoder configuration information and decapsulates the packet, without decoding or encoding, the CPU consumption on the first terminal side and the processing time for capturing the video image can be greatly reduced. This improves the overall operation performance of the system when the video image is captured by a first terminal with poor processing performance, such as a vehicle event data recorder, and reduces the influence on other applications on the terminal. The decapsulated video frame and the decoder configuration information are then sent to the server, which reduces data traffic consumption during transmission. When the second terminal needs to view the captured video image, the decapsulated video frame and the decoder configuration information can be sent directly to the second terminal, so that the second terminal can directly generate the captured video image.
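The division of work among the three parties can be summarized in a short in-memory sketch. It models only the exchange of messages in steps S301 to S311; the `CaptureServer` class, the `first_terminal_handle` function, the `read_frame` callback and the use of a file identifier together with a timestamp as the storage key are assumptions, and no networking or video processing is performed.

```python
class CaptureServer:
    """In-memory stand-in for the server, keyed by (file_id, timestamp_ms)."""

    def __init__(self):
        self._store = {}

    def build_instruction(self, file_id, timestamp_ms):
        """S301: instruction sent to the first terminal."""
        return {"file_id": file_id, "timestamp_ms": timestamp_ms}

    def on_upload(self, file_id, timestamp_ms, frame, decoder_config):
        """S307: store what the first terminal uploads, without decoding it."""
        self._store[(file_id, timestamp_ms)] = (frame, decoder_config)

    def on_acquire(self, file_id, timestamp_ms):
        """S309/S311: return the stored frame and configuration, or None."""
        return self._store.get((file_id, timestamp_ms))


def first_terminal_handle(instruction, server, read_frame):
    """S303 to S307 on the first terminal.

    `read_frame(file_id, timestamp_ms)` is a hypothetical callback that
    returns (decapsulated_frame, decoder_config) for the requested image.
    """
    file_id = instruction["file_id"]
    timestamp_ms = instruction["timestamp_ms"]
    frame, decoder_config = read_frame(file_id, timestamp_ms)
    server.on_upload(file_id, timestamp_ms, frame, decoder_config)
```

In this sketch the server never inspects the frame bytes; it stores and forwards them, leaving decoding and picture encoding to the second terminal.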

The following describes a video image capturing method according to the present application, taking a first terminal as an execution subject, and specifically as shown in fig. 4, the method may include:

S401: receiving a video image intercepting instruction of a target video file sent by a server, wherein the video image intercepting instruction comprises a timestamp of the target video image;

S403: intercepting a packaging data packet of a target video image corresponding to the timestamp and decoder configuration information corresponding to the packaging data packet from the target video file;

S405: decapsulating the encapsulated data packet to obtain a decapsulated video frame;

S407: and sending the decapsulated video frame and the decoder configuration information to the server, so that the second terminal generates the target video image based on the decapsulated video frame and the decoder configuration information on the server side.
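The four steps performed by the first terminal can be sketched as follows. This is an illustration under stated assumptions, not the container format used by the application: the toy packet layout (an 8-byte timestamp and a 4-byte payload length followed by the still-compressed frame), the dictionary-based video file, and the names `encapsulate`, `decapsulate` and `handle_capture_instruction` are all hypothetical.

```python
import struct

# Hypothetical toy container: an 8-byte timestamp and a 4-byte payload length,
# followed by the payload (the still-compressed video frame).
_HEADER = struct.Struct(">QI")


def encapsulate(timestamp: int, frame: bytes) -> bytes:
    """Build one encapsulated data packet (used here only to make test data)."""
    return _HEADER.pack(timestamp, len(frame)) + frame


def decapsulate(packet: bytes) -> tuple[int, bytes]:
    """S405: strip the container header and return (timestamp, frame)."""
    timestamp, length = _HEADER.unpack_from(packet)
    frame = packet[_HEADER.size:_HEADER.size + length]
    if len(frame) != length:
        raise ValueError("truncated packet")
    return timestamp, frame


def handle_capture_instruction(instruction: dict, video_file: dict, send) -> None:
    """S401 to S407 on the first terminal; no decoding is performed here."""
    wanted = instruction["timestamp"]                          # S401
    packet = next((p for p in video_file["packets"]            # S403
                   if _HEADER.unpack_from(p)[0] == wanted), None)
    if packet is None:
        raise LookupError(f"no packet with timestamp {wanted}")
    _, frame = decapsulate(packet)                             # S405
    send({                                                     # S407
        "timestamp": wanted,
        "frame": frame,
        "decoder_config": video_file["decoder_config"],
    })
```

The `send` argument is any callable that delivers the message to the server; passing it in keeps the sketch independent of a particular transport.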

In some embodiments, the method may further comprise:

sending a position point where a local terminal is located and a timestamp of a video image corresponding to the position point in the target video file to the server;

correspondingly, receiving the video image intercepting instruction of the target video file sent by the server includes:

receiving a video image intercepting instruction of a target video file sent when the server judges that the position point is the target position point;

and the time stamp of the target video image is the time stamp corresponding to the position point in the target video file.
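The position-triggered variant can be sketched from the server's point of view. The application does not state how the server judges whether a reported position point is the target position point; the sketch below assumes, purely for illustration, a great-circle distance threshold. The names `haversine_m` and `maybe_build_instruction`, the report fields and the 50-metre default radius are hypothetical.

```python
import math
from typing import Optional


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def maybe_build_instruction(report: dict, target: tuple,
                            radius_m: float = 50.0) -> Optional[dict]:
    """Return an intercepting instruction if the reported point is the target.

    `report` carries the position point of the first terminal and the
    timestamp of the video image recorded at that point.
    """
    dist = haversine_m(report["lat"], report["lon"], target[0], target[1])
    if dist <= radius_m:
        # The timestamp in the instruction is the one reported for this point.
        return {"type": "capture", "timestamp": report["timestamp"]}
    return None
```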

In some embodiments, the intercepting, from the target video file, an encapsulated packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the encapsulated packet includes:

extracting video stream information in the target video file;

and acquiring the encapsulation data packet of the video image corresponding to the timestamp and the decoder configuration information corresponding to the encapsulation data packet from the video stream information.
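The two sub-steps above can be sketched as a lookup over extracted stream information. The `StreamInfo` structure and the `find_packet` function are hypothetical, and the rule of selecting the last packet whose timestamp does not exceed the requested one is an assumption made for this sketch; the application only states that the packet corresponding to the timestamp is acquired.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamInfo:
    """Hypothetical video stream information extracted from the video file."""
    decoder_config: dict   # decoder configuration information of the stream
    timestamps: list       # packet timestamps, sorted in ascending order
    packets: list          # encapsulated data packets, parallel to timestamps


def find_packet(stream: StreamInfo, timestamp: int):
    """Return (encapsulated packet, decoder configuration) for a timestamp.

    Assumption: the packet chosen is the last one whose timestamp is less
    than or equal to the requested timestamp.
    """
    i = bisect_right(stream.timestamps, timestamp) - 1
    if i < 0:
        raise LookupError("timestamp precedes the first packet")
    return stream.packets[i], stream.decoder_config
```

Because the timestamps are kept sorted, the binary search avoids scanning every packet, which suits a terminal with limited processing performance.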

An embodiment of the present application further provides an apparatus for capturing a video image, as shown in fig. 5, the apparatus includes:

a video image capture instruction receiving module 510, configured to receive a video image capture instruction for a target video file sent by a server, where the video image capture instruction includes a timestamp of the target video image;

a data intercepting module 520, configured to intercept, from the target video file, an encapsulated data packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the encapsulated data packet;

a decapsulation processing module 530, configured to decapsulate the encapsulated data packet to obtain a decapsulated video frame;

the first data sending module 540 may be configured to send the decapsulated video frame and the decoder configuration information to the server, so that the second terminal generates the target video image based on the decapsulated video frame and the decoder configuration information on the server side.

In some embodiments, the apparatus may further comprise:

the position point sending module is used for sending the position point where the local terminal is located and the timestamp of the video image corresponding to the position point in the target video file to the server;

correspondingly, the video image capture instruction receiving module 510 may be specifically configured to: receiving a video image intercepting instruction of a target video file sent when the server judges that the position point is the target position point;

In some embodiments, the data interception module 520 may include:

the video stream information extraction unit is used for extracting the video stream information in the target video file;

and the data acquisition unit is used for acquiring the packaging data packet of the video image corresponding to the timestamp and the decoder configuration information corresponding to the packaging data packet from the video stream information.

The apparatus embodiment and the method embodiment are based on the same application concept.

The embodiment of the application provides a video image intercepting terminal, which comprises a processor and a memory, wherein at least one instruction, at least one program, a code set or an instruction set is stored in the memory, and the at least one instruction, the at least one program, the code set or the instruction set is loaded and executed by the processor to implement the video image intercepting method provided by the above method embodiment.

A method for capturing a video image according to the present application is described below with a server as an execution subject, and specifically as shown in fig. 6, the method may include:

S601: sending a video image intercepting instruction of a target video file to a first terminal, wherein the video image intercepting instruction comprises a timestamp of the target video image, so that the first terminal intercepts a packaged data packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the packaged data packet from the target video file, and decapsulates the packaged data packet to obtain a decapsulated video frame;

S603: receiving the decapsulated video frame and the decoder configuration information sent by the first terminal;

S605: receiving an acquisition request of a target video image sent by a second terminal;

S607: and sending the decapsulated video frame and the decoder configuration information corresponding to the target video image to a second terminal, so that the second terminal generates the target video image based on the decapsulated video frame and the decoder configuration information.
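The server-side steps can be sketched as a simple relay that stores what the first terminal uploads and returns it on request. The class `CaptureRelay`, its method names, the message fields, and the choice of the timestamp as the storage key are hypothetical; the application does not specify how the server indexes stored frames.

```python
class CaptureRelay:
    """Minimal in-memory sketch of the server role; it never decodes frames."""

    def __init__(self) -> None:
        # Assumption: stored frames are keyed by the target image timestamp.
        self._store: dict = {}

    def build_instruction(self, timestamp: int) -> dict:
        """S601: the intercepting instruction sent to the first terminal."""
        return {"type": "capture", "timestamp": timestamp}

    def on_terminal_upload(self, message: dict) -> None:
        """S603: keep the decapsulated frame and decoder configuration."""
        self._store[message["timestamp"]] = (
            message["frame"],
            message["decoder_config"],
        )

    def on_image_request(self, timestamp: int) -> dict:
        """S605 and S607: answer the acquisition request of the second terminal."""
        frame, cfg = self._store[timestamp]  # raises KeyError if nothing stored
        return {"timestamp": timestamp, "frame": frame, "decoder_config": cfg}
```

The server in this sketch forwards the frame unchanged, so the decoding cost falls on the second terminal rather than on the server or the first terminal.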

An embodiment of the present application further provides an apparatus for capturing a video image, as shown in fig. 7, the apparatus includes:

a video image capture instruction sending module 710, configured to send a video image capture instruction for a target video file to a first terminal, where the video image capture instruction includes a timestamp of the target video image, so that the first terminal captures, from the target video file, an encapsulated data packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the encapsulated data packet, and decapsulates the encapsulated data packet to obtain a decapsulated video frame;

a first data receiving module 720, configured to receive the decapsulated video frame and the decoder configuration information sent by the first terminal;

an obtaining request receiving module 730, configured to receive an obtaining request of a target video image sent by a second terminal;

the second data sending module 740 may be configured to send the decapsulated video frame and the decoder configuration information corresponding to the target video image to a second terminal, so that the second terminal generates the target video image based on the decapsulated video frame and the decoder configuration information.

The embodiment of the present application provides an intercepting server of a video image, which includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or a set of instructions, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by the processor to implement the intercepting method of a video image provided by the above method embodiment.

The following describes a video image capturing method according to the present application, taking a second terminal as an execution subject, and specifically as shown in fig. 8, the method may include:

S801: sending an acquisition request of a target video image to a server;

S803: receiving the de-encapsulated video frame and decoder configuration information corresponding to the target video image sent by the server;

S805: generating the target video image based on the decapsulated video frame and the decoder configuration information;

In some embodiments, said generating said target video image based on said decapsulated video frame and said decoder configuration information comprises:

creating a decoder of the target video image based on the decoder configuration information;

decoding the decapsulated video frame by using the decoder to obtain a decoded video frame;

and encoding the decoded video frame to obtain the target video image.
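The decode-then-encode sequence on the second terminal can be sketched as follows. To keep the example self-contained, `zlib` is used as a stand-in for the video codec, and the binary PPM (P6) format is used as the picture display format; both choices, together with the names `decode_frame`, `encode_ppm` and `generate_target_image`, are assumptions made for illustration and are not taken from the application.

```python
import zlib


def decode_frame(frame: bytes, width: int, height: int) -> bytes:
    """Stand-in decoder: zlib plays the role of the video codec here.

    Returns raw 8-bit RGB pixel data, three bytes per pixel.
    """
    raw = zlib.decompress(frame)
    if len(raw) != width * height * 3:
        raise ValueError("decoded size does not match width x height RGB data")
    return raw


def encode_ppm(raw_rgb: bytes, width: int, height: int) -> bytes:
    """Encode raw 8-bit RGB pixels as a binary PPM (P6) image."""
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + raw_rgb


def generate_target_image(frame: bytes, cfg: dict) -> bytes:
    """Decode the decapsulated frame, then encode it in the display format."""
    raw = decode_frame(frame, cfg["width"], cfg["height"])
    return encode_ppm(raw, cfg["width"], cfg["height"])
```

Swapping `encode_ppm` for another encoder would change the output picture format without touching the decoding step, which reflects the statement that the encoding is chosen according to the picture display format actually required.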

An embodiment of the present application further provides an apparatus for capturing a video image, as shown in fig. 9, the apparatus includes:

an obtaining request sending module 910, configured to send an obtaining request of a target video image to a server;

a second data receiving module 920, configured to receive the decapsulated video frame and decoder configuration information corresponding to the target video image sent by the server;

a target video image generation module 930, operable to generate the target video image based on the decapsulated video frame and the decoder configuration information;

In the embodiments of the present disclosure, the memory may be used to store software programs and modules, and the processor executes various functional applications and data processing by operating the software programs and modules stored in the memory. The memory can mainly comprise a program storage area and a data storage area, wherein the program storage area can store an operating system, application programs needed by functions and the like; the storage data area may store data created according to use of the apparatus, and the like. Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory may also include a memory controller to provide the processor access to the memory.

In another aspect, the present application further provides a video image capturing system, the system including a server, a first terminal and a second terminal;

The method provided by the embodiment of the application can be executed in a client (a mobile terminal, a computer terminal), a server or a similar operation device. Taking the operation on the client as an example, fig. 10 is a schematic structural diagram of a client provided in the embodiment of the present application, and as shown in fig. 10, the client may be used to implement the information interaction method provided in the foregoing embodiment. Specifically, the method comprises the following steps:

the client may include components such as RF (Radio Frequency) circuitry 1010, memory 1020 including one or more computer-readable storage media, input unit 1030, display unit 1040, sensors 1050, audio circuitry 1060, WiFi (wireless fidelity) module 1070, processor 1080 including one or more processing cores, and power source 1090. Those skilled in the art will appreciate that the client architecture shown in fig. 10 does not constitute a limitation on the client, and may include more or fewer components than shown, or some components in combination, or a different arrangement of components. Wherein:

RF circuit 1010 may be used for receiving and transmitting signals during a message transmission or communication process, and in particular, for receiving downlink information from a base station and then processing the received downlink information by one or more processors 1080; in addition, data relating to uplink is transmitted to the base station. In general, RF circuitry 1010 includes, but is not limited to, an antenna, at least one Amplifier, a tuner, one or more oscillators, a Subscriber Identity Module (SIM) card, a transceiver, a coupler, an LNA (Low Noise Amplifier), a duplexer, and the like. In addition, the RF circuitry 1010 may also communicate with networks and other clients via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile communications), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), email, SMS (Short Messaging Service), and the like.

The memory 1020 may be used to store software programs and modules, and the processor 1080 executes various functional applications and data processing by operating the software programs and modules stored in the memory 1020. The memory 1020 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, application programs required for functions, and the like; the storage data area may store data created according to the use of the client, and the like. Further, the memory 1020 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, memory 1020 may also include a memory controller to provide access to memory 1020 by processor 1080 and input unit 1030.

The input unit 1030 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. In particular, input unit 1030 may include touch-sensitive surface 1031, as well as other input devices 1032. The touch-sensitive surface 1031, also referred to as a touch display screen or a touch pad, may collect touch operations by a user (such as operations by a user on or near the touch-sensitive surface 1031 using any suitable object or attachment, such as a finger, a stylus, etc.) on or near the touch-sensitive surface 1031 and drive the corresponding connection device according to a preset program. Optionally, the touch sensitive surface 1031 may comprise two parts, a touch detection means and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 1080, and can receive and execute commands sent by the processor 1080. In addition, the touch-sensitive surface 1031 may be implemented using various types of resistive, capacitive, infrared, and surface acoustic waves. The input unit 1030 may also include other input devices 1032 in addition to the touch-sensitive surface 1031. In particular, other input devices 1032 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a track ball, a mouse, a joystick, or the like.

The display unit 1040 may be used to display information input by or provided to a user and various graphical user interfaces of the client, which may be made up of graphics, text, icons, video, and any combination thereof. The Display unit 1040 may include a Display panel 1041, and optionally, the Display panel 1041 may be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like. Further, the touch-sensitive surface 1031 may overlay the display panel 1041, and when a touch operation is detected on or near the touch-sensitive surface 1031, the touch operation is transmitted to the processor 1080 for determining the type of the touch event, and the processor 1080 then provides a corresponding visual output on the display panel 1041 according to the type of the touch event. Touch-sensitive surface 1031 and display panel 1041 may be implemented as two separate components for input and output functions, although in some embodiments touch-sensitive surface 1031 may be integrated with display panel 1041 for input and output functions.

The client may also include at least one sensor 1050, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 1041 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 1041 and/or the backlight when the client moves to the ear. As one of the motion sensors, the gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when the device is stationary, and can be used for applications (such as horizontal and vertical screen switching, related games, magnetometer attitude calibration) for identifying client gestures, and related functions (such as pedometer and tapping) for vibration identification; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which may be further configured at the client, detailed description is omitted here.

Audio circuitry 1060, speaker 1061, and microphone 1062 may provide an audio interface between a user and the client. The audio circuit 1060 can transmit the electrical signal converted from the received audio data to the speaker 1061, where it is converted into a sound signal and output; conversely, the microphone 1062 converts the collected sound signal into an electrical signal, which is received by the audio circuit 1060 and converted into audio data. The audio data is then output to the processor 1080 for processing and subsequently sent to, for example, another client via the RF circuit 1010, or output to the memory 1020 for further processing. The audio circuit 1060 may also include an earbud jack to provide communication between peripheral headphones and the client.

WiFi belongs to short-range wireless transmission technology, and the client can help the user send and receive e-mails, browse web pages, access streaming media, etc. through the WiFi module 1070, which provides the user with wireless broadband internet access. Although fig. 10 shows the WiFi module 1070, it is understood that it does not belong to the essential constitution of the client and can be omitted entirely as needed within the scope not changing the essence of the invention.

The processor 1080 is a control center of the client, connects various parts of the entire client by using various interfaces and lines, and performs various functions of the client and processes data by running or executing software programs and/or modules stored in the memory 1020 and calling data stored in the memory 1020, thereby performing overall monitoring of the client. Optionally, processor 1080 may include one or more processing cores; preferably, the processor 1080 may integrate an application processor, which handles primarily the operating system, user interfaces, applications, etc., and a modem processor, which handles primarily the wireless communications. It is to be appreciated that the modem processor described above may not be integrated into processor 1080.

The client also includes a power source 1090 (e.g., a battery) for powering the various components, which may preferably be logically coupled to the processor 1080 via a power management system to manage charging, discharging, and power consumption management functions via the power management system. Power supply 1090 may also include any component including one or more DC or AC power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.

Although not shown, the client may further include a camera, a bluetooth module, and the like, which are not described herein again. Specifically, in this embodiment, the display unit of the client is a touch screen display, the client further includes a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the one or more processors according to the instructions of the method embodiments of the present invention.

Embodiments of the present application further provide a storage medium, which may be disposed in a server to store at least one instruction, at least one program, a code set, or a set of instructions related to implementing a video image capturing method in the method embodiments, where the at least one instruction, the at least one program, the code set, or the set of instructions are loaded and executed by the processor to implement the video image capturing method provided in the method embodiments.

Alternatively, in this embodiment, the storage medium may be located in at least one network server of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.

As can be seen from the embodiments of the method, the apparatus, the terminal, the system, the server or the storage medium for intercepting a video image provided by the present application, in the process of intercepting a video image, after receiving a video image intercepting instruction, the first terminal on which the video file is recorded intercepts the encapsulated data packet of the target video image and the decoder configuration information corresponding to that packet, and decapsulates the intercepted packet. Because the first terminal only intercepts the encapsulated data packet and the decoder configuration information and performs the decapsulation, the CPU consumption on the first terminal side and the processing time for intercepting the video image can be greatly reduced; this improves the overall operating performance of the system when the video image is intercepted on a first terminal with poor processing performance, such as a vehicle event data recorder, and reduces the impact on other applications on the terminal. The decapsulated video frame and the decoder configuration information are then sent to the server, which can reduce the data traffic consumed during transmission. When the second terminal needs to view the intercepted video image, the decapsulated video frame and the decoder configuration information can be sent directly to the second terminal, so that the second terminal can generate the intercepted video image itself.

It should be noted that the order of the embodiments of the present application is for description only and does not indicate that any embodiment is better or worse than another. Specific embodiments of the present specification have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the device, terminal, system, server or storage medium embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and in relation to the description, reference may be made to some portions of the description of the method embodiments.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware to implement the above embodiments, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.

The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims

1. A method for intercepting video images, the method comprising:

sending the decapsulated video frame and the decoder configuration information to the server, so that a second terminal sends an acquisition request of the target video image to the server, and receives the decapsulated video frame and the decoder configuration information sent by the server; and generating the target video image based on the decapsulated video frame and the decoder configuration information.

2. The method of claim 1, further comprising:

3. The method of claim 1, wherein the intercepting, from the target video file, an encapsulated packet of the target video image corresponding to the timestamp and decoder configuration information corresponding to the encapsulated packet comprises:

extracting video stream information in the target video file;

4. A method for intercepting video images, the method comprising:

5. The method of claim 4, further comprising:

receiving a position point where the first terminal is located and sent by the first terminal, and a timestamp of a video image corresponding to the position point in the target video file;

judging whether the position point is a target position point or not;

when the judgment result is yes, executing the video image intercepting instruction for sending the target video file to the first terminal;

6. A method for intercepting video images, the method comprising:

sending an acquisition request of a target video image to a server;

7. The method of claim 6, wherein the generating the target video image based on the decapsulated video frame and the decoder configuration information comprises:

decoding the decapsulated video frame by using the decoder to obtain a decoded video frame;

and encoding the decoded video frame to obtain the target video image.

8. The method of claim 7, wherein creating the decoder of the target video image based on the decoder configuration information comprises:

determining a decoder corresponding to a decoder identification in the decoder configuration information;

setting parameters of the corresponding decoder, wherein the set parameters comprise the type of the decoder, the packaging format of the picture, the width of the original data of the picture, the height of the original data of the picture and the baseline time;

and taking the decoder with the set parameters as the decoder of the target video image.

9. An apparatus for intercepting video images, the apparatus comprising:

a first data sending module, configured to send the decapsulated video frame and the decoder configuration information to the server, so that a second terminal sends an acquisition request of the target video image to the server, and receives the decapsulated video frame and the decoder configuration information sent by the server; generating the target video image based on the decapsulated video frame and the decoder configuration information.

10. An apparatus for intercepting video images, the apparatus comprising:

11. An apparatus for intercepting video images, the apparatus comprising:

12. A system for intercepting video images, the system comprising a server, a first terminal and a second terminal;

13. A computer readable storage medium, in which at least one instruction or at least one program is stored, the at least one instruction or the at least one program being loaded and executed by a processor to implement the method for intercepting a video image according to any one of claims 1 to 8.