CN111770380A - Video processing method and device - Google Patents

Video processing method and device

Info

Publication number
CN111770380A
CN111770380A (application CN202010048434.4A)
Authority
CN
China
Prior art keywords
video
graphic
images
user
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010048434.4A
Other languages
Chinese (zh)
Inventor
冯伟平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN202010048434.4A priority Critical patent/CN111770380A/en
Publication of CN111770380A publication Critical patent/CN111770380A/en
Pending legal-status Critical Current

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N21/47: End-user applications
    • H04N21/472: End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205: End-user interface for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses a video processing method and device, and relates to the field of computer technology. One embodiment of the method comprises: acquiring a video to be processed; parsing the video to be processed into the one or more frames of images that form it; analyzing the one or more frames of images to determine one or more graphic regions in the images; setting an operation for the graphic region and determining a response result corresponding to the operation; and inserting the operation on the graphic region into the video, so that a user can operate on a graphic region appearing in the video while watching it. This embodiment realizes interaction between the user and the video through the user's operations on the video during viewing.

Description

Video processing method and device
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a video processing method and apparatus.
Background
With the development of internet technology, video streaming has been widely used in fields such as entertainment and multimedia, owing to its advantages of playing while downloading and its vivid presentation.
However, at present a user can only perform simple operations on a video while watching it, such as pausing and resuming playback. Interaction between the user and the video cannot be realized: the user cannot operate on the video itself or on the graphic elements within it.
Disclosure of Invention
In view of this, the present invention provides a video processing method and apparatus that not only support operations such as pausing and resuming playback, but also display a corresponding response result according to a user's operation on a graphic region in the video, thereby realizing interaction between the user and the video.
To achieve the above object, according to a first aspect of the present invention, there is provided a video processing method comprising:
acquiring a video to be processed;
analyzing the video to be processed into one or more frames of images forming the video to be processed;
analyzing the one or more frames of images to determine one or more graphics regions in the images;
setting an operation for the graphic region and determining a response result corresponding to the operation;
inserting the operation on the graphic region into the video so that a user can operate on the graphic region appearing in the video while watching the video.
Optionally, the determining one or more graphic regions in the images comprises:
delineating one or more of the graphic regions in the images using a preset shape or line trajectory.
Optionally, the determining one or more graphic regions in the images comprises:
performing image recognition on the one or more frames of images using a machine learning algorithm according to a specified object, so as to determine one or more graphic regions in the images, wherein the graphic regions indicate one or more objects in the images.
Optionally, the operations include one or more of: single click, double click, long press, and slide;
the response result includes one or more of: enlarging the graphic region, reducing the graphic region, copying the graphic region, displaying a popup box corresponding to the graphic region, and displaying a link corresponding to the graphic region.
Optionally, the method further comprises:
before analyzing the one or more frames of images to determine one or more graphic regions in the images, deleting one or more frames from the one or more frames of images, or adding one or more preset frames of images to them.
Optionally, the method further comprises:
after analyzing the one or more frames of images to determine one or more graphic regions in the images or the preset images, performing one or more of the following processes on the graphic regions: deleting the graphic region, copying the graphic region, combining one or more of the graphic regions, replacing the graphic region with a preset picture or page element, moving the graphic region, enlarging the graphic region, and reducing the graphic region.
Optionally, the method further comprises:
in the case that a user watches the video and operates on a graphic region appearing in it, displaying to the user a response result corresponding to the operation, according to the user's operation and the graphic region operated on.
Optionally, the displaying to the user of a response result corresponding to the operation, according to the user's operation and the graphic region operated on, includes:
sending the operation and the graphic area to a server;
receiving the response result returned by the server according to the operation and the graphic area;
and displaying the response result to the user.
Optionally, a Canvas is used to parse the video to be processed into one or more frames of images constituting the video to be processed.
To achieve the above object, according to a second aspect of the present invention, there is provided a video processing apparatus comprising: a video acquisition module, a video parsing module, a graphic region acquisition module, a response result setting module, and a video processing module; wherein:
the video acquisition module is used for acquiring a video to be processed;
the video parsing module is used for parsing the video to be processed into one or more frames of images forming it;
the graphic region acquisition module is used for analyzing the one or more frames of images to determine one or more graphic regions in the images;
the response result setting module is used for setting the operations performed on the graphic region and the response result corresponding to each operation;
the video processing module is used for inserting the operations performed on the graphic region into the video, so that a user can operate on a graphic region appearing in the video while watching it.
Optionally, the determining one or more graphic regions in the images comprises:
delineating one or more of the graphic regions in the images using a preset shape or line trajectory.
Optionally, the determining one or more graphic regions in the images comprises:
performing image recognition on the one or more frames of images using a machine learning algorithm according to a specified object, so as to determine one or more graphic regions in the images, wherein the graphic regions indicate one or more objects in the images.
Optionally, the operations include one or more of: single click, double click, long press, and slide;
the response result includes one or more of: enlarging the graphic region, reducing the graphic region, copying the graphic region, displaying a popup box corresponding to the graphic region, and displaying a link corresponding to the graphic region.
Optionally, the graphics area obtaining module is further configured to,
before analyzing the one or more frames of images to determine one or more graphic regions in the images, deleting one or more frames from the one or more frames of images, or adding one or more preset frames of images to them.
Optionally, the graphics area obtaining module is further configured to,
after determining one or more graphic regions in the images or the preset images, performing one or more of the following processes on the graphic regions: deleting the graphic region, copying the graphic region, combining one or more of the graphic regions, replacing the graphic region with a preset picture or page element, moving the graphic region, enlarging the graphic region, and reducing the graphic region.
Optionally, the video processing module is further configured to,
in the case that a user watches the video and operates on a graphic region appearing in it, displaying to the user a response result corresponding to the operation, according to the user's operation and the graphic region operated on.
Optionally, the displaying to the user of a response result corresponding to the operation, according to the user's operation and the graphic region operated on, includes:
sending the operation and the graphic area to a server;
receiving the response result returned by the server according to the operation and the graphic area;
and displaying the response result to the user.
Optionally, the video parsing module is configured to,
parsing the video to be processed into one or more frames of images forming it by using a Canvas.
To achieve the above object, according to a third aspect of the present invention, there is provided a video interaction method, including:
playing a video, wherein the video comprises one or more graphic areas operable by a user when watching the video;
receiving the user's operation on the one or more graphic regions, so as to show the user the response result corresponding to the operation and the graphic region.
Optionally,
the one or more graphic regions operable by a user watching the video are indicated by one or more of: distinctively displaying the outline of the graphic region, and prompting the graphic region with text or animation.
Optionally,
the operations include one or more of: single click, double click, long press, and slide;
the response result includes one or more of: enlarging the graphic region, reducing the graphic region, copying the graphic region, displaying a popup box corresponding to the graphic region, and displaying a link corresponding to the graphic region.
Optionally, the method further comprises:
processing the response result in the case that the video being watched shows the user the response result corresponding to the operation and the graphic region.
Optionally, the method further comprises:
pausing playback of the video to show the user the response result corresponding to the operation and the graphic region; or showing the user the response result corresponding to the operation and the graphic region while the video continues playing.
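As an illustrative sketch only (not prescribed by the patent), the two display modes in this optional step could look like the following in JavaScript; `video` stands in for an HTMLVideoElement, and `render` is a hypothetical caller-supplied function that overlays the response result:

```javascript
// Show a response result either by pausing playback first, or by overlaying
// it on the still-playing video. `render` displays the response result,
// e.g. a popup box, a link, or an enlarged graphic region.
function showResponse(video, result, render, { pausePlayback = true } = {}) {
  if (pausePlayback && !video.paused) {
    video.pause(); // freeze the frame under the response result
  }
  render(result);
  return { paused: pausePlayback, result };
}
```

Passing `{ pausePlayback: false }` corresponds to the second mode, where the response result is shown while the video keeps playing.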
To achieve the above object, according to a fourth aspect of the present invention, there is provided an electronic device for video processing, comprising: one or more processors; and storage means for storing one or more programs which, when executed by the one or more processors, implement any one of the video processing methods described above.
To achieve the above object, according to a fifth aspect of the present invention, there is provided a computer readable medium having stored thereon a computer program which, when executed by a processor, implements any one of the video processing methods described above.
The invention has the following advantages or beneficial effects: the video is converted into one or more frames of images using a Canvas or a similar tool; one or more graphic regions in the images are identified by image recognition or delineated using a preset shape or line-segment trajectory; and the operations a viewer may perform on the graphic regions, together with the corresponding response results, are set. In this way a viewer can operate on the graphic regions in the video, and interaction between the user and the video is realized by displaying the response results to the user through the video.
Further effects of the above optional embodiments will be described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
fig. 1 is a schematic diagram of a main flow of a video processing method according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a main flow of another video processing method according to an embodiment of the present invention;
fig. 3 is a schematic diagram of main blocks of a video processing apparatus according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a main flow of a video interaction method according to an embodiment of the present invention;
FIG. 5a is a diagram of a frame of an image according to an embodiment of the present invention;
FIG. 5b is a diagram of a graphics region of a frame of image according to an embodiment of the invention;
FIG. 5c is a schematic illustration of a distinctive display of a graphic region according to an embodiment of the invention;
FIG. 5d is a schematic illustration of a further differentiated display of graphical regions according to an embodiment of the invention;
FIG. 5e is a schematic illustration of yet another distinctive display of a graphic region according to an embodiment of the present invention;
FIG. 5f is a graphical illustration of response results according to an embodiment of the invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
fig. 7 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of a main flow of a video processing method according to an embodiment of the present invention, and as shown in fig. 1, the video processing method may specifically include the following steps:
and step S101, acquiring a video to be processed.
The video to be processed is any video available for a user to watch, including but not limited to entertainment videos such as game videos and advertisement videos. On this basis, the video can be parsed to obtain its attribute information, including video size, video width, video length, video format, video duration, video playing time, and the like, for use in subsequent analysis.
Step S102, parsing the video to be processed into one or more frames of images forming it. Tools that may be used to parse a video into one or more frames of images include, but are not limited to, OpenCV, Photoshop, Canvas, and the like. Canvas here refers to the HTML5 canvas element used for drawing graphics on web pages: a rectangular area, typically drawn on with JavaScript, that provides methods for drawing paths, rectangles, circles, and text, and for adding images.
In a preferred embodiment, a Canvas is used to parse the video to be processed into the one or more frames of images that make it up. On this basis, the one or more frames of images are likewise analyzed using the Canvas to determine one or more graphic regions in the images.
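A minimal browser-side sketch of this Canvas-based parsing (not from the patent text): `extractFrames` assumes an HTML5 `<video>` element, and `frameTimestamps` is a hypothetical helper that chooses the capture times.

```javascript
// Compute evenly spaced capture timestamps for a video (pure helper).
function frameTimestamps(duration, fps) {
  const times = [];
  for (let t = 0; t < duration; t += 1 / fps) {
    times.push(Number(t.toFixed(3)));
  }
  return times;
}

// Browser-only sketch: seek the <video> element to each timestamp and draw
// the current frame onto a Canvas, collecting one ImageData per frame.
async function extractFrames(video, fps) {
  const canvas = document.createElement('canvas');
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext('2d');
  const frames = [];
  for (const t of frameTimestamps(video.duration, fps)) {
    video.currentTime = t;
    await new Promise((resolve) =>
      video.addEventListener('seeked', resolve, { once: true })
    );
    ctx.drawImage(video, 0, 0); // rasterize the current frame
    frames.push(ctx.getImageData(0, 0, canvas.width, canvas.height));
  }
  return frames;
}
```

Seeking frame by frame is slow for long videos; a real implementation would likely sample at a reduced rate rather than every frame.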
Step S103, analyzing the one or more frames of images to determine one or more graphic regions in the images. It should be noted that a graphic region may be an element contained in the image, such as a person, a scene, or an object, or may be a region of the image that is custom-defined according to actual needs and contains one or more such elements.
In an alternative embodiment, before the one or more frames of images are analyzed to determine one or more graphic regions, one or more frames are deleted from them or one or more preset images are added to them. It can be understood that, to reduce the amount of image analysis, a subset of the frames can be selected (or some frames deleted) before analysis, according to actual requirements; meanwhile, to process the video more flexibly or customize it, preset images can be added to the existing frames before analysis.
In an alternative embodiment, the determining one or more graphic regions in the images comprises: delineating one or more of the graphic regions in the images using a preset shape or line trajectory. The preset shapes include but are not limited to geometric figures such as circles, rectangles, triangles, and polygons; a line trajectory is a line that can be drawn freely to select a graphic region in an image. Taking a frame containing a mobile phone as an example, a region containing the phone can be delineated by any geometric figure such as a circle, rectangle, triangle, or polygon; that delineated region is the graphic region. Alternatively, the region can be traced along the outline of the phone with a line-segment trajectory, and the enclosed region containing the phone is the graphic region. Specifically, still taking the example of analyzing the one or more frames of images with a Canvas, one or more graphic regions in the images can be delineated through custom drawing-tool APIs corresponding to the preset shapes.
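Once a region is delineated, the player must decide whether a user's pointer event lands inside it. A minimal hit-testing sketch (names and data shapes are illustrative; the patent does not prescribe an implementation), covering both preset rectangles and closed line-trajectory polygons:

```javascript
// A graphic region delineated by a preset rectangle {x, y, w, h}.
function pointInRect(p, rect) {
  return p.x >= rect.x && p.x <= rect.x + rect.w &&
         p.y >= rect.y && p.y <= rect.y + rect.h;
}

// Ray-casting test: does the point fall inside the closed polygon traced by
// a line trajectory, given as an array of {x, y} vertices?
function pointInPolygon(p, poly) {
  let inside = false;
  for (let i = 0, j = poly.length - 1; i < poly.length; j = i++) {
    const a = poly[i], b = poly[j];
    const crosses =
      (a.y > p.y) !== (b.y > p.y) &&
      p.x < ((b.x - a.x) * (p.y - a.y)) / (b.y - a.y) + a.x;
    if (crosses) inside = !inside; // each crossing flips inside/outside
  }
  return inside;
}
```

A circle or other preset shape would get its own containment test; the ray-casting version handles arbitrary free-form trajectories.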
In an alternative embodiment, the determining one or more graphic regions in the images comprises: performing image recognition on the one or more frames of images using a machine learning algorithm according to a specified object, so as to determine one or more graphic regions in the images, wherein the graphic regions indicate one or more objects in the images. The specified object can be any object set according to actual requirements, including but not limited to: mobile phones, bags, cars, cats, dogs, mountains, grass, flowers, and so on. That is, in addition to delineating a graphic region by a preset shape or line, one or more objects in the image can be automatically identified by an artificial intelligence algorithm to determine the graphic region containing each object. Specifically, still taking the example of analyzing the one or more frames of images with a Canvas, the drawing context can be obtained via the Canvas getContext method, the pixel data can then be read from the CanvasRenderingContext2D, the image can be sent to a server capable of image recognition for analysis, and the graphic-region information returned by the server, such as position, size, and outline, can be received.
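A sketch of that round trip, with heavy hedging: the `/recognize` endpoint, the raw-bytes transport, and the response fields (`x`, `y`, `width`, `height`, `label`) are all assumptions for illustration, not part of the patent.

```javascript
// Browser-only sketch: read pixel data from the Canvas rendering context and
// post it to a hypothetical recognition endpoint, which is assumed to return
// an array of region descriptors ({x, y, width, height, label}).
async function recognizeRegions(canvas, endpoint) {
  const ctx = canvas.getContext('2d'); // CanvasRenderingContext2D
  const { width, height } = canvas;
  const pixels = ctx.getImageData(0, 0, width, height);
  const response = await fetch(endpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/octet-stream' },
    body: pixels.data, // raw RGBA bytes
  });
  return normalizeRegions(await response.json());
}

// Pure helper: coerce the assumed server response into region objects with a
// bounding box, tolerating missing labels.
function normalizeRegions(raw) {
  return raw.map((r) => ({
    x: r.x, y: r.y, width: r.width, height: r.height,
    label: r.label || 'unknown',
  }));
}
```

In practice the image would more likely be sent as a compressed encoding (e.g. via `canvas.toBlob`) than as raw RGBA data.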
In an alternative embodiment, after the one or more frames of images are analyzed to determine one or more graphic regions in the images or the preset images, the graphic regions are subjected to one or more of the following processes: deleting the graphic region, copying the graphic region, combining one or more of the graphic regions, replacing the graphic region with a preset picture or page element, moving the graphic region, enlarging the graphic region, and reducing the graphic region. That is, not only can frames be added to or deleted from the image sequence, but the graphic regions contained in an image can also be deleted, replaced, and so on, so that the image contains the processed graphic region. For example, a graphic region containing flowers or grass in the image can be replaced with a graphic region containing a mobile phone or a bag.
In addition, it can be understood that animation effects such as fading out, fading in, and color inversion can be realized by changing the size, color, and other properties of the one or more frames of images.
Step S104, setting an operation for the graphic region and determining a response result corresponding to the operation.
The operations include one or more of: single click, double click, long press, and slide; the response result includes one or more of: enlarging the graphic region, reducing the graphic region, copying the graphic region, displaying a popup box corresponding to the graphic region, and displaying a link corresponding to the graphic region. Specifically, taking a graphic region containing a mobile phone as an example, the response result of a single click on the graphic region can be set to enlarging the graphic region, and the response result of a double click can be set to displaying the link corresponding to the phone. Thus, after the operations on the graphic region are inserted into the video, the enlarged graphic region is shown to a user who single-clicks the region while watching the video, and the link corresponding to the phone is shown to a user who double-clicks it, so that the user can learn further details of the phone through the link.
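One plausible way to encode these operation-to-response bindings is a lookup table; the operation names and the `RESPONSES` structure below are illustrative assumptions, not prescribed by the patent.

```javascript
// Map each supported operation to a function producing its response result.
const RESPONSES = {
  click:     (region) => ({ action: 'enlarge', region }),
  dblclick:  (region) => ({ action: 'showLink', url: region.link, region }),
  longpress: (region) => ({ action: 'showPopup', region }),
};

// Resolve the response result for an operation on a graphic region;
// operations with no binding fall through to no response.
function respond(operation, region) {
  const handler = RESPONSES[operation];
  return handler ? handler(region) : null;
}
```

The table can be built per graphic region when operations are "inserted into the video" in step S105, so each region carries its own bindings.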
Step S105, inserting the operations performed on the graphic region into the video, so that a user can operate on a graphic region appearing in the video while watching it.
In an optional implementation, when a user watches the video and operates on a graphic region appearing in it, a response result corresponding to the operation is displayed to the user according to the user's operation and the graphic region operated on.
Specifically, again taking a graphic region containing a mobile phone, with a single click set to enlarge the graphic region and a double click set to display the phone's link: after the operations on the graphic region are inserted into the video, if the user single-clicks the graphic region while watching, the enlarged graphic region is displayed; if the user double-clicks it, the link corresponding to the phone is displayed, so that the user can learn further details through the link. By setting the user's operations on the graphic region and the corresponding response results, the user can operate on graphic regions containing items such as phones and bags while watching the video, and interaction between the user and the video is realized through the response results.
In an optional implementation, displaying to the user a response result corresponding to the operation, according to the user's operation and the graphic region operated on, includes: sending the operation and the graphic region to a server; receiving the response result returned by the server according to the operation and the graphic region; and displaying the response result to the user. That is to say, when an operation performed by a user on a graphic region in a video is received, the operation and the graphic region can be sent to the server in order to save local computing resources, so that the server computes the response result corresponding to them, and the response result returned by the server is then displayed to the viewer. In addition, the viewer can still perform traditional operations on the video, such as playing and pausing.
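A sketch of that server round trip; the endpoint and every payload field name (`operation`, `regionId`, `videoTime`) are assumptions for illustration only.

```javascript
// Pure helper: build the payload describing the user's operation and the
// graphic region it targeted (field names are assumptions).
function buildOperationPayload(operation, region, videoTime) {
  return JSON.stringify({ operation, regionId: region.id, videoTime });
}

// Browser-only sketch: offload response computation to the server and hand
// the returned response result to a display callback.
async function dispatchOperation(endpoint, operation, region, videoTime, show) {
  const res = await fetch(endpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: buildOperationPayload(operation, region, videoTime),
  });
  show(await res.json()); // display the response result to the viewer
}
```

Including the current playback time lets the server tell apart regions that occupy the same screen position in different frames.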
It is worth noting that, since existing videos cannot realize interactive operations between the user and the video, the graphic regions can be displayed distinctively to remind the user that they can be operated on while watching, including but not limited to: distinctively displaying the outline of the graphic region, and prompting the graphic region with text, animation, and the like. It can be understood that, to prevent distinctively displayed graphic regions from degrading the viewing experience, the distinctive display can also be disabled according to actual requirements, leaving the user to discover the operable graphic regions in the video on their own.
Based on this embodiment, the video is converted into one or more frames of images using a Canvas or a similar tool; the graphic regions in the images are determined by image recognition or delineated using a preset shape or line-segment trajectory; and the operations a viewer may perform on the graphic regions, together with the corresponding response results, are set. In this way the viewer can operate on the graphic regions in the video, and interaction between the user and the video is realized by displaying the response results to the user through the video.
Referring to fig. 2, a video processing method is provided on the basis of the foregoing embodiment, and the method may specifically include the following steps:
step S201, a video to be processed is acquired. Meanwhile, the attribute information of the video, including the video size, the video width, the video length, the video format, the video duration, the video playing time and the like, is acquired so as to analyze the video.
Step S202, analyzing the video to be processed into one or more frames of images forming the video to be processed. Specifically, the Canvas is used to parse the video to be processed into one or more frames of images constituting the video to be processed.
Step S203, deleting one or more frames of images in the one or more frames of images, or adding one or more frames of preset images in the one or more frames of images. It will be appreciated that the preset image has a size corresponding to the size of the one or more frames of images that make up the video to be processed.
Step S204, analyzing the one or more frames of images to determine one or more graphic regions in the images. Image recognition can be used to automatically recognize elements in the images, such as people, scenes, and objects, as graphic regions; alternatively, a region of the image containing one or more elements can be custom-defined as a graphic region.
Step S205, setting an operation performed on the graphics area, and determining a response result corresponding to the operation. The operations include one or more of: single click, double click, long press and sliding; the response result includes one or more of: the method comprises the steps of enlarging the graphic area, reducing the graphic area, copying the graphic area, displaying a popup box corresponding to the graphic area and displaying a link corresponding to the graphic area. Besides, the control of playing, pausing and the like of the video can be realized through the operation of the graphic area.
Step S206, the operations on the graphic areas are inserted into the video, so that a user can operate on the graphic areas appearing in the video while watching it. Taking the graphic area containing the mobile phone as an example again, the response result of a single click on the area may be set to enlarging it, and the response result of a double click may be set to displaying the link corresponding to the mobile phone. After these operations are inserted into the video, a user who single-clicks the area while watching is shown the enlarged graphic area, and a user who double-clicks it is shown the link corresponding to the mobile phone, through which the details of the phone can be explored further.
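The binding of operations to response results in steps S205 and S206 can be sketched as a small dispatch table. The response payload shapes and the example URL are assumptions for illustration only.

```javascript
// Sketch of steps S205–S206: binding operations on a graphic area to
// response results, then dispatching a viewer's gesture to the right one.

function bindOperations(regionId, bindings) {
  // bindings maps an operation name to its configured response result,
  // e.g. { click: {...}, dblclick: {...} }.
  return { regionId, bindings };
}

function dispatch(config, operation) {
  const response = config.bindings[operation];
  return response || { type: 'none' }; // unbound operations do nothing
}

// The mobile-phone example from the description: a single click enlarges
// the area, a double click shows the product link (URL is hypothetical).
const phoneConfig = bindOperations('area-1', {
  click: { type: 'enlarge', scale: 2 },
  dblclick: { type: 'show-link', url: 'https://example.com/phone' },
});

console.log(dispatch(phoneConfig, 'dblclick').type); // 'show-link'
console.log(dispatch(phoneConfig, 'longpress').type); // 'none'
```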
Referring to fig. 3, on the basis of the above embodiment, a video processing apparatus 300 is provided, including: a video acquisition module 301, a video parsing module 302, a graphic area acquisition module 303, a response result setting module 304, and a video processing module 305; wherein:
a video obtaining module 301, configured to obtain a video to be processed;
a video parsing module 302, configured to parse the video to be processed into one or more frames of images constituting the video to be processed;
a graphics region acquiring module 303, configured to analyze the one or more frames of images to determine one or more graphics regions in the images;
a response result setting module 304, configured to set operations performed on the graphics area and a response result corresponding to each of the operations;
a video processing module 305, configured to insert the operations performed on the graphic areas into the video, so that a user can operate on the graphic areas appearing in the video while watching it.
In an alternative embodiment, the determining one or more graphical regions in the image comprises:
one or more graphic areas are delineated in the image using a preset shape or line-segment trajectory.
In an alternative embodiment, the determining one or more graphical regions in the image comprises:
performing image recognition on the one or more frames of images by a machine learning algorithm according to a specified object, so as to determine one or more graphic areas in the images, wherein a graphic area indicates one or more objects in the images.
In an alternative embodiment, the operations include one or more of the following: single click, double click, long press and sliding;
the response result includes one or more of: the method comprises the steps of enlarging the graphic area, reducing the graphic area, copying the graphic area, displaying a popup box corresponding to the graphic area and displaying a link corresponding to the graphic area.
In an alternative embodiment, the graphics area obtaining module 303 is further configured to,
deleting one or more frames of images from the one or more frames of images, or adding one or more frames of preset images to the one or more frames of images, before analyzing the one or more frames of images to determine one or more graphic regions in the images.
In an alternative embodiment, the graphics area obtaining module 303 is further configured to,
after determining one or more graphic regions in the image or the preset image, performing one or more of the following processes on the graphic regions: deleting the graphic region, copying the graphic region, combining one or more of the graphic regions, replacing the graphic region with a preset picture or page element, moving the graphic region, enlarging the graphic region, and reducing the graphic region.
In an alternative embodiment, the video processing module 305 is further configured to,
in the case that a user watches the video and operates on a graphic area appearing in the video, displaying to the user the response result corresponding to the operation, according to the operation performed and the graphic area operated on.
In an optional implementation manner, the displaying, to the user, a response result corresponding to the operation according to the operation of the user and the graphical area of the operation includes:
sending the operation and the graphic area to a server;
receiving the response result returned by the server according to the operation and the graphic area;
displaying the response result to the user.
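The three steps above can be sketched as a single client-side helper. The endpoint path and the payload shape are assumptions for illustration, and a stubbed fetch stands in for the server.

```javascript
// Sketch of the server round-trip described above: the client posts the
// operation and graphic-area id, and renders whatever response result
// the server returns. Endpoint path and payload shape are hypothetical.

async function fetchResponseResult(fetchImpl, operation, regionId) {
  const res = await fetchImpl('/api/region-response', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ operation, regionId }),
  });
  return res.json();
}

// Usage with a stubbed fetch standing in for the real server:
const stubFetch = async () => ({
  json: async () => ({ type: 'show-link', url: 'https://example.com/phone' }),
});

fetchResponseResult(stubFetch, 'dblclick', 'area-1').then((result) => {
  console.log(result.type); // 'show-link'
});
```

Passing the fetch implementation in as a parameter keeps the helper testable; in a browser one would simply pass the global `fetch`.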
In an alternative embodiment, the video parsing module 302 is configured to,
parsing the video to be processed into one or more frames of images constituting the video to be processed by using Canvas.
Referring to fig. 4, a video interaction method is provided on the basis of the foregoing embodiment, so as to implement interaction between a video viewing user and a video, and the method may specifically include the following steps:
Step S401, a video is played, where the video indicates one or more graphic areas operable by a user watching it. The video here is one processed by the video processing method of the foregoing embodiments; that is, operations on one or more graphic areas have been inserted into the video, and corresponding response results have been configured for those operations and areas.
Specifically, referring to fig. 5a, one frame of the image contains two elements, a mobile phone and a shopping cart. The graphic areas can therefore be determined either by circling the phone and the cart with a preset rectangle, triangle, circle or other shape, or by recognizing them automatically with an image recognition algorithm. Referring to fig. 5b, taking rectangular circling as an example, the circled graphic areas are: graphic area 1, which contains the mobile phone in the image, and graphic area 2, which contains the shopping cart.
Further, to remind the user that graphic areas in the video are operable, the one or more operable graphic areas are indicated by one or more of: displaying the outline of the graphic area distinctively, and prompting the graphic area with text or animation. It can be understood that, to prevent the distinctive display from degrading the viewing experience, the graphic areas may also be left undistinguished according to actual requirements, so that the user discovers the operable areas in the video on their own.
Specifically, fig. 5c, fig. 5d and fig. 5e are schematic diagrams of different ways of displaying graphic areas distinctively. Fig. 5c reminds the viewer of an operable graphic area by displaying its outline distinctively, for example a highlighted, bolded or blinking outline; fig. 5d distinguishes the graphic area by brightness or color difference; fig. 5e marks the graphic areas with a star-shaped mark. It is understood that text, raised display or other manners may also be used to remind the user of the operable graphic areas.
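A minimal sketch of the distinctive outline display of fig. 5c might stroke each operable region on an overlay canvas above the video. The style table, colors and mode names are illustrative assumptions.

```javascript
// Sketch of fig. 5c: drawing a distinctive border around each operable
// graphic area on an overlay canvas. Styles and mode names are assumptions.

// Pure helper: pick a stroke style for a highlight mode. A `none` mode
// corresponds to the option above of not distinguishing the areas at all.
function outlineStyle(mode) {
  const styles = {
    highlight: { strokeStyle: '#ffd400', lineWidth: 2 },
    bold: { strokeStyle: '#ffffff', lineWidth: 4 },
    none: null,
  };
  return mode in styles ? styles[mode] : styles.highlight;
}

// Browser-only: stroke the region rectangles on a 2D canvas context.
function drawOutlines(ctx, regions, mode) {
  const style = outlineStyle(mode);
  if (!style) return; // leave the regions undistinguished
  ctx.strokeStyle = style.strokeStyle;
  ctx.lineWidth = style.lineWidth;
  for (const r of regions) {
    ctx.strokeRect(r.x, r.y, r.width, r.height);
  }
}

console.log(outlineStyle('bold').lineWidth); // 4
```

A blinking outline (fig. 5c) could be obtained by toggling the mode between `'highlight'` and `'none'` on a timer.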
The operations include one or more of: single click, double click, long press and slide; the response results include one or more of: enlarging the graphic area, reducing the graphic area, copying the graphic area, displaying a popup box corresponding to the graphic area and displaying a link corresponding to the graphic area. It can be understood that playback may be paused while the operation and the response result corresponding to the graphic area are shown to the user, or they may be shown while the video continues playing.
Step S402, receiving the operation of the user to the one or more graphic areas, and displaying the response result corresponding to the operation and the graphic areas to the user.
Specifically, referring to fig. 5f, continue with the example in which graphic area 1 contains the mobile phone, a single click on the area is set to enlarge it, and a double click is set to display the link corresponding to the phone. When the user double-clicks graphic area 1 while watching the video, the link corresponding to the mobile phone is displayed, and the user can learn more about the phone through it. By setting the operations on the graphic areas and their corresponding response results, the user can operate, while watching, on the graphic areas containing the mobile phone and the shopping cart, and interaction between the user and the video is achieved by providing the response results.
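Binding both a single click and a double click to the same graphic area, as in this example, raises a practical detail the description leaves open: a click handler that fires immediately would also fire on the first tap of a double click. A common resolution, sketched here with an injectable timer so the logic can be exercised outside a browser, is to delay the single-click response briefly.

```javascript
// Disambiguating single vs. double click on one graphic area. The timer
// functions are injected (setTimeout/clearTimeout in a real browser).

function makeClickResolver(onClick, onDblClick, delayMs, setTimer, clearTimer) {
  let pending = null; // id of the timer waiting to confirm a single click
  return function handleTap() {
    if (pending !== null) {
      // Second tap arrived before the delay elapsed: treat as double click.
      clearTimer(pending);
      pending = null;
      onDblClick();
    } else {
      // First tap: wait delayMs before committing to a single click.
      pending = setTimer(() => {
        pending = null;
        onClick();
      }, delayMs);
    }
  };
}

// Demo with a hand-rolled timer so the behaviour is visible synchronously:
const fired = [];
const timers = [];
const tap = makeClickResolver(
  () => fired.push('enlarge'),   // single click -> enlarge graphic area
  () => fired.push('show-link'), // double click -> show product link
  250,
  (fn) => timers.push(fn) - 1,
  (id) => { timers[id] = null; },
);
tap(); tap();                       // a double click
timers.forEach((fn) => fn && fn()); // any pending single-click timer elapses
console.log(fired); // [ 'show-link' ]
```

In a real player one would pass `setTimeout` and `clearTimeout` (with a delay around 250 ms) and wire `handleTap` to the hit-tested click events on the video surface.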
On this basis, when the operation and the corresponding response result have been shown to the user watching the video, the response result itself can be processed. For example, where the response result shown is the link corresponding to the mobile phone in the graphic area, the user may process it by, among other things: copying the link and jumping to the corresponding page, clicking it directly to jump to that page, or closing or dismissing the link. In this way a customized video can be presented to the user, and commercial purposes such as advertising and promotion can be achieved through the interaction between the user and the video.
Fig. 6 shows an exemplary system architecture 600 of a video processing method or video processing apparatus to which embodiments of the invention may be applied.
As shown in fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605. The network 604 serves to provide a medium for communication links between the terminal devices 601, 602, 603 and the server 605. Network 604 may include various types of connections, such as wired or wireless communication links, fiber optic cables, and so on.
A user may use the terminal devices 601, 602, 603 to interact with the server 605 via the network 604 to receive or send messages or the like. Various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, and the like, may be installed on the terminal devices 601, 602, and 603.
The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 605 may be a server that provides various services, such as a background management server supporting shopping websites browsed by users with the terminal devices 601, 602, and 603. The background management server can analyze and process received data such as a product information query request, and feed back the processing result (the video with the graphic-area operations inserted) to the terminal device.
It should be noted that the video processing method provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the video processing apparatus is generally disposed in the server 605.
It should be understood that the number of terminal devices, networks, and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 7, shown is a block diagram of a computer system 700 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU)701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the system 700 are also stored. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input portion 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is mounted into the storage section 708 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 701.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor comprises a video acquisition module, a video analysis module, a graphic area acquisition module, a response result setting module and a video processing module. The names of these modules do not in some cases constitute a limitation on the module itself, and for example, the video capture module may also be described as a "module for capturing video to be processed".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments; or may be separate and not incorporated into the device. The computer readable medium carries one or more programs which, when executed by a device, cause the device to comprise: acquiring a video to be processed; analyzing the video to be processed into one or more frames of images forming the video to be processed; analyzing the one or more frames of images to determine one or more graphics regions in the images; setting operation to the graphic area and determining a response result corresponding to the operation; inserting the operation on the graphic region into the video so that a user can operate on the graphic region appearing in the video while watching the video.
According to the technical scheme of the embodiments of the invention, the video is parsed into one or more frames of images using Canvas or a similar tool; one or more graphic areas in those images are determined by image recognition or delineated with a preset shape or line-segment trajectory; and the operations a viewer may perform on each graphic area, together with the corresponding response results, are configured. The viewer can thus operate on the graphic areas in the video, and interaction between the user and the video is achieved by displaying the response results to the user through the video.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (17)

1. A video processing method, comprising:
acquiring a video to be processed;
analyzing the video to be processed into one or more frames of images forming the video to be processed;
analyzing the one or more frames of images to determine one or more graphic regions in the images;
setting operation to the graphic area and determining a response result corresponding to the operation;
inserting the operation on the graphic region into the video so that a user can operate on the graphic region appearing in the video while watching the video.
2. The video processing method of claim 1, wherein the determining one or more graphic regions in the images comprises:
delineating one or more of the graphic regions in the image using a preset shape or line-segment trajectory.
3. The video processing method of claim 1, wherein the determining one or more graphic regions in the images comprises:
performing image recognition on the one or more frames of images by a machine learning algorithm according to a specified object, so as to determine one or more graphic regions in the images, wherein the graphic regions indicate one or more objects in the images.
4. The video processing method according to claim 1,
the operations include one or more of: single click, double click, long press and sliding;
the response result includes one or more of: the method comprises the steps of enlarging the graphic area, reducing the graphic area, copying the graphic area, displaying a popup box corresponding to the graphic area and displaying a link corresponding to the graphic area.
5. The video processing method of claim 1, further comprising:
deleting one or more frames of images from the one or more frames of images, or adding one or more frames of preset images to the one or more frames of images, before analyzing the one or more frames of images to determine one or more graphic regions in the images.
6. The video processing method of claim 5, further comprising:
after analyzing the one or more frames of images to determine one or more graphic regions in the image or the preset image, performing one or more of the following processes on the graphic regions: deleting the graphic region, copying the graphic region, combining one or more of the graphic regions, replacing the graphic region with a preset picture or page element, moving the graphic region, enlarging the graphic region, and reducing the graphic region.
7. The video processing method of claim 1, further comprising:
and under the condition that a user watches the video and operates the graphic area appearing in the video, displaying a response result corresponding to the operation to the user according to the operation of the user and the graphic area of the operation.
8. The video processing method according to claim 7, wherein the presenting, to the user, a response result corresponding to the operation according to the operation of the user and the graphical area of the operation comprises:
sending the operation and the graphic area to a server;
receiving the response result returned by the server according to the operation and the graphic area;
and displaying the response result to the user.
9. The video processing method according to claim 1,
parsing the video to be processed into one or more frames of images constituting the video to be processed by using Canvas.
10. A video processing apparatus, comprising: a video acquisition module, configured to acquire a video to be processed;
the video analysis module is used for analyzing the video to be processed into one or more frames of images forming the video to be processed;
the image area acquisition module is used for analyzing the one or more frames of images to determine one or more image areas in the images;
the response result setting module is used for setting the operation carried out on the graphic area and the response result corresponding to each operation;
a video processing module, configured to insert the operations performed on the graphic areas into the video, so that a user can operate on the graphic areas appearing in the video while watching the video.
11. A video interaction method, comprising:
playing a video, wherein the video comprises one or more graphic areas operable by a user when watching the video;
receiving the operation of the user on the one or more graphic areas, so as to show the user the response result corresponding to the operation and the graphic areas.
12. The video interaction method of claim 11,
one or more graphical regions operable by a user viewing the video are indicated by one or more of: the outline of the graphic area is displayed in a distinguishing way, and the graphic area is prompted by characters and animation.
13. The video interaction method of claim 11,
the operations include one or more of: single click, double click, long press and sliding;
the response result includes one or more of: the method comprises the steps of enlarging the graphic area, reducing the graphic area, copying the graphic area, displaying a popup box corresponding to the graphic area and displaying a link corresponding to the graphic area.
14. The video interaction method of claim 13, further comprising:
processing the response result in the case that the operation and the response result corresponding to the graphic area have been shown to the user watching the video.
15. The video interaction method of claim 11, further comprising:
pausing the playing of the video to show the operation and a response result corresponding to the graphic area to the user; or when the video is played, the operation and the response result corresponding to the graphic area are displayed to the user.
16. An electronic device for video processing, comprising:
one or more processors;
a storage device for storing one or more programs,
the one or more programs, when executed by the one or more processors, implement the method of any of claims 1-9.
17. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-9.
CN202010048434.4A 2020-01-16 2020-01-16 Video processing method and device Pending CN111770380A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010048434.4A CN111770380A (en) 2020-01-16 2020-01-16 Video processing method and device


Publications (1)

Publication Number Publication Date
CN111770380A true CN111770380A (en) 2020-10-13

Family

ID=72718579

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010048434.4A Pending CN111770380A (en) 2020-01-16 2020-01-16 Video processing method and device

Country Status (1)

Country Link
CN (1) CN111770380A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022142620A1 (en) * 2020-12-30 2022-07-07 上海掌门科技有限公司 Method and device for recognizing qr code

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103108248A (en) * 2013-01-06 2013-05-15 王汝迟 Interactive video implement method and system using the same
CN103929669A (en) * 2014-04-30 2014-07-16 成都理想境界科技有限公司 Interactive video generator, player, generating method and playing method
CN105163188A (en) * 2015-08-31 2015-12-16 小米科技有限责任公司 Video content processing method, device and apparatus
CN105451086A (en) * 2015-09-22 2016-03-30 合一网络技术(北京)有限公司 Method and apparatus for realizing video interaction
CN105744340A (en) * 2016-02-26 2016-07-06 上海卓越睿新数码科技有限公司 Real-time screen fusion method for live broadcast video and presentation file
CN106254941A (en) * 2016-10-10 2016-12-21 乐视控股(北京)有限公司 Method for processing video frequency and device
CN106534944A (en) * 2016-11-30 2017-03-22 北京锤子数码科技有限公司 Video display method and device
JP2018026801A (en) * 2017-06-12 2018-02-15 パロニム株式会社 Video replay program, video replay device, video replay method and video delivery system
CN107771314A (en) * 2015-06-15 2018-03-06 汤姆逊许可公司 Apparatus and method for carrying out video scaling by selecting and tracking image-region
CN108769808A (en) * 2018-05-24 2018-11-06 安徽质在智能科技有限公司 Interactive video playback method and system
CN110225387A (en) * 2019-05-20 2019-09-10 北京奇艺世纪科技有限公司 A kind of information search method, device and electronic equipment



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201013