CN112312195A - Method and device for implanting multimedia information into video, computer equipment and storage medium - Google Patents

Method and device for implanting multimedia information into video, computer equipment and storage medium Download PDF

Info

Publication number
CN112312195A
CN112312195A
Authority
CN
China
Prior art keywords
video
implanted
multimedia information
target
image frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910679130.5A
Other languages
Chinese (zh)
Other versions
CN112312195B (en)
Inventor
郑婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910679130.5A priority Critical patent/CN112312195B/en
Publication of CN112312195A publication Critical patent/CN112312195A/en
Application granted granted Critical
Publication of CN112312195B publication Critical patent/CN112312195B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H04N21/4355Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream involving reformatting operations of additional data, e.g. HTML pages on a television screen
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440245Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display the reformatting operation being performed only on part of the stream, e.g. a region of the image or a time segment
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/475End-user interface for inputting end-user data, e.g. personal identification number [PIN], preference data
    • H04N21/4756End-user interface for inputting end-user data, e.g. personal identification number [PIN], preference data for rating content, e.g. scoring a recommended movie
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/812Monomedia components thereof involving advertisement data
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a method for implanting multimedia information into a video, which comprises the following steps: segmenting a video to be implanted to obtain a plurality of video segments; performing entity recognition on each video segment to determine the entity area corresponding to an entity in the image frames of that segment; acquiring the multimedia information to be implanted that corresponds to the entity area; implanting the multimedia information into the entity area of the image frame, with the entity area as the background and the corresponding multimedia information as the foreground, to obtain a target image frame; adjusting the image parameters of the foreground in the target image frame so that they match the image parameters of the background; and performing video synthesis based on the adjusted target image frame to obtain a target video. The invention also provides an apparatus for implanting multimedia information into a video, a computer device, and a storage medium. The invention enables automatic implantation of multimedia information into the video to be implanted and improves implantation efficiency, while also enabling effective management of the implanted multimedia information.

Description

Method and device for implanting multimedia information into video, computer equipment and storage medium
Technical Field
The present invention relates to the field of multimedia information embedding technology, and in particular, to a method, an apparatus, a computer device, and a storage medium for embedding multimedia information in a video.
Background
With the increasing abundance of video playing scenes, besides the content of the playing video itself, there is a demand for simultaneously displaying multimedia information, and one of the display modes of the multimedia information is to implant the multimedia information to be displayed into the video so as to display the multimedia information in the video playing process.
In the related art, implanting multimedia information into a video can only rely on manual compositing, and the implanted multimedia information must be adjusted by hand; this implantation approach has low implantation efficiency and high labor cost.
Disclosure of Invention
In view of this, the method, the apparatus, the computer device, and the storage medium for embedding multimedia information into a video according to the embodiments of the present invention can perform fragmentation processing on a video to be embedded, implement embedding of multimedia information matched with the video into the video, and perform integrated processing on the embedded multimedia information and video content.
The technical scheme of the embodiment of the invention is realized as follows:
the embodiment of the invention provides a method for implanting multimedia information into a video, which comprises the following steps:
segmenting a video to be implanted to obtain a plurality of video segments;
respectively carrying out entity identification on the plurality of video fragments, and determining entity areas corresponding to the entities in the image frames of the video fragments;
acquiring multimedia information to be implanted corresponding to the entity area;
implanting the multimedia information to be implanted into the entity area of the image frame by taking the entity area of the image frame as a background and the corresponding multimedia information to be implanted as a foreground to obtain a target image frame;
adjusting image parameters of the foreground in the target image frame so that the image parameters of the foreground and the image parameters of the background are matched;
and carrying out video synthesis based on the adjusted target image frame to obtain a target video.
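The steps above can be sketched end to end. The following is a minimal illustrative sketch, not the patent's implementation: segmentation is replaced by fixed-length slicing, the entity detector returns a fixed rectangle, and the implanted media is a solid patch. All function names and values here are hypothetical.

```python
import numpy as np

def segment_video(frames, shot_len=2):
    # stand-in for shot-based segmentation: fixed-length slices of the frame list
    return [frames[i:i + shot_len] for i in range(0, len(frames), shot_len)]

def detect_entity_region(frame):
    # hypothetical entity detector; a real system would run object recognition
    return (2, 2, 4, 4)  # (x, y, width, height) of e.g. a photo-frame surface

def implant(frame, region, media):
    # paste the media (foreground) over the entity region (background)
    x, y, w, h = region
    out = frame.copy()
    out[y:y + h, x:x + w] = media  # media would be resized/warped in practice
    return out

frames = [np.zeros((10, 10, 3), dtype=np.uint8) for _ in range(4)]
media = np.full((4, 4, 3), 255, dtype=np.uint8)  # the "advertisement" patch
segments = segment_video(frames)
target_frames = [implant(f, detect_entity_region(f), media)
                 for seg in segments for f in seg]
```

The parameter-matching and synthesis steps would then operate on `target_frames` before re-encoding.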
The embodiment of the invention also provides a device for implanting multimedia information into the video, which is characterized by comprising the following components:
the video processing module is used for carrying out segmentation processing on a video to be implanted to obtain a plurality of video fragments;
the video processing module is used for respectively carrying out entity identification on the plurality of video fragments and determining entity areas corresponding to the entities in the image frames of the video fragments;
the video processing module is used for acquiring multimedia information to be implanted corresponding to the entity area;
the video processing module is used for implanting the multimedia information to be implanted into the entity area of the image frame by taking the entity area of the image frame as a background and taking the corresponding multimedia information to be implanted as a foreground to obtain a target image frame;
the video processing module is configured to adjust an image parameter of the foreground in the target image frame so that the image parameter of the foreground matches the image parameter of the background;
and the video processing module is used for carrying out video synthesis based on the adjusted target image frame to obtain a target video.
In the above scheme,
the video processing module is configured to perform image interception on the target image frame to obtain an intercepted image including the foreground, where an area of the intercepted image is a constant multiple of an area of the foreground.
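The image interception described above can be sketched as follows: crop a window around the foreground whose area is a constant multiple of the foreground area, clamped to the frame bounds. The centering rule and the scale value are assumptions for illustration; the patent only states the constant-multiple relation.

```python
import numpy as np

def crop_around_foreground(frame, region, scale=2.0):
    # crop a window whose area is `scale` times the foreground area,
    # centered on the foreground and clamped to the frame bounds
    x, y, w, h = region
    s = scale ** 0.5                       # scale each side by sqrt(scale)
    cw, ch = int(w * s), int(h * s)
    cx, cy = x + w // 2, y + h // 2
    x0 = max(cx - cw // 2, 0)
    y0 = max(cy - ch // 2, 0)
    x1 = min(x0 + cw, frame.shape[1])
    y1 = min(y0 + ch, frame.shape[0])
    return frame[y0:y1, x0:x1]

frame = np.zeros((20, 20, 3), dtype=np.uint8)
crop = crop_around_foreground(frame, (8, 8, 4, 4), scale=2.0)
```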
In the above scheme,
the video processing module is used for, in response to the image parameters including brightness, acquiring the brightness difference between the foreground and the background of the intercepted image;
the video processing module is used for converting the foreground of the intercepted image into a hue-saturation-value (HSV) image;
and the video processing module is used for adjusting the brightness of the pixel points of the V layer in the HSV image based on the brightness difference.
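In practice the HSV conversion would typically use OpenCV (`cv2.cvtColor(img, cv2.COLOR_BGR2HSV)`); the V-layer adjustment itself can be sketched with plain NumPy. The shift-by-mean-difference rule below is an assumption — the patent only states that the V layer is adjusted based on the brightness difference.

```python
import numpy as np

def adjust_foreground_brightness(fg_v, bg_v):
    # fg_v, bg_v: V (value/brightness) channels as float arrays in [0, 255]
    diff = bg_v.mean() - fg_v.mean()       # brightness difference, background minus foreground
    return np.clip(fg_v + diff, 0.0, 255.0)  # shift foreground V toward the background

fg = np.full((4, 4), 100.0)   # darker implanted foreground
bg = np.full((4, 4), 160.0)   # brighter surrounding background
adjusted = adjust_foreground_brightness(fg, bg)
```

After adjustment the HSV image would be converted back to the original color space before compositing.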
In the above scheme,
the video processing module is used for, in response to the image frame being a key image frame of the video to be implanted, positioning the entity area of the non-key image frames of the video to be implanted by means of target tracking;
the video processing module is used for implanting the foreground in the adjusted target image frame into the entity area of the non-key image frame in an affine transformation mode to obtain a target non-key image frame;
and the video processing module is used for carrying out video coding on the adjusted target image frame and the target non-key image frame to obtain the target video.
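The affine transformation step can be sketched as follows: given the entity-region corners in the key frame and their tracked positions in a non-key frame, a 2x3 affine matrix is estimated and applied to map the foreground into the non-key frame. The least-squares estimation below is illustrative; in practice OpenCV's `cv2.getAffineTransform` and `cv2.warpAffine` would typically be used.

```python
import numpy as np

def affine_from_points(src, dst):
    # solve for the 2x3 affine matrix M with dst = M @ [x, y, 1]^T,
    # from 3 or more corresponding point pairs (least squares)
    A = np.hstack([src, np.ones((len(src), 1))])
    X, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return X.T  # shape (2, 3)

def apply_affine(M, pts):
    A = np.hstack([pts, np.ones((len(pts), 1))])
    return A @ M.T

src = np.array([[0, 0], [4, 0], [0, 4]], dtype=float)  # entity corners, key frame
dst = np.array([[1, 1], [5, 1], [1, 5]], dtype=float)  # tracked corners, non-key frame
M = affine_from_points(src, dst)
moved = apply_affine(M, src)
```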
In the above scheme,
the video processing module is also used for monitoring exposure parameters of the target video;
the video processing module is further used for determining a visual effect index corresponding to the implanted multimedia information according to the exposure parameters of the target video;
the video processing module is further configured to adjust the playing of the target video according to the visual effect index corresponding to the implanted multimedia information.
In the above scheme, the apparatus further comprises:
and the information transmission module is used for sending the exposure parameters during the playing of the target video to a monitoring server so as to realize the monitoring of the monitoring server on the exposure of the target video.
An embodiment of the present invention further provides a computer device, including:
a memory for storing executable instructions;
and a processor, configured to implement, when running the executable instructions stored in the memory, the aforementioned method for implanting multimedia information into a video.
Embodiments of the present invention also provide a computer-readable storage medium storing executable instructions that, when executed by a processor, implement the aforementioned method for implanting multimedia information into a video.
The embodiment of the invention has the following beneficial effects:
1) implanting corresponding multimedia information into the entity area of the image frame, with the entity area of the image frame of the video to be implanted as the background and the corresponding multimedia information to be implanted as the foreground, to obtain a target image frame, thereby realizing automatic implantation of the multimedia information into the video to be implanted and improving information implantation efficiency;
2) adjusting the image parameters of the foreground in the target image frame so that they match the image parameters of the background realizes automatic integrated processing of the foreground and background of the video image frame after the multimedia information is implanted; the multimedia information implanted in the target video is thus integrated with the video content, which improves users' acceptance of the implanted multimedia information when watching the video.
Drawings
Fig. 1 is a schematic view of a usage scenario of a method for embedding multimedia information in a video according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a component structure of a computer device according to an embodiment of the present invention;
fig. 3 is a schematic flow chart illustrating an alternative method for embedding multimedia information into a video according to an embodiment of the present invention;
fig. 4A is a schematic diagram of an alternative data structure of a video according to an embodiment of the present invention;
fig. 4B is a schematic diagram of an alternative data structure of a video according to an embodiment of the present invention;
fig. 5 is a schematic diagram of a frame image of a video before embedding multimedia information according to an embodiment of the present invention;
fig. 6 is a schematic diagram of a frame image of a video after embedding multimedia information to be embedded according to an embodiment of the present invention;
fig. 7 is a schematic diagram of a frame image of a video after embedding multimedia information to be embedded according to an embodiment of the present invention;
fig. 8 is a schematic flow chart illustrating an alternative method for embedding multimedia information into a video according to an embodiment of the present invention;
FIG. 9 is a schematic flow chart of an alternative process for embedding advertisements in videos according to an embodiment of the present invention;
FIG. 10A is a schematic flow chart of an alternative process for embedding advertisements in videos according to an embodiment of the present invention;
FIG. 10B is a schematic flow chart of an alternative process for embedding advertisements in videos according to an embodiment of the present invention;
FIG. 11 is a schematic view of an aggregate display of advertisements to be placed by different advertisers;
FIG. 12 is a schematic diagram of an alternative detection process shown by the video client of the present invention;
FIG. 13 is a schematic diagram of a video client showing an alternative audit process in accordance with the present invention;
FIG. 14 is a schematic view of an advertisement to be embedded in a photo frame region according to an embodiment of the present invention;
fig. 15 is a schematic diagram of implanting an advertisement to be implanted in a screen area of a television according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail with reference to the accompanying drawings, the described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by a person of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing embodiments of the invention only and is not intended to be limiting of the invention.
It should be noted that in the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
Before further detailed description of the embodiments of the present invention, the terms and expressions used in the embodiments of the present invention are explained; the following explanations apply to these terms and expressions.
1) The entity, as used herein, refers to a main body or real object to be embedded in the video for carrying multimedia information to be embedded, such as a desktop, a wall surface, a photo frame, a screen of an electronic device, and so on.
2) The image frame refers to the minimum unit of a video and is a static image; for example, when video information is played, a picture at any time is frozen, i.e., an image frame is obtained.
3) The shot is the basic unit of video data, and in the video shooting process, a section of video continuously shot by a camera is called a shot.
4) Key frame images: images that can be used to represent the content of a shot. After the shot structure is segmented, key frame images are used to represent the essential features of each shot so as to perform further video structuring; within a video shot, the number of key frames is much smaller than the number of frame images contained in the shot.
5) "In response to" indicates the condition or state on which a performed operation depends; when the condition or state is satisfied, the one or more operations may be performed in real time or with a set delay. Unless otherwise specified, there is no restriction on the order in which the operations are performed.
6) Advertisement implantation: product or brand information is implanted into video content in the form of a real object, a picture, or a video, giving the audience a brand impression so as to achieve marketing purposes. The presentation form of the advertisement is multimedia information, whose types include but are not limited to: pictures, text, video, and audio. An implanted advertisement is hidden in the carrier and integrated with it; the advertisement information is carefully encoded by non-advertising means of expression, so that the audience perceives the commodity and brand information unconsciously and thus receives the stimulus of the advertisement information.
7) The visual effect index is used to evaluate implanted multimedia information and specifically comprises: 1. minimum exposure, i.e. the frame exposure time must be longer than 1 second; 2. visibility of the multimedia information, i.e. the display proportion of the implanted multimedia information on the screen must be higher than a preset threshold, where the threshold consists of a threshold for the first exposure and a threshold for the remaining exposures; 3. sharpness, i.e. the sharpness must be higher than a preset sharpness threshold, with different preset values for different dimensions. Specifically, the visual effect index (VIS) of the corresponding multimedia information may be calculated according to the following parameters:
the ratio of the exposure size to the screen, a saliency parameter (how visible the implanted brand is in the scene and environment), and a proximity effect parameter (how close it is to the main active task/object in the scene).
8) Exposure: when the effective conditions are satisfied, the corresponding multimedia information is pushed to the user or selected by the user for viewing; for example, when the push conditions of a video are satisfied, different videos are pushed to users for viewing.
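A minimal sketch of how such a visual effect index (VIS) might be scored from the parameters listed above. All threshold values and the multiplicative combination here are assumptions for illustration, not values given by the patent.

```python
def visual_effect_index(exposure_s, screen_ratio, sharpness,
                        min_exposure_s=1.0, ratio_threshold=0.05,
                        sharpness_threshold=0.3,
                        saliency=1.0, proximity=1.0):
    # Hard gates first: the index is zero if any minimum requirement fails.
    if exposure_s < min_exposure_s:       # frame exposure must exceed 1 second
        return 0.0
    if screen_ratio < ratio_threshold:    # on-screen display proportion threshold
        return 0.0
    if sharpness < sharpness_threshold:   # sharpness threshold
        return 0.0
    # Combine the remaining effect parameters (illustrative weighting).
    return screen_ratio * saliency * proximity

score = visual_effect_index(exposure_s=2.0, screen_ratio=0.1, sharpness=0.8)
```

In a real system, separate `ratio_threshold` values would apply to the first exposure and the remaining exposures, as described above.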
Fig. 1 is a schematic view of a usage scenario of a method for implanting multimedia information into a video according to an embodiment of the present invention. Referring to fig. 1, terminals (including a terminal 10-1 and a terminal 10-2) are provided with corresponding clients capable of playing videos with implanted multimedia information. The terminals are connected to a computer device 200 through a network 300; the network 300 may be a wide area network, a local area network, or a combination of the two, and data transmission is implemented using wireless links. Here, the multimedia information includes but is not limited to videos, pictures, Flash animations, and advertisement information.
During the process in which a terminal (terminal 10-1 and/or terminal 10-2) acquires the corresponding target video with implanted multimedia information from the computer device 200 through the network 300 and exposes it, the user can perform different operations on the exposed multimedia information through the terminal (terminal 10-1 and/or terminal 10-2) to generate different user behaviors; for example, when the multimedia information is a video, the user can share and/or like the exposed target video while viewing it. When the multimedia information is an advertisement, the user may forward and/or comment on the advertisement during its exposure by the terminal (terminal 10-1 and/or terminal 10-2).
As an example, the computer device 200 is configured to perform segmentation processing on a video to be implanted, so as to obtain a plurality of video segments; respectively carrying out entity identification on the plurality of video fragments, and determining entity areas corresponding to the entities in the image frames of the video fragments; acquiring multimedia information to be implanted corresponding to the entity area; implanting the multimedia information to be implanted into the entity area of the image frame by taking the entity area of the image frame as a background and the corresponding multimedia information to be implanted as a foreground to obtain a target image frame; adjusting image parameters of the foreground in the target image frame so that the image parameters of the foreground and the image parameters of the background are matched; and carrying out video synthesis based on the adjusted target image frame to obtain and expose a target video.
As described in detail below, the structure of the computer device according to the embodiment of the present invention may be implemented in various forms, such as a dedicated terminal with a multimedia information processing function, or a computer device with a multimedia information processing function, such as the computer device 200 in fig. 1. Fig. 2 is a schematic diagram of a constituent structure of a computer device according to an embodiment of the present invention, and it is understood that fig. 2 only shows an exemplary structure of the computer device, and not a whole structure, and a part of or the whole structure shown in fig. 2 may be implemented as needed.
The computer device provided by the embodiment of the invention comprises: at least one processor 201, memory 202, user interface 203, and at least one network interface 204. The various components in the computer device 200 are coupled together by a bus system 205. It will be appreciated that the bus system 205 is used to enable communications among the components. The bus system 205 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 205 in fig. 2.
The user interface 203 may include, among other things, a display, a keyboard, a mouse, a trackball, a click wheel, a key, a button, a touch pad, or a touch screen.
It will be appreciated that the memory 202 can be either volatile memory or nonvolatile memory, and can include both volatile and nonvolatile memory. The memory 202 in embodiments of the present invention is capable of storing data to support operation of the terminal (e.g., 10-1). Examples of such data include: any computer program, such as an operating system and application programs, for operating on a terminal (e.g., 10-1). The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic services and processing hardware-based tasks. The application program may include various application programs.
In some embodiments, the apparatus for embedding multimedia information in video provided by the embodiments of the present invention may be implemented by a combination of hardware and software, and as an example, the apparatus for embedding multimedia information in video provided by the embodiments of the present invention may be a processor in the form of a hardware decoding processor, which is programmed to execute the method for embedding multimedia information in video provided by the embodiments of the present invention. For example, a processor in the form of a hardware decoding processor may employ one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic components.
As an example of the implementation of the apparatus for embedding multimedia information in video provided by the embodiment of the present invention by using a combination of software and hardware, the apparatus for embedding multimedia information in video provided by the embodiment of the present invention can be directly embodied as a combination of software modules executed by the processor 201, the software modules can be located in a storage medium, the storage medium is located in the memory 202, the processor 201 reads executable instructions included in the software modules in the memory 202, and the method for embedding multimedia information in video provided by the embodiment of the present invention is completed in combination with necessary hardware (for example, including the processor 201 and other components connected to the bus 205).
By way of example, the Processor 201 may be an integrated circuit chip having Signal processing capabilities, such as a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like, wherein the general purpose Processor may be a microprocessor or any conventional Processor or the like.
As an example of the hardware implementation of the apparatus for embedding multimedia information in video provided by the embodiment of the present invention, the apparatus provided by the embodiment of the present invention may be implemented directly by using the processor 201 in the form of a hardware decoding processor, for example, the method for embedding multimedia information in video provided by the embodiment of the present invention is implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic components.
The memory 202 in embodiments of the present invention is used to store various types of data to support the operation of the computer device 200. Examples of such data include any executable instructions for operating on the computer device 200; the program implementing the method for implanting multimedia information into video according to the embodiments of the present invention may be embodied in these executable instructions.
In other embodiments, the apparatus for embedding multimedia information in video according to the embodiments of the present invention may be implemented by software. Fig. 2 shows an apparatus 2020 for embedding multimedia information in video stored in the memory 202, which may be software in the form of programs and plug-ins and includes the following software modules: a video processing module 2081 and a video playing module 2082. When the software modules in the apparatus 2020 are read into the RAM by the processor 201 and executed, the method for embedding multimedia information in video provided by the embodiment of the present invention is implemented. The functions of the software modules in the apparatus 2020 are described as follows:
the video processing module 2081 is used for segmenting a video to be implanted to obtain a plurality of video segments; the video processing module 2081 is configured to perform entity identification on the plurality of video segments, and determine an entity area corresponding to the entity in an image frame of each video segment; the video processing module 2081 is configured to obtain multimedia information to be implanted corresponding to the entity area; the video processing module 2081 is configured to implant the multimedia information to be implanted into the entity area of the image frame with the entity area of the image frame as a background and the corresponding multimedia information to be implanted as a foreground, so as to obtain a target image frame; the video processing module 2081 is configured to adjust image parameters of the foreground in the target image frame, so that the image parameters of the foreground are matched with the image parameters of the background; the video processing module 2081 is configured to perform video synthesis based on the adjusted target image frame to obtain a target video; and the video playing module 2082 is used for exposing the target video.
In the following description, the method for embedding multimedia information in video provided by the embodiment of the present invention is described with reference to the exemplary application and implementation of the terminal provided by the embodiment of the present invention, and it can be understood from the foregoing description that the method for embedding multimedia information in video provided by the embodiment of the present invention can be implemented by various types of devices with a multimedia information processing function, such as a video computer device or a video processing dedicated device.
Referring to fig. 3, fig. 3 is an optional flowchart of the method for embedding multimedia information in video according to the embodiment of the present invention, and it can be understood that the steps shown in fig. 3 may be executed by various computer devices, such as a dedicated terminal, a computer device, or a cluster of computer devices, which operate an apparatus for embedding multimedia information in video, for example, the dedicated terminal, the computer device, or the cluster of computer devices with a multimedia information processing function. The following is a description of the steps shown in fig. 3.
Step 301: and determining the video to be implanted to be played by the video client.
Step 302: and carrying out segmentation processing on the video to be implanted to obtain a plurality of video fragments.
In some embodiments of the present invention, the to-be-implanted video is segmented to obtain a plurality of video segments, which may be implemented in the following manner:
carrying out shot switching detection on the video frames of the video to be implanted to obtain a plurality of shots corresponding to the video to be implanted; segmenting the video to be implanted based on the plurality of shots to obtain a video segment corresponding to each shot; and determining the coordinates of the start position and the end position of the entity area in the video segment corresponding to each shot.
The video to be embedded may be a complete video, such as a complete video file, or a video segment, such as a segment excerpt of a video.
Step 303: and respectively carrying out entity identification on the plurality of video fragments, and determining an entity area corresponding to the entity in the image frame of each video fragment.
The entity is a main body or an object in the video for bearing the multimedia information to be embedded, such as a desktop, a wall surface, a photo frame, or a screen of an electronic device (such as a television screen). In actual implementation, the entity in the frame image may be identified by a Single Shot MultiBox Detector (SSD), or by Mask R-CNN, an instance segmentation algorithm; the embodiment of the present invention is not limited thereto. In practical application, the entities to be identified can be preset, such as identifying only a desktop in the frame image, or identifying a wall surface and a photo frame in the frame image.
In practical application, different entity identification can be carried out on different video segments. For example, when a video is divided into 10 video segments, desktop identification is carried out on the 1st to 3rd video segments, photo frame identification on the 4th to 6th video segments, and wall identification on the 7th to 10th video segments. Of course, the same entity identification can also be performed on different video segments; still taking the example of a video divided into 10 video segments, desktop and photo frame identification is performed on all 10 video segments.
Fig. 4A is a schematic diagram of an alternative data structure of a video according to an embodiment of the present invention. As shown in fig. 4A, video data can be structurally divided into four layers: video, scene, shot, and frame. A visually continuous video is formed by continuously showing still images on a screen or display, where each still image is a video frame. In the video shooting process, a section of video continuously shot by a camera is called a shot; the shot is the basic unit of video data. A plurality of shots with similar contents form a scene, and these shots describe the same event from different angles; a complete video consists of a plurality of scenes. The selectable exposure duration of different multimedia information units in the video is, for example, 10 seconds. Fig. 4B is a schematic diagram of an alternative data structure of the video provided by the embodiment of the invention; the exposure of a single multimedia information unit may be a continuous exposure as shown in fig. 4B, or an interval exposure as shown in fig. 4B, where the number of video segments during the interval exposure is not particularly limited.
Based on the data structure of the video, segmenting the video to be implanted to obtain a plurality of video segments may be realized in the following way: carrying out shot switching detection on the video frames of the video to be implanted to obtain a plurality of shots corresponding to the video to be implanted; and segmenting the video to be implanted based on the plurality of shots to obtain a video segment corresponding to each shot.
Shot switching detection finds the positions where switching occurs by utilizing the characteristics exhibited when a shot is switched, so that the whole video is divided into independent shots. For example, shot switching detection for a video to be detected can be realized as follows: using an inter-frame pixel point matching method, calculate the difference degree of pixel points at the same position in adjacent video frames of the video to be detected, determine the number of pixel points whose difference degree exceeds a first difference threshold in the two adjacent video frames, and when that number reaches a preset number threshold, determine that shot switching occurs between the two video frames.
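The inter-frame pixel matching described above can be sketched in a few lines. This is a minimal illustration only, assuming grayscale frames represented as nested Python lists; the threshold values and frame format are placeholders, not values from the patent.

```python
def count_changed_pixels(frame_a, frame_b, diff_threshold):
    """Count same-position pixels whose intensity difference exceeds diff_threshold."""
    return sum(
        1
        for row_a, row_b in zip(frame_a, frame_b)
        for pa, pb in zip(row_a, row_b)
        if abs(pa - pb) > diff_threshold
    )

def detect_shot_cuts(frames, diff_threshold=30, count_threshold=4):
    """Return indices i where a shot cut occurs between frames[i-1] and frames[i]:
    a cut is declared when enough pixels changed beyond the first difference threshold."""
    return [
        i for i in range(1, len(frames))
        if count_changed_pixels(frames[i - 1], frames[i], diff_threshold) >= count_threshold
    ]
```

On real video the frames would come from a decoder and both thresholds would be tuned per source, but the structure of the check (per-pixel difference, then a count threshold) is the same.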
Based on the data structure of the video, when the data structure is actually implemented, the video to be implanted can be segmented in the following mode to obtain a plurality of video fragments: carrying out scene switching detection on video frames of a video to be implanted to obtain a plurality of scenes corresponding to the video to be implanted; and segmenting the video to be implanted based on a plurality of scenes to obtain video segments corresponding to the scenes.
Here, in practical applications, the scene change detection of the video to be detected can be implemented as follows: and calculating the histogram difference degree of adjacent video frames of the video to be detected, and determining that scene switching occurs between two video frames of which the histogram difference degree reaches a second difference threshold value.
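The histogram-based scene switching check can be sketched similarly. A minimal illustration, assuming grayscale frames as nested lists and an L1 distance between coarse intensity histograms; the bin count and the second difference threshold are illustrative assumptions.

```python
from collections import Counter

def histogram(frame, bins=16, max_value=256):
    """Coarse intensity histogram of a 2-D grayscale frame."""
    bin_width = max_value // bins
    counts = Counter(min(p // bin_width, bins - 1) for row in frame for p in row)
    return [counts.get(b, 0) for b in range(bins)]

def histogram_difference(frame_a, frame_b, bins=16):
    """L1 distance between the two frames' histograms."""
    ha, hb = histogram(frame_a, bins), histogram(frame_b, bins)
    return sum(abs(a - b) for a, b in zip(ha, hb))

def detect_scene_cuts(frames, diff_threshold):
    """Indices where the histogram difference reaches the second difference threshold."""
    return [i for i in range(1, len(frames))
            if histogram_difference(frames[i - 1], frames[i]) >= diff_threshold]
```

Histogram comparison is less sensitive to object motion within a scene than per-pixel matching, which is why it suits the coarser scene-level segmentation.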
Step 304: and acquiring multimedia information to be implanted corresponding to the entity area.
In some embodiments of the present invention, obtaining multimedia information to be embedded corresponding to the entity area includes:
determining display parameters of the entity area; according to the display parameters of the entity area, determining multimedia information to be implanted matched with the display parameters of the entity area, wherein the display parameters of the entity area comprise: scene parameters of the entity area, type parameters of the entity area, and display unit parameters of the entity area.
In some embodiments of the present invention, the multimedia information to be embedded may belong to different users, and the different users may aggregate the multimedia information to be embedded in advance and classify the multimedia information according to parameters of the multimedia information to be embedded, for example: the multimedia information to be implanted by the user A is a picture of milk tea beverage, the multimedia information to be implanted by the user B is a picture of pure milk (beverage), and the multimedia information to be implanted by the user A and the user B can belong to the same type of multimedia information because the type parameters of the multimedia information to be implanted by the user A and the user B are the same.
In some embodiments of the present invention, the video computer device may request the computer device storing the multimedia information for the corresponding multimedia information to be embedded according to the determined display parameters of the entity area, for example: when the scene parameter of the entity area is determined to be the application scene of the restaurant, the multimedia information corresponding to the scene can be requested to the multimedia information computer equipment to be used as the multimedia information to be implanted; or, when the type parameter of the entity area is the screen of the electronic device, the multimedia information (video clip) corresponding to the type parameter of the entity area may be requested from the multimedia information computer device as the multimedia information to be embedded.
In some embodiments of the present invention, when the display unit parameter of the entity area represents the number of display units of a specific entity area, different (including different multimedia information of the same user or the same type of multimedia information of different users) multimedia information may be selected as the multimedia information to be embedded in a single display unit according to the frame number of the video frame corresponding to the display unit.
In practical applications, the multimedia information to be embedded may be an advertisement, specifically, an embedded advertisement, also called Video-In, which is a soft advertisement form, and refers to an entity area In a Video frame, such as a desktop, a wall surface, a photo frame, a bar counter, a billboard, etc., In which the multimedia information to be embedded is embedded. Fig. 5 is a schematic view of a frame image of a video before embedding multimedia information to be embedded according to an embodiment of the present invention, fig. 6 is a schematic view of a frame image of a video after embedding multimedia information to be embedded according to an embodiment of the present invention, and it can be known from fig. 5 and fig. 6 that "fruit juice" serving as multimedia information to be embedded is embedded in a desktop area in a video frame. Fig. 7 is a schematic diagram of a frame image of a video after embedding multimedia information to be embedded according to an embodiment of the present invention, and it can be known from fig. 5, fig. 6 and fig. 7 that "pure milk" is embedded as multimedia information to be embedded in the same position of a desktop area in a video frame.
In some embodiments, the multimedia information to be embedded may include at least one of: pushing entities and characters; wherein, the pushing entity is an entity which displays the advertisement in a tangible material form mode, such as a coffee cup with a specific shape; the push text is a text that is used for displaying an advertisement with specific content in a text form, for example, the text is used for describing the function of a specific electronic device.
In practical implementation, different entity areas may correspond to different multimedia information to be embedded, or different entity areas may correspond to the same multimedia information to be embedded. For example, in the case that the entity area is a desktop and a photo frame, the desktop can bear multimedia information to be embedded in the form of a three-dimensional model and a poster, and the photo frame can bear multimedia information to be embedded in the form of a poster; here, a poster is one of the presentation forms of visual communication, and displays advertisement information in a specific form by completely combining elements such as pictures, characters, colors, and spaces.
The description will be given by taking the identified entity as the desktop and the corresponding entity area as the desktop area. In some embodiments, the multimedia information to be implanted corresponding to the desktop area includes a push entity, and for the same push entity, there may be images to be implanted at different rendering angles, and for different rendering angles of the desktop, images to be implanted that match the rendering angles of the desktop may be selected.
In some embodiments, the computer device may obtain the image to be implanted carrying the multimedia information to be implanted of the corresponding entity area by: acquiring a video identifier corresponding to a video to be implanted, determining a pushing entity corresponding to a desktop area of the video to be implanted based on the acquired video identifier, and acquiring a first presentation angle of the pushing entity in an image to be implanted; acquiring a second presentation angle of an entity presented in the entity area; and determining the image to be implanted with the first presentation angle matched with the second presentation angle, wherein the image to be implanted is the image to be implanted corresponding to the entity area.
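The angle matching step above can be illustrated with a toy selector. This sketch assumes each candidate image to be implanted is tagged with a single rendering angle in degrees, which is a simplification; the patent does not fix how presentation angles are represented.

```python
def select_best_angle_image(candidates, entity_angle):
    """candidates: list of (rendering_angle_degrees, image_id) for one push entity.
    Return the id of the image whose first presentation angle is closest to the
    entity area's second presentation angle."""
    return min(candidates, key=lambda c: abs(c[0] - entity_angle))[1]
```

In practice the match would compare full poses rather than one scalar angle, but the selection principle (minimize the angular mismatch between foreground and entity area) is as described in the text.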
Step 305: and implanting the multimedia information to be implanted into the entity area of the image frame by taking the entity area of the image frame as a background and the corresponding multimedia information to be implanted as a foreground to obtain a target image frame.
Step 306: and adjusting the image parameters of the foreground in the target image frame so that the image parameters of the foreground are matched with the image parameters of the background.
In some embodiments of the invention, the method further comprises:
and carrying out image interception on the target image frame to obtain an intercepted image containing the foreground, wherein the area of the intercepted image is a constant multiple of the area of the foreground.
In some embodiments, the computer device may adjust image parameters of the foreground in the target frame image directly based on the background in the target frame image so that the image parameters of the foreground and the image parameters of the background match; in some embodiments, the computer device may perform image interception on the target frame image to obtain an intercepted image containing a foreground, and then adjust image parameters of the foreground in the intercepted image based on a background of the intercepted image (a local background of the target frame image) so that the image parameters of the foreground are matched with the image parameters of the background; the area of the intercepted image is a constant multiple of the area of the foreground, for example, the foreground is taken as the center, and twice the area of the foreground is the size of the intercepted image, so that the target frame image is intercepted.
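The interception geometry can be sketched as follows, assuming an axis-aligned foreground bounding box (an illustrative assumption): scaling each side by the square root of the area multiple keeps the crop's area at the stated constant multiple of the foreground area, and the result is clamped to the image bounds.

```python
import math

def crop_box_around_foreground(fg_box, image_size, area_multiple=2.0):
    """fg_box = (x, y, w, h); image_size = (width, height).
    Return a crop box centred on the foreground whose area is area_multiple
    times the foreground area, clamped to the image."""
    x, y, w, h = fg_box
    img_w, img_h = image_size
    scale = math.sqrt(area_multiple)          # scale each side by sqrt(k) => area scales by k
    cw, ch = w * scale, h * scale
    cx, cy = x + w / 2, y + h / 2             # foreground centre
    left = max(0, cx - cw / 2)
    top = max(0, cy - ch / 2)
    right = min(img_w, cx + cw / 2)
    bottom = min(img_h, cy + ch / 2)
    return (left, top, right - left, bottom - top)
```

Harmonizing against this local crop rather than the full frame keeps the adjustment driven by the background immediately surrounding the implanted foreground.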
In some embodiments of the present invention, adjusting the image parameter of the foreground in the target image frame so that the image parameter of the foreground matches the image parameter of the background may be implemented by:
responding to the image parameters including brightness, and acquiring the brightness difference between the foreground and the background of the intercepted image;
converting the foreground of the intercepted image into a hue-saturation-brightness (HSV) image; and adjusting the brightness of the pixel points of the V layer in the HSV image based on the brightness difference. Similarly, the average saturation of the foreground and the average saturation of the background of the intercepted image may be calculated separately, and the difference diff_s between the two average saturations determined; after the foreground of the intercepted image is converted into an HSV image, a compensation of diff_s × 0.8 is applied to each pixel point of the S layer in the HSV image, completing the saturation harmonization of the foreground and the background, so that the image parameters of the foreground match the image parameters of the background.
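The channel harmonization can be illustrated on a single HSV channel. A minimal sketch, assuming the S (or V) layer of the foreground and background is available as nested lists of pixel values; the 0.8 default follows the compensation factor mentioned in the text, while the clipping range is an assumption.

```python
def harmonize_channel(foreground_channel, background_channel, weight=0.8, max_value=255):
    """Shift every foreground pixel of one HSV channel (S or V) toward the
    background by weight * (background mean - foreground mean)."""
    fg_pixels = [p for row in foreground_channel for p in row]
    bg_pixels = [p for row in background_channel for p in row]
    diff = sum(bg_pixels) / len(bg_pixels) - sum(fg_pixels) / len(fg_pixels)
    shift = diff * weight
    # Apply the compensation per pixel, clipping to the valid channel range.
    return [[min(max_value, max(0, p + shift)) for p in row]
            for row in foreground_channel]
```

Running the same routine on the V layer with the brightness difference, and on the S layer with diff_s, matches the two adjustments described above.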
Step 307: and carrying out video synthesis based on the adjusted target image frame to obtain and expose a target video.
In some embodiments of the present invention, performing video synthesis based on the adjusted target image frame to obtain and expose the target video may be implemented by:
respectively carrying out video coding on the basis of the adjusted target image frame in each video fragment to obtain a target video fragment corresponding to each video fragment; fusing target video fragments corresponding to the video fragments to obtain the target video; and responding to a corresponding playing instruction to play or push stream the target video so as to expose the target video.
In some embodiments of the present invention, the video synthesis based on the adjusted target image frame to obtain the target video may be implemented by:
in response to the image frame being a key image frame of the video to be implanted, locating a solid area of a non-key image frame of the video to be implanted in a target tracking manner; implanting the foreground in the adjusted target image frame into the entity area of the non-key image frame in an affine transformation mode to obtain a target non-key image frame; and carrying out video coding on the adjusted target image frame and the target non-key image frame to obtain the target video.
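The affine placement step can be sketched with plain linear algebra: three tracked point correspondences between the key frame's entity area and a non-key frame's entity area determine a 2×3 affine matrix, which is then applied to the foreground's coordinates. A minimal illustration with a hand-rolled 3×3 solver; a real pipeline would use a library routine and more robust point tracking.

```python
def affine_from_points(src, dst):
    """Solve for the 2x3 affine matrix mapping three src points to three dst points."""
    def solve3(rows, rhs):                     # Cramer's rule for a 3x3 linear system
        def det(m):
            return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                    - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                    + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
        d = det(rows)
        out = []
        for col in range(3):
            m = [list(r) for r in rows]
            for i in range(3):
                m[i][col] = rhs[i]
            out.append(det(m) / d)
        return out

    rows = [[x, y, 1] for x, y in src]
    a, b, c = solve3(rows, [x for x, _ in dst])   # x' = a*x + b*y + c
    d_, e, f = solve3(rows, [y for _, y in dst])  # y' = d*x + e*y + f
    return [[a, b, c], [d_, e, f]]

def apply_affine(matrix, point):
    """Map one foreground coordinate into the non-key frame's entity area."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])
```

Because only the key frame's foreground is color-harmonized, warping that adjusted foreground into each tracked non-key frame keeps the appearance consistent across the shot without re-running the adjustment per frame.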
In some embodiments of the invention, the method further comprises:
monitoring exposure parameters of the target video; determining the Manrong visual effect index corresponding to the implanted multimedia information according to the exposure parameters of the target video; and adjusting the playing of the target video according to the Manrong visual effect index corresponding to the implanted multimedia information. After the implantation of the multimedia information is completed, the implanted multimedia information is exposed along with the exposure of the target video, and in this process the exposure of the implanted multimedia information needs to be monitored; when any parameter in the Manrong visual effect index corresponding to the implanted multimedia information does not reach the corresponding parameter threshold, the playing of the target video can be adjusted, for example by replacing the video to be implanted. For example, when the display proportion of the implanted multimedia information on the screen is lower than a preset proportion threshold of 0.1 (the ratio of the area of the multimedia information to the area of the display screen), the video to be implanted is selected again and the corresponding multimedia information is re-implanted, thereby ensuring that the user to whom the multimedia information belongs can push the multimedia information to the audience in a timely manner.
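The display-proportion check from the example above can be sketched directly. This assumes exposure is reported as an embedded-material area per frame, which is a simplification of the monitored exposure parameters; the 0.1 threshold follows the text.

```python
def exposure_ratios(placements, screen_area):
    """placements: list of (frame_index, embedded_area) observations."""
    return {i: area / screen_area for i, area in placements}

def frames_needing_reembed(placements, screen_area, min_ratio=0.1):
    """Frames whose on-screen display proportion falls below the threshold,
    triggering re-selection of the video to be implanted."""
    return [i for i, r in exposure_ratios(placements, screen_area).items()
            if r < min_ratio]
```

A production monitor would aggregate these ratios per placement rather than per frame, but the trigger condition (ratio below threshold → re-implant) is the one described.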
In some embodiments of the invention, the method further comprises:
and sending the exposure parameters of the target video during playing to monitoring computer equipment so as to realize that the monitoring computer equipment monitors the exposure of the target video. Therefore, exposure of the target video is monitored by the third-party computer equipment.
Fig. 8 is an optional flowchart of a method for embedding multimedia information in a video according to an embodiment of the present invention, and it can be understood that the steps shown in fig. 8 may be executed by various computer devices that operate an apparatus for embedding multimedia information in a video, for example, a dedicated terminal, a computer device, or a computer device cluster with a multimedia information processing function. The following is a description of the steps shown in fig. 8. Wherein steps 301 to 307 are as described above and are not described again.
Step 308: and (4) auditing the formed target video, judging whether the formed target video passes or not, if so, executing step 309, otherwise, executing step 310.
In some embodiments of the present invention, the formed target video needs to be audited before the target video is exposed, so as to avoid misleading the embedded multimedia information to the audience of the target video.
Step 309: and exposing the target video.
Step 310: and implanting corresponding multimedia information again.
Step 311: monitoring exposure parameters of the target video;
step 312: and sending the exposure parameters of the target video during playing to a monitoring server.
In the following, the processing procedure of implanting multimedia information into a video in the present application is described with the implanted multimedia information as an advertisement, where a client is capable of playing the video, fig. 9 is an optional flowchart of implanting an advertisement into a video in the embodiment of the present invention, where a computer device is represented as a corresponding server, as shown in fig. 9, and specifically includes the following steps:
first stage
Step 901: a client requests a source address of video playing;
step 902: the video server requests an advertisement implantation fragment;
step 903: a synchronous serial port controller of the video server requests an advertisement fragment to be implanted from an advertisement server;
step 904: the advertisement server obtains an advertisement fragment to be implanted according to the unique ID number of the vv;
step 905: the advertisement server sends the acquired advertisement segment to be implanted to the synchronous serial port controller.
Step 906: the synchronous serial port controller sends the advertisement segment to be implanted to the video server.
Step 907: and the video client obtains a complete video playing address.
Second stage
Step 1001: the SDK of the advertisement requests to merge the interfaces and carries the unique identification id of the vv.
Step 1002: the synchronous serial port controller requests the report information of the embedded advertisement.
Step 1003: and the advertisement server obtains the implanted advertisement information according to the unique identification id of the vv.
Step 1004: and returning the report information of the implanted advertisement.
Step 1005: and the synchronous serial port controller returns the report information of the implanted advertisement.
Step 1006: the SDK executes the reporting logic.
In the process of fig. 9, fig. 10A and 10B are schematic diagrams illustrating an alternative process for embedding advertisements in videos according to an embodiment of the present invention; the process of embedding advertisement information in video is shown in fig. 10A, and includes the following steps:
step 1101: and (4) analyzing and selecting the inventory.
In this processing step, the bottom layer interfaces with an image recognition service and analyzes the advertisement opportunities of each video medium, including the number of advertisement units, the advertiser industry suitable for each advertisement unit, and the scene characteristics. Fig. 11 is a schematic view of the aggregate display of advertisements to be implanted by different advertisers; the corresponding video client display interface may be as shown in fig. 11, and the server may automatically select the corresponding advertisement according to the number of advertisement units, the advertiser industry suitable for each advertisement unit, and the scene characteristics.
Step 1102: ad resource reservation.
The advertisement scheduling period and the corresponding resources can be preset, and the content package is associated, wherein the associated information in the content package comprises: cid + vid list, brand category, scene category, material category, advertisement ID, and number of advertisement units and remark information for each advertisement ID.
Taking the entity area as a photo frame area and a television screen as an example, referring to fig. 14, fig. 14 is a schematic view of implanting an advertisement to be implanted in the photo frame area according to an embodiment of the present invention; in fig. 14, reference numeral 91 denotes the photo frame area before the advertisement is implanted, and reference numeral 92 denotes the photo frame area after the advertisement is implanted. Referring to fig. 15, fig. 15 is a schematic view of implanting an advertisement to be implanted in the screen area of a television according to an embodiment of the present invention; in fig. 15, reference numeral 11 denotes the screen area before the advertisement is implanted, and reference numeral 12 denotes the screen area after the advertisement is implanted. Through step 1102, all advertisement units in the scene category, such as those denoted by reference numerals 11 and 12, can be reserved for the advertisement to be implanted by user A.
Step 1103: and (5) putting and checking the advertisement order.
The advertisement server can submit advertisement materials and monitoring links according to standard material specifications, support an interface (API) mode, add a monitoring link to the fixed first-frame report, add a plurality of corresponding monitoring links, and select between platform-side and SDK monitoring modes. Fig. 12 is a schematic diagram of the video client displaying an optional detection process, and fig. 13 is a schematic diagram of the video client displaying an optional audit process. After the advertisement is implanted, the corresponding audit stage is entered to guarantee the legal compliance of the advertisement materials. After the audit is passed, the order does not take effect online like a conventional order; instead, advertisement implantation is initiated, and the advertisement implantation service implants the advertisement material into the specified segment according to the information of the material and the content package.
Step 1104: and (5) examining and verifying the implantation result.
In some embodiments of the present invention, after the advertisement implantation result comes out, the advertisement implantation review stage is entered, and the advertisement implantation review comprises 2 review steps. The method specifically comprises the following steps:
1) Displayed content: id, album name (cid) _ Xth episode, vid, scene category, material category, audit status, preview, and operation (pass/fail); all episode segments belonging to the same order and the same vid.
2) Audit status: content editing to be audited, business editing to be audited, content editing audit failed, business editing audit failed, final audit passed and to be online, and online.
3) Content preview: the video segment with the implantation.
4) Audit result: pass/fail, including a description of the reasons for failure.
In an optional embodiment, the audit process comprises content editing audit followed by business editing audit. When the process reaches content editing audit, the corresponding content editors need to be notified by mail and WeChat (for example, the corresponding editing components are triggered to realize editing in instant messaging software or mail editing), and only those people have the authority to audit; when the process reaches business editing audit, the corresponding business editors need to be notified by mail and WeChat, and only those people have the authority to audit.
5) Content notification: "There is a dynamic advertisement (album name (cid) _ Xth episode) awaiting content (or business) editing audit; please process it as soon as possible." For mail, the mail title reads: "(Please audit) newly added dynamic embedded advertisement to be audited". When the audit fails, the corresponding person who initiated the implantation (i.e., the person who packaged it) is notified by mail that the audit failed, with the mail title: "Dynamic implanted advertisement audit failed: the dynamic implanted advertisement (Ad ID = XXX) of album name (cid) _ Xth episode failed the audit for reason: XXXX. Auditor: (instant messaging software)."
Step 1105: and advertising a corresponding video server menu.
The advertisement-side video server supports conventional advertisement targeting capability and a percentage-interruption menu mode. It interfaces with the SDK and the video playing client: for each request of the video client, it returns all the selected advertisement unit ids, the start/end time of each advertisement unit, and the monitoring link of each advertisement unit to the advertisement SDK and the video client; the advertisement SDK performs third-party monitoring and data reporting when the specified time is reached, and the video client requests the corresponding server (cloud server) to obtain and play the corresponding advertisement implantation segment.
Step 1106: client display and data reporting monitoring. And the video client displays the corresponding video fragments at the point of arrival according to the order and the fragments returned by the video server.
Therefore, the processing flow from advertisement inventory analysis → putting → implanting → auditing → displaying and reporting in the process of implanting the advertisement into the video is realized.
The embodiments of the invention have the following beneficial effects:
1) By taking the entity area of an image frame of the video to be implanted as the background and the corresponding image to be implanted as the foreground, the corresponding multimedia information is implanted into the entity area of the image frame to obtain a target image frame, thereby realizing automatic implantation of multimedia information into the video to be implanted and improving information implantation efficiency.
2) By adjusting the image parameters of the foreground in the target image frame so that they match the image parameters of the background, the foreground and background of the video image frame are automatically blended after the corresponding multimedia information is implanted; the multimedia information implanted into the target video is thus integrated with the video content, improving user acceptance of the implanted multimedia information when watching the video.
3) The method overcomes the defect in the related art that, once a video is online, modifying or adding multimedia information can only rely on manual stream-pressing means. Furthermore, it overcomes the defect that neither the pre-implantation mode nor the stream-pressing mode in the related art allows a third party to monitor the exposure of the implanted multimedia information.

Claims (15)

1. A method for implanting multimedia information into a video, the method comprising:
segmenting a video to be implanted to obtain a plurality of video segments;
respectively performing entity identification on the plurality of video segments, and determining entity areas corresponding to entities in image frames of the video segments;
acquiring multimedia information to be implanted corresponding to the entity area;
implanting the multimedia information to be implanted into the entity area of the image frame by taking the entity area of the image frame as a background and the corresponding multimedia information to be implanted as a foreground, to obtain a target image frame;
adjusting image parameters of the foreground in the target image frame so that the image parameters of the foreground match the image parameters of the background; and
performing video synthesis based on the adjusted target image frame to obtain a target video.
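As a rough illustration of the claimed compositing step (entity area as background, multimedia image as foreground), here is a minimal sketch using numpy arrays as stand-in frames; the region coordinates and array shapes are invented for the example, and no codec or recognition model is involved.

```python
import numpy as np

# Minimal compositing sketch: paste the foreground (stand-in ad image) into
# the entity area of the frame (stand-in background) to form a target frame.

def implant(frame, foreground, top_left):
    """Return a copy of frame with foreground pasted at (row, col) top_left."""
    row, col = top_left
    out = frame.copy()
    h, w = foreground.shape[:2]
    out[row:row + h, col:col + w] = foreground
    return out

frame = np.zeros((8, 8, 3), dtype=np.uint8)           # stand-in video frame
foreground = np.full((2, 2, 3), 200, dtype=np.uint8)  # stand-in ad image
target = implant(frame, foreground, (3, 3))
```

A real implementation would blend at the region's actual perspective and edges rather than pasting an axis-aligned block.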
2. The method according to claim 1, wherein the segmenting the video to be implanted to obtain a plurality of video segments comprises:
performing shot change detection on video frames of the video to be implanted to obtain a plurality of shots corresponding to the video to be implanted;
segmenting the video to be implanted based on the plurality of shots to obtain a video segment corresponding to each shot; and
determining coordinates of a start position and an end position of the entity area in the video segment corresponding to each shot.
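Shot change detection as recited above is commonly done by comparing frame histograms; the toy sketch below (numpy only, with an arbitrary bin count and threshold) illustrates that general idea, not the patent's actual detector.

```python
import numpy as np

# Toy shot-change detector: declare a cut where the normalized gray-level
# histograms of consecutive frames differ by more than a threshold.

def shot_boundaries(frames, threshold=0.5):
    """Return indices i such that frame i starts a new shot."""
    cuts, prev = [], None
    for i, frame in enumerate(frames):
        hist, _ = np.histogram(frame, bins=16, range=(0, 256))
        hist = hist / hist.sum()
        # half the L1 distance between normalized histograms is in [0, 1]
        if prev is not None and 0.5 * np.abs(hist - prev).sum() > threshold:
            cuts.append(i)
        prev = hist
    return cuts

dark = np.zeros((4, 4), dtype=np.uint8)
bright = np.full((4, 4), 250, dtype=np.uint8)
cuts = shot_boundaries([dark, dark, bright, bright])
```

Each detected boundary would then delimit one video segment, within which the entity area's start and end coordinates are recorded.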
3. The method according to claim 1, wherein the performing video synthesis based on the adjusted target image frame to obtain the target video comprises:
respectively performing video coding based on the adjusted target image frames in each video segment to obtain a target video segment corresponding to each video segment;
fusing the target video segments corresponding to the video segments to obtain the target video; and
playing or stream-pushing the target video in response to a corresponding playing instruction, so as to expose the target video.
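The fusing step can be pictured as ordered concatenation of the per-shot target segments. Representing an encoded segment as a list of frame labels keyed by segment index is an assumption made purely for illustration.

```python
# Fuse per-segment encodings back into one target video: order segments by
# their index and concatenate their frames.

def fuse_segments(encoded):
    """encoded: {segment_index: frames}; return frames in playback order."""
    return [frame for idx in sorted(encoded) for frame in encoded[idx]]

fused = fuse_segments({1: ["f2", "f3"], 0: ["f0", "f1"]})
```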
4. The method according to claim 1, wherein the acquiring multimedia information to be implanted corresponding to the entity area comprises:
determining display parameters of the entity area; and
determining, according to the display parameters of the entity area, multimedia information to be implanted that matches the display parameters of the entity area, wherein the display parameters of the entity area comprise:
scene parameters of the entity area, type parameters of the entity area, and display unit parameters of the entity area.
5. The method of claim 1, further comprising:
performing image interception on the target image frame to obtain an intercepted image containing the foreground, wherein an area of the intercepted image is a constant multiple of an area of the foreground.
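The constant-multiple interception can be realized by scaling the foreground bounding box about its center so that the crop area is k times the foreground area. The (x, y, w, h) box convention and the omission of frame-border clamping are simplifying assumptions of this sketch.

```python
import math

# Expand a foreground box so the cropped area is k times the foreground area,
# keeping the same center. Each side is scaled by sqrt(k).

def crop_box(fg_box, k=4.0):
    x, y, w, h = fg_box
    s = math.sqrt(k)
    nw, nh = w * s, h * s
    return (x - (nw - w) / 2, y - (nh - h) / 2, nw, nh)

box = crop_box((10, 10, 4, 2), k=4.0)
```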
6. The method of claim 5, wherein the adjusting image parameters of the foreground in the target image frame so that the image parameters of the foreground and the image parameters of the background match comprises:
in response to the image parameters comprising brightness, acquiring a brightness difference between the foreground and the background of the intercepted image;
converting the foreground of the intercepted image into a hue-saturation-value (HSV) image; and
adjusting the brightness of pixel points of the V channel in the HSV image based on the brightness difference.
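A minimal sketch of the V-channel adjustment: shift the foreground's V values by the mean brightness difference to the background and clip to the valid range. The arrays below stand in for already-converted V channels; the actual RGB-to-HSV conversion is omitted.

```python
import numpy as np

# Match foreground brightness to background by shifting the V channel by the
# mean difference, clipped to the valid 0..255 range.

def match_brightness(fg_v, bg_v):
    diff = float(bg_v.mean()) - float(fg_v.mean())
    return np.clip(fg_v.astype(np.float64) + diff, 0, 255).astype(np.uint8)

fg_v = np.full((2, 2), 100, dtype=np.uint8)  # stand-in foreground V channel
bg_v = np.full((2, 2), 160, dtype=np.uint8)  # stand-in background V channel
adjusted = match_brightness(fg_v, bg_v)
```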
7. The method according to claim 1, wherein the video composition based on the adjusted target image frame to obtain the target video comprises:
in response to the image frame being a key image frame of the video to be implanted, locating the entity area of a non-key image frame of the video to be implanted by means of target tracking;
implanting the foreground of the adjusted target image frame into the entity area of the non-key image frame by means of affine transformation to obtain a target non-key image frame; and
performing video coding on the adjusted target image frame and the target non-key image frame to obtain the target video.
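Affine placement of the foreground into a tracked entity area amounts to applying a 2x3 affine matrix to the foreground's corner points. The pure-translation matrix below is a made-up stand-in for whatever transform the tracker would estimate between key and non-key frames.

```python
import numpy as np

# Map foreground corner points into a non-key frame with a 2x3 affine matrix.

def apply_affine(points, M):
    """Apply 2x3 affine matrix M to an (N, 2) array of (x, y) points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return homogeneous @ M.T

M = np.array([[1.0, 0.0, 5.0],
              [0.0, 1.0, -2.0]])   # translate by (+5, -2)
corners = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 3.0], [0.0, 3.0]])
warped = apply_affine(corners, M)
```

A full pipeline would warp the whole foreground image (not just corners) into the tracked quadrilateral before re-encoding.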
8. The method of claim 3, further comprising:
monitoring exposure parameters of the target video;
determining a visual effect index corresponding to the implanted multimedia information according to the exposure parameters of the target video; and
adjusting the playing of the target video according to the visual effect index corresponding to the implanted multimedia information.
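A toy version of this monitoring loop: derive a visual-effect index from the exposure parameters and gate playback on it. The completion-rate definition of the index and the 0.3 floor are invented purely for illustration.

```python
# Derive a simple visual-effect index from exposure parameters and decide
# whether to keep serving the placement.

def effect_index(exposures, completions):
    """Fraction of exposures that played to completion (0.0 if none)."""
    return completions / exposures if exposures else 0.0

def adjust_playing(index, floor=0.3):
    """Keep serving the placement only while the index stays at/above the floor."""
    return "keep" if index >= floor else "pause"

decision = adjust_playing(effect_index(100, 25))
```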
9. The method of claim 8, further comprising:
sending the exposure parameters of the target video during playing to a monitoring server, so that the monitoring server monitors the exposure of the target video.
10. An apparatus for implanting multimedia information into a video, the apparatus comprising:
a video processing module, configured to segment a video to be implanted to obtain a plurality of video segments; wherein
the video processing module is configured to respectively perform entity identification on the plurality of video segments and determine entity areas corresponding to entities in image frames of the video segments;
the video processing module is configured to acquire multimedia information to be implanted corresponding to the entity area;
the video processing module is configured to implant the multimedia information to be implanted into the entity area of the image frame by taking the entity area of the image frame as a background and the corresponding multimedia information to be implanted as a foreground, to obtain a target image frame;
the video processing module is configured to adjust image parameters of the foreground in the target image frame so that the image parameters of the foreground match the image parameters of the background; and
the video processing module is configured to perform video synthesis based on the adjusted target image frame to obtain a target video.
11. The apparatus of claim 10, wherein:
the video processing module is configured to perform shot change detection on video frames of the video to be implanted to obtain a plurality of shots corresponding to the video to be implanted;
the video processing module is configured to segment the video to be implanted based on the plurality of shots to obtain a video segment corresponding to each shot; and
the video processing module is configured to determine coordinates of a start position and coordinates of an end position of the entity area in the video segment corresponding to each of the shots.
12. The apparatus of claim 10, wherein:
the apparatus further comprises a video playing module configured to expose the target video;
the video processing module is configured to respectively perform video coding based on the adjusted target image frames in each video segment to obtain a target video segment corresponding to each video segment; and
the video processing module is configured to fuse the target video segments corresponding to the video segments to obtain the target video.
13. The apparatus of claim 10,
the video processing module is used for determining display parameters of the entity area;
the video processing module is configured to determine, according to the display parameter of the entity region, to-be-implanted multimedia information that matches the display parameter of the entity region, where the display parameter of the entity region includes:
scene parameters of the entity area, type parameters of the entity area, and display unit parameters of the entity area.
14. A computer device, characterized in that the computer device comprises:
a memory for storing executable instructions;
a processor, configured to implement the method for implanting multimedia information into a video according to any one of claims 1 to 9 when executing the executable instructions stored in the memory.
15. A computer-readable storage medium storing executable instructions, wherein the executable instructions, when executed by a processor, implement the method for implanting multimedia information into a video according to any one of claims 1 to 9.
CN201910679130.5A 2019-07-25 2019-07-25 Method and device for implanting multimedia information into video, computer equipment and storage medium Active CN112312195B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910679130.5A CN112312195B (en) 2019-07-25 2019-07-25 Method and device for implanting multimedia information into video, computer equipment and storage medium


Publications (2)

Publication Number Publication Date
CN112312195A true CN112312195A (en) 2021-02-02
CN112312195B CN112312195B (en) 2022-08-26

Family

ID=74329105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910679130.5A Active CN112312195B (en) 2019-07-25 2019-07-25 Method and device for implanting multimedia information into video, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112312195B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2958487A1 (en) * 2010-04-06 2011-10-07 Alcatel Lucent A METHOD OF REAL TIME DISTORTION OF A REAL ENTITY RECORDED IN A VIDEO SEQUENCE
CN103607554A (en) * 2013-10-21 2014-02-26 无锡易视腾科技有限公司 Fully-automatic face seamless synthesis-based video synthesis method
CN106991641A (en) * 2017-03-10 2017-07-28 北京小米移动软件有限公司 It is implanted into the method and device of picture


Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113709559A (en) * 2021-03-05 2021-11-26 腾讯科技(深圳)有限公司 Video dividing method and device, computer equipment and storage medium
CN113709559B (en) * 2021-03-05 2023-06-30 腾讯科技(深圳)有限公司 Video dividing method, device, computer equipment and storage medium
CN112989074A (en) * 2021-04-23 2021-06-18 腾讯科技(深圳)有限公司 Multimedia information recommendation method and device, electronic equipment and storage medium
CN113518256A (en) * 2021-07-23 2021-10-19 腾讯科技(深圳)有限公司 Video processing method and device, electronic equipment and computer readable storage medium
CN113518256B (en) * 2021-07-23 2023-08-08 腾讯科技(深圳)有限公司 Video processing method, video processing device, electronic equipment and computer readable storage medium
CN113627363A (en) * 2021-08-13 2021-11-09 百度在线网络技术(北京)有限公司 Video file processing method, device, equipment and storage medium
CN113627363B (en) * 2021-08-13 2023-08-15 百度在线网络技术(北京)有限公司 Video file processing method, device, equipment and storage medium
CN113691835A (en) * 2021-10-21 2021-11-23 星河视效科技(北京)有限公司 Video implantation method, device, equipment and computer readable storage medium
CN113691835B (en) * 2021-10-21 2022-01-21 星河视效科技(北京)有限公司 Video implantation method, device, equipment and computer readable storage medium
WO2024104286A1 (en) * 2022-11-14 2024-05-23 北京字跳网络技术有限公司 Video processing method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN112312195B (en) 2022-08-26

Similar Documents

Publication Publication Date Title
CN112312195B (en) Method and device for implanting multimedia information into video, computer equipment and storage medium
US10869103B2 (en) Process and apparatus for advertising component placement
US8860803B2 (en) Dynamic replacement of cinematic stage props in program content
US9038100B2 (en) Dynamic insertion of cinematic stage props in program content
US11425442B2 (en) System and methods for distributing commentary streams corresponding to a broadcast event
CN110300316B (en) Method and device for implanting push information into video, electronic equipment and storage medium
US20160366492A1 (en) Advertisement modification method and apparatus
EP2523192B1 (en) Dynamic replacement of cinematic stage props in program content
US20130091519A1 (en) Processing and apparatus for advertising component placement
JP6179866B2 (en) How to set frequency limit for addressable content
CN108476344B (en) Content selection for networked media devices
US10726443B2 (en) Deep product placement
CN110264268B (en) Advertisement putting device, method, equipment and storage medium thereof
US20230119695A1 (en) Online advertising and promotional coordination system
US20170041648A1 (en) System and method for supplemental content selection and delivery
US20170041649A1 (en) Supplemental content playback system
US20170041644A1 (en) Metadata delivery system for rendering supplementary content
CN112927024B (en) Advertisement putting method, system, device, electronic equipment and readable storage medium
CA2973717A1 (en) System and method for supplemental content selection and delivery
US20230388563A1 (en) Inserting digital contents into a multi-view video
CN114630172A (en) Multimedia information processing method and device, electronic equipment and storage medium
US20130033569A1 (en) Combining a three-dimensional message with three-dimensional video content
US11979645B1 (en) Dynamic code integration within network-delivered media
EP3534614A1 (en) Video distribution process and means for real-time publishing video streams

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40037961

Country of ref document: HK

SE01 Entry into force of request for substantive examination
GR01 Patent grant