CN117494080A - Watermark rendering method and device for media information, electronic equipment and storage medium - Google Patents

Watermark rendering method and device for media information, electronic equipment and storage medium

Info

Publication number
CN117494080A
CN117494080A CN202210884952.9A
Authority
CN
China
Prior art keywords
media information
area
watermark
region
display interface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210884952.9A
Other languages
Chinese (zh)
Inventor
左洪涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202210884952.9A priority Critical patent/CN117494080A/en
Publication of CN117494080A publication Critical patent/CN117494080A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
    • G06F21/16Program or content traceability, e.g. by watermarking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris

Abstract

The application provides a watermark rendering method and apparatus for media information, an electronic device, and a computer-readable storage medium. The method includes: when a terminal displays media information on a media information display interface, locating the region of interest of a target object in the media information display interface; determining a watermark region of the media information based on the region of interest, where the watermark region is contained in the display areas of the media information display interface other than the region of interest; rendering the watermark on the media information displayed in the watermark region while detecting the region of interest; and, when the detection result indicates that the region of interest has changed, updating the watermark rendering result of the media information according to the changed region of interest. Through the application, the exposure of the watermark can be guaranteed and the user's viewing experience improved.

Description

Watermark rendering method and device for media information, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of internet technologies, and in particular, to a method and apparatus for watermark rendering of media information, an electronic device, and a computer readable storage medium.
Background
In the related art, watermark information is added to media information either at a fixed physical position or by directly rendering the delivered watermark according to a specific animation rule (such as lateral movement). Although directly adding the watermark to the media information helps prevent content leakage, it can interfere with the user's viewing.
Disclosure of Invention
The embodiment of the application provides a watermark rendering method and apparatus for media information, an electronic device, a computer-readable storage medium, and a computer program product, which can guarantee the exposure of the watermark and improve the user's viewing experience.
The technical scheme of the embodiment of the application is realized as follows:
The embodiment of the application provides a watermark rendering method for media information, which includes the following steps:
when a terminal displays media information on a media information display interface, locating the region of interest of a target object in the media information display interface;
determining a watermark region of the media information based on the region of interest, where the watermark region is contained in the display areas of the media information display interface other than the region of interest;
rendering the watermark on the media information displayed in the watermark region, and detecting the region of interest;
and, when the detection result indicates that the region of interest has changed, updating the watermark rendering result of the media information according to the changed region of interest.
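The geometric core of these steps, choosing a watermark region that lies outside the region of interest, can be sketched as follows. This is a minimal illustration, not the patented method: the `Rect` type, the band-picking heuristic, and all names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int


def watermark_region(interface: Rect, focus: Rect) -> Rect:
    """Choose a display band outside the region of interest: here simply
    whichever of the horizontal strips above/below the focus rect is taller."""
    top = Rect(interface.x, interface.y, interface.w, focus.y - interface.y)
    bottom_y = focus.y + focus.h
    bottom = Rect(interface.x, bottom_y, interface.w,
                  interface.y + interface.h - bottom_y)
    return top if top.h >= bottom.h else bottom
```

A real implementation would consider all four bands around the focus rectangle and re-run this selection every time the detected region of interest moves.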
An embodiment of the present application provides a watermark rendering apparatus for media information, including:
a positioning module, configured to locate the region of interest of a target object in the media information display interface when the terminal displays media information on that interface;
a determining module, configured to determine a watermark region of the media information based on the region of interest, where the watermark region is contained in the display areas of the media information display interface other than the region of interest;
a rendering module, configured to render the watermark on the media information displayed in the watermark region and to detect the region of interest;
and an updating module, configured to update the watermark rendering result of the media information according to the changed region of interest when the detection result indicates that the region of interest has changed.
In the above scheme, the positioning module is further configured to obtain the convergence point of the target object's binocular lines of sight in the media information display interface; draw a target graphic of a target length centred on the convergence point; and take the region of the media information display interface covered by the target graphic as the region of interest.
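A minimal sketch of this scheme, assuming the "target graphic" is a square of side `target_len` centred on the convergence point and clamped to the interface bounds (the function and parameter names are illustrative, not from the patent):

```python
def region_of_interest(gaze_x: float, gaze_y: float, target_len: float,
                       screen_w: float, screen_h: float):
    """Square of side target_len centred on the binocular convergence point,
    clamped to the bounds of the display interface.
    Returns (left, top, width, height)."""
    half = target_len / 2
    left = max(0.0, gaze_x - half)
    top = max(0.0, gaze_y - half)
    right = min(screen_w, gaze_x + half)
    bottom = min(screen_h, gaze_y + half)
    return (left, top, right - left, bottom - top)
```

Clamping matters near the edges: a gaze point in a corner yields a smaller region rather than one extending off-screen.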
In the above scheme, the rendering module is further configured to obtain the target object's new region of interest in the watermark-rendered media information; compare the new region of interest with the original one to obtain a first coincidence rate between them; compare the first coincidence rate with a first coincidence-rate threshold to obtain a comparison result; and, based on the comparison result, determine a detection result indicating whether the region of interest has changed.
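One plausible reading of the first coincidence rate is the overlap area of the two regions divided by the area of the original region; the sketch below uses that reading, with an assumed threshold of 0.5 (both the formula and the threshold are illustrative assumptions):

```python
def coincidence_rate(a, b):
    """Overlap area of two (x, y, w, h) rectangles divided by the area of a:
    one plausible reading of the 'first coincidence rate'."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ox = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))  # x-axis overlap
    oy = max(0.0, min(ay + ah, by + bh) - max(ay, by))  # y-axis overlap
    return (ox * oy) / (aw * ah) if aw * ah else 0.0


def focus_changed(new_focus, old_focus, threshold=0.5):
    """Detection result: the region of interest is considered changed when
    the coincidence rate falls below the (assumed) threshold."""
    return coincidence_rate(new_focus, old_focus) < threshold
```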
In the above scheme, the updating module is further configured to determine, when the detection result indicates that the region of interest has changed, the position of the changed region of interest in the media information display interface; adjust the watermark region based on that position to obtain a new watermark region; and re-render the watermark on the media information based on the new watermark region.
In the above scheme, when the detection result indicates that the region of interest is unchanged but the size of the media information display interface has changed, the device further includes a first comparison module configured to compare the media information display interface with the region of interest to obtain a second coincidence rate between them, and to compare the second coincidence rate with a second coincidence-rate threshold; when the second coincidence rate reaches the threshold, the watermark is rendered on the media information displayed across the media information display interface.
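A hedged sketch of this resize-time decision, treating the second coincidence rate as the fraction of the interface covered by the region of interest (the 0.8 threshold and all names are assumptions, not values from the patent):

```python
def _area(r):
    return r[2] * r[3]


def _overlap(a, b):
    ox = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    oy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    return ox * oy


def watermark_policy(interface, focus, threshold=0.8):
    """After a resize: if the region of interest covers (nearly) the whole
    interface, watermark the whole interface; otherwise keep avoiding it."""
    rate = _overlap(interface, focus) / _area(interface)
    return "whole_interface" if rate >= threshold else "outside_focus"
```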
In the above scheme, the first comparison module is further configured to determine, when the second coincidence rate is smaller than the second coincidence-rate threshold, a target watermark region based on the region of interest, and to render the watermark on the media information displayed in the target watermark region.
In the above scheme, the determining module is further configured to obtain the areas of the media information display interface other than the region of interest, and determine those other areas as the watermark region.
In the above scheme, the determining module is further configured to obtain the areas of the media information display interface other than the region of interest; compare the size of those other areas with an area threshold; and, when the comparison shows that the size of the other areas exceeds the area threshold, select part of the other areas as the watermark region.
In the above scheme, the determining module is further configured to obtain the size of the watermark to be rendered in the watermark region, and, based on that size, select from the other areas at least one rectangular area matching the watermark to be rendered as the watermark region.
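One simple way to enumerate rectangular candidates matching the watermark size is to test the four axis-aligned bands around the region of interest, as sketched below (the band decomposition is an assumption, not the claimed selection procedure):

```python
def candidate_watermark_bands(interface, focus, wm_w, wm_h):
    """Axis-aligned bands around the focus rect (above/below/left/right)
    that are large enough to hold a wm_w x wm_h watermark.
    All rectangles are (x, y, w, h) tuples."""
    ix, iy, iw, ih = interface
    fx, fy, fw, fh = focus
    bands = [
        (ix, iy, iw, fy - iy),                    # above the focus area
        (ix, fy + fh, iw, iy + ih - (fy + fh)),   # below the focus area
        (ix, iy, fx - ix, ih),                    # left of the focus area
        (fx + fw, iy, ix + iw - (fx + fw), ih),   # right of the focus area
    ]
    return [b for b in bands if b[2] >= wm_w and b[3] >= wm_h]
```

When the list is empty, the watermark does not fit outside the region of interest, which corresponds to the fallback cases discussed above.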
In the above scheme, the positioning module is further configured to collect a face image of the target object through an image acquisition device; perform line-of-sight analysis on the face image to obtain a line-of-sight analysis result indicating the target object's line of sight; and, based on the result, determine the area of the media information display interface at which the target object gazes as the region of interest.
In the above scheme, the positioning module is further configured to perform feature extraction on the face image to obtain a sight-line vector and a head-pose vector of the target object; adjust the sight-line vector based on the head-pose vector to obtain a target sight-line vector indicating the target object's line of sight, and take the target sight-line vector as the line-of-sight analysis result; locate the region at which the target object's eyes gaze according to that result; and, when the region lies on the media information display interface, determine the corresponding area of the interface as the region of interest.
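The pose adjustment could, for instance, be a weighted blend of the two vectors followed by normalisation; the sketch below assumes that form (the blend weight `alpha` is an arbitrary illustration, not the patent's correction):

```python
import math


def target_sight_vector(sight, head_pose, alpha=0.3):
    """Adjust the raw eye-sight vector by the head-pose vector: a simple
    weighted blend, then normalisation to a unit direction vector."""
    blended = [s + alpha * h for s, h in zip(sight, head_pose)]
    norm = math.sqrt(sum(c * c for c in blended))
    return [c / norm for c in blended]
```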
In the above scheme, the positioning module is further configured to collect voice data of the target object through an audio acquisition device; perform semantic analysis on the voice data to obtain a semantic analysis result, the result containing a target text word that indicates the position on the media information display interface at which the target object is gazing; and determine the region of interest of the target object in the interface based on the position indicated by the target text word in the semantic analysis result.
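As a toy stand-in for the semantic analysis, the sketch below maps a hypothetical position keyword found in the transcript to a quadrant of the interface (the keyword table and quadrant mapping are invented for illustration; a real system would use an actual speech and semantics pipeline):

```python
POSITION_WORDS = {  # hypothetical keyword -> quadrant origin (as fractions)
    "top left": (0.0, 0.0),
    "top right": (0.5, 0.0),
    "bottom left": (0.0, 0.5),
    "bottom right": (0.5, 0.5),
}


def focus_from_speech(transcript: str, screen_w: int, screen_h: int):
    """Map a position keyword in the transcript to a quadrant of the
    display interface, returned as an (x, y, w, h) region of interest."""
    for phrase, (fx, fy) in POSITION_WORDS.items():
        if phrase in transcript.lower():
            return (int(fx * screen_w), int(fy * screen_h),
                    screen_w // 2, screen_h // 2)
    return None  # no position keyword found
```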
An embodiment of the present application provides an electronic device, including:
a memory for storing executable instructions;
a processor, configured to implement the watermark rendering method for media information provided by the embodiments of the present application when executing the executable instructions stored in the memory.
The embodiment of the application provides a computer-readable storage medium storing executable instructions which, when executed by a processor, implement the watermark rendering method for media information provided by the embodiments of the present application.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the electronic device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the electronic device executes the watermark rendering method of the media information provided by the embodiment of the application.
The embodiment of the application has the following beneficial effects:
By locating the region of interest of the target object on the media information, a watermark region outside the region of interest is determined, the watermark is rendered on the media information displayed in that watermark region, and the watermark rendering result is updated whenever the region of interest changes. In this way, the content area the target object is watching is avoided in real time and the watermark is rendered elsewhere, which guarantees normal rendering of the watermark while ensuring that the watermark information does not interfere with the user's viewing.
Drawings
Fig. 1 is a schematic architecture diagram of a watermark rendering system 100 of media information provided in an embodiment of the present application;
fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 3 is a flowchart of a watermark rendering method of media information according to an embodiment of the present application;
FIG. 4 is a schematic diagram of determining a region of interest provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of determining a region of interest provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of a watermark region of media information determined based on a region of interest provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of a watermark region of media information determined based on a region of interest provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of a watermark region of media information determined based on a region of interest provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of watermark rendered media information provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of watermark rendered media information provided by an embodiment of the present application;
fig. 11 is a flowchart of a watermark rendering method of media information according to an embodiment of the present application;
FIG. 12 is a schematic view of a user viewing area provided by an embodiment of the present application;
Fig. 13 is a schematic flow chart of watermark rendering of media information according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail with reference to the accompanying drawings, and the described embodiments should not be construed as limiting the present application, and all other embodiments obtained by those skilled in the art without making any inventive effort are within the scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict.
In the following description, the terms "first", "second", "third" and the like are merely used to distinguish similar objects and do not represent a particular ordering of the objects, it being understood that the "first", "second", "third" may be interchanged with a particular order or sequence, as permitted, to enable embodiments of the application described herein to be practiced otherwise than as illustrated or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the present application.
Before further describing embodiments of the present application in detail, the terms and expressions that are referred to in the embodiments of the present application are described, and are suitable for the following explanation.
1) Client: also called the user terminal, a program that corresponds to a server and provides local services to the user. Apart from applications that run only locally, a client is generally installed on an ordinary user machine and must cooperate with a server to run; that is, it requires a corresponding server and service programs in the network to provide the service, so a specific communication connection must be established between the client and the server to ensure the application runs normally.
2) Watermark: a means of protecting digital information; it can be understood as adding digital information to media information (e.g., images, video) to achieve authentication, copyright protection, and the like for digital multimedia. Typically, the watermark information is hidden in the host file (e.g., the media information) without affecting the host file's objectivity and integrity. Watermarks fall into soft watermarks and hard watermarks: a hard watermark is encoded into the video image and requires no additional rendering during playback, while a soft watermark is an independent picture or piece of text that must be rendered dynamically during playback, for example a brand logo or personal account information rendered on a layer above the video as it plays.
3) Eye-tracking technology: a technology that uses a camera to track and locate the movement trajectory of the eyeball, and identifies, from the eyeball's biological characteristics, the area of the screen on which the eye is focused.
The inventor found that, in the related art, when media information is presented, watermarks such as personal names or enterprise names are rendered on the media information to deter content leakage, so that if the information is leaked the source of the leak is easy to trace. However, the current practice is to render the watermark in a fixed, specific area, which degrades the user's viewing experience.
Based on this, the embodiments of the present application provide a watermark rendering method, apparatus, electronic device, computer-readable storage medium, and computer program product for media information, which can identify the physical area of the screen on which the user's attention is focused and dynamically avoid that area when the terminal renders the watermark, thereby avoiding interference with the user's viewing.
Referring to fig. 1, fig. 1 is a schematic architecture diagram of a watermark rendering system 100 of media information provided in an embodiment of the present application, used to implement an application scenario of watermark rendering for media information (for example, during an enterprise live broadcast on an internal network, the user's gaze area on the content in the player can be determined, and the enterprise's or individual's watermark rendered on the media information in the display areas outside that gaze area). A terminal (terminal 400 is shown as an example) is connected to the server 200 through a network 300, which may be a wide area network, a local area network, or a combination of the two. The terminal 400 is used by the user to run the client 401, which displays media information on a display interface (media information display interface 401-1 is shown as an example); the terminal 400 and the server 200 are connected to each other through a wired or wireless network.
The terminal 400 is configured to display media information on a media information display interface;
the server 200 is configured to locate the region of interest of a target object in the media information display interface when the terminal displays media information on that interface; determine a watermark region of the media information based on the region of interest, the watermark region being contained in the display areas of the interface other than the region of interest; render the watermark on the media information displayed in the watermark region; and send the watermark-rendered media information to the terminal 400;
the terminal 400 is further configured to display the watermark-rendered media information;
the server 200 is further configured to detect the region of interest for the displayed watermark-rendered media information, and, when the detection result indicates that the region of interest has changed, update the watermark rendering result of the media information according to the changed region of interest and send the updated watermark-rendered media information to the terminal 400.
In some embodiments, the server 200 may be a stand-alone physical server, a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN, Content Delivery Network), and basic cloud computing services such as big data and artificial intelligence platforms. The terminal 400 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a set-top box, a smart voice interaction device, a smart home appliance, a car terminal, an aircraft, or a mobile device (e.g., a mobile phone, a portable music player, a personal digital assistant, a dedicated messaging device, a portable game device, a smart speaker, or a smart watch), etc. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in the embodiments of the present application.
Referring to fig. 2, fig. 2 is a schematic structural diagram of an electronic device provided in an embodiment of the present application. In practical application, the electronic device may be the server 200 or the terminal 400 shown in fig. 1. The electronic device shown in fig. 2 includes: at least one processor 410, a memory 450, at least one network interface 420, and a user interface 430. The various components in the terminal 400 are coupled together by a bus system 440. It is understood that the bus system 440 is used to enable connected communication between these components. In addition to the data bus, the bus system 440 includes a power bus, a control bus, and a status signal bus; however, for clarity of illustration, the various buses are all labeled in fig. 2 as the bus system 440.
The processor 410 may be an integrated circuit chip having signal processing capabilities, such as a general-purpose processor (e.g., a microprocessor or any conventional processor), a digital signal processor (DSP, Digital Signal Processor), another programmable logic device, discrete gate or transistor logic, or discrete hardware components.
The user interface 430 includes one or more output devices 431, including one or more speakers and/or one or more visual displays, that enable presentation of the media content. The user interface 430 also includes one or more input devices 432, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
Memory 450 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard drives, optical drives, and the like. Memory 450 optionally includes one or more storage devices physically remote from processor 410.
Memory 450 includes volatile memory or nonvolatile memory, and may also include both volatile and nonvolatile memory. The non-volatile memory may be read-only memory (ROM, Read Only Memory) and the volatile memory may be random access memory (RAM, Random Access Memory). The memory 450 described in the embodiments herein is intended to comprise any suitable type of memory.
In some embodiments, memory 450 is capable of storing data to support various operations, examples of which include programs, modules and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 451 including system programs, e.g., framework layer, core library layer, driver layer, etc., for handling various basic system services and performing hardware-related tasks, for implementing various basic services and handling hardware-based tasks;
a network communication module 452 for accessing other electronic devices via one or more (wired or wireless) network interfaces 420, the exemplary network interface 420 comprising: Bluetooth, wireless compatibility authentication (WiFi), and universal serial bus (USB, Universal Serial Bus), etc.;
A presentation module 453 for enabling presentation of information (e.g., a user interface for operating peripheral devices and displaying content and information) via one or more output devices 431 (e.g., a display screen, speakers, etc.) associated with the user interface 430;
an input processing module 454 for detecting one or more user inputs or interactions from one of the one or more input devices 432 and translating the detected inputs or interactions.
In some embodiments, the apparatus provided in the embodiments of the present application may be implemented in software. Fig. 2 shows a watermark rendering apparatus 455 for media information stored in the memory 450, which may be software in the form of a program or a plug-in, and includes the following software modules: a positioning module 4551, a determining module 4552, a rendering module 4553, and an updating module 4554. These modules are logical, and thus may be arbitrarily combined or further split according to the functions implemented. The functions of the respective modules are described below.
In other embodiments, the apparatus provided in the embodiments of the present application may be implemented in hardware. The watermark rendering apparatus for media information provided in the embodiments of the present application may be a processor in the form of a hardware decoding processor programmed to perform the watermark rendering method for media information provided in the embodiments of the present application; for example, the processor in the form of a hardware decoding processor may employ one or more application specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), or other electronic components.
In some embodiments, the terminal or the server may implement the watermark rendering method of the media information provided by the embodiments of the present application by running a computer program. For example, the computer program may be a native program or a software module in an operating system; a native application (APP), i.e., a program that must be installed in the operating system to run, such as an instant messaging APP or a web browser APP; an applet, i.e., a program that only needs to be downloaded into a browser environment to run; or an applet that can be embedded in any APP. In general, the computer program may be any form of application, module, or plug-in.
Based on the above description of the watermark rendering system and the electronic device for media information, the watermark rendering method for media information provided in the embodiments of the present application is described below. In practical implementation, the method may be carried out by a terminal or a server alone, or by the terminal and the server cooperatively; in the following, it is illustrated as being executed by the server 200 in fig. 1 alone. Referring to fig. 3, fig. 3 is a flowchart of a watermark rendering method of media information according to an embodiment of the present application; the description follows the steps shown in fig. 3.
In step 101, the server locates a region of interest of the target object in the media information presentation interface when the terminal presents the media information on the media information presentation interface.
It should be noted that the target object may be the user of the terminal; the media information may be a video, a picture, or similar content; and the media information display interface may be the terminal's screen, or, when the media information is presented by a projection device, a curtain or another external surface on which the projected media information is displayed. When the user runs a media display application on the terminal to watch media information, the media information being watched is displayed on the terminal's media information display interface, and the server locates the user's region of interest in that interface in real time while the user watches.
In practical implementation, there are various ways for the server to locate the region of interest of the target object in the media information display interface while the terminal displays media information; these are described next.
In some embodiments, the process of locating the region of interest of the target object in the media information display interface specifically includes: acquiring a face image of the target object through an image acquisition device; performing line-of-sight analysis on the face image to obtain a line-of-sight analysis result indicating the line of sight of the target object; and determining, based on the line-of-sight analysis result, the area of the media information display interface at which the target object gazes as the region of interest.
In actual implementation, the face image of the target object is collected by the image acquisition device as follows: while the user watches the media information, the image acquisition device of the terminal operates and detects the face image of the user in real time. Here, the image acquisition device may be a camera, such as a monocular camera, a binocular camera, a depth camera, or a three-dimensional (3D) camera. Illustratively, a camera in scan mode is invoked to scan the target object in the camera field of view in real time and generate images, i.e., face images, at a specified frame rate. Alternatively, the image acquisition device may be a radar device such as a lidar or a millimeter-wave radar. A lidar is a radar device that detects characteristic data such as the position, speed, posture, and shape of a target object by emitting a laser beam; a millimeter-wave radar is a radar device that detects in the millimeter-wave band. The radar device transmits detection signals to the target object in real time, receives echo signals reflected by the target object, and determines the characteristic data of the target object from the difference between the detection signals and the echo signals. When the radar device employs multiple transmitters and receivers, the acquired image is a three-dimensional point cloud image, which serves as the face image.
In actual implementation, after the face image of the user is acquired, line-of-sight analysis is performed on it; the ways of obtaining the line-of-sight analysis result include, but are not limited to, eye-tracking technology, the electromagnetic coil method, electrooculography, and the contact lens method.
In some embodiments, the process of performing line-of-sight analysis on a face image to obtain a line-of-sight analysis result specifically includes extracting features of the face image to obtain a line-of-sight vector and a head pose vector of a target object; based on the head posture vector, the line-of-sight vector is adjusted to obtain a target line-of-sight vector for indicating the line of sight of the target object, and the target line-of-sight vector is used as a line-of-sight analysis result.
In actual implementation, the process of extracting features from the face image to obtain the line-of-sight vector and the head posture vector of the target object specifically includes: extracting features from the face image to obtain the regions of both eyes of the target object and the infrared reflection spots of the pupils; determining the head posture vector of the target object based on the regions of both eyes; and determining the line-of-sight vector of the target object from the infrared reflection spots of the pupils.
The method specifically includes: first identifying the positions of the user's eyes from the face image, then identifying the user's eyeballs, and determining the infrared reflection spot coordinates of the left eyeball and of the right eyeball in the face image; calculating a left-eye line-of-sight vector from the infrared reflection spot coordinates of the left eyeball and the center coordinates of the left pupil, and a right-eye line-of-sight vector from the infrared reflection spot coordinates of the right eyeball and the center coordinates of the right pupil; and determining the line-of-sight vector of the user from the left-eye and right-eye line-of-sight vectors.
It should be noted that, during image acquisition, the image acquisition device illuminates the user with an external auxiliary light source, for example an infrared light source, and detects and records the reflected infrared light from different areas. The light reflected from the user's eye areas is then identified from the eye positions; the emitted infrared light striking the eyes forms infrared reflection spots. Once the light reflected from the eye areas is identified, the coordinates of the left-eye and right-eye infrared reflection spots in the face image are obtained, and the left-eye and right-eye line-of-sight vectors are calculated by combining these coordinates with the left and right pupil center coordinates determined from the eye positions, so as to determine the line-of-sight vector.
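To make the computation above concrete, the following is a minimal sketch of the per-eye line-of-sight vector calculation. The 2D image-coordinate representation and all function names are hypothetical, since the embodiment does not prescribe a particular implementation.

```python
# Sketch: per-eye gaze vector from the infrared reflection spot (glint)
# and the pupil center, then averaged over both eyes. Coordinates are
# hypothetical 2D image coordinates (x, y).

def eye_gaze_vector(glint, pupil_center):
    """Gaze vector of one eye: offset of the pupil center from the
    infrared reflection spot."""
    return (pupil_center[0] - glint[0], pupil_center[1] - glint[1])

def combined_gaze_vector(left_glint, left_pupil, right_glint, right_pupil):
    """Average the left-eye and right-eye vectors into one gaze vector."""
    lx, ly = eye_gaze_vector(left_glint, left_pupil)
    rx, ry = eye_gaze_vector(right_glint, right_pupil)
    return ((lx + rx) / 2.0, (ly + ry) / 2.0)
```

A production system would additionally calibrate these image-space offsets to screen coordinates; that step is omitted here.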
The process of obtaining the head posture vector of the target object specifically includes: first identifying the positions of the user's eyes from the face image and determining the contours of the eyes; then determining the left-eye area and the right-eye area in the face image based on the eye contours; and calculating the difference between the left-eye area and the right-eye area and looking up a preset mapping between binocular area differences and head posture vectors to determine the head posture vector corresponding to the difference.
It should be noted that when the user's head is turned to the left, the left-eye area in the acquired face image is generally smaller than the right-eye area, and when the head is turned to the right, the left-eye area is generally larger than the right-eye area. Therefore, the left-eye and right-eye images can first be extracted from the face image, their areas calculated, and a head posture vector representing the head deflection direction determined from the area difference: if the difference is greater than zero, the face is turned to the right; if it is less than zero, the face is turned to the left. Since the mapping between the binocular area difference and the head posture vector is preset, the head posture vector can be determined quickly from the currently calculated area difference.
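The area-difference lookup described above can be sketched as follows. The binning of differences and the pose labels are illustrative assumptions; the actual preset mapping between area differences and head posture vectors is implementation-defined.

```python
# Sketch: look up a head-pose value from the binocular area difference
# (left-eye area minus right-eye area) in a preset mapping. `bins` is a
# hypothetical list of ((low, high), pose) entries covering the expected
# range of differences; diff > 0 means the face is turned right.

def head_pose_from_eye_areas(left_area, right_area, bins):
    diff = left_area - right_area
    for (low, high), pose in bins:
        if low <= diff < high:
            return pose
    raise ValueError("area difference outside the preset mapping")
```

In practice the mapping would return calibrated posture vectors rather than labels; labels are used here only to keep the sketch short.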
It should be noted that, for the same line-of-sight vector, the resulting region of interest differs with the head posture; the line-of-sight vector is therefore adjusted based on the head posture vector to obtain the target line-of-sight vector, which indicates the line of sight the user would have if the head directly faced the media information display interface.
In actual implementation, after the line-of-sight analysis result is obtained, the area of the media information display interface at which the target object gazes is determined as the region of interest as follows: the area at which the target object's eyes gaze is located according to the line-of-sight analysis result; when this area lies on the media information display interface, the corresponding area of the interface is determined as the region of interest. When the area does not lie on the media information display interface, line-of-sight analysis is performed again on a newly acquired face image to relocate the area at which the target object's eyes gaze.
In other embodiments, the process of locating the region of interest of the target object in the media information display interface specifically includes: collecting voice data of the target object through an audio acquisition device; performing semantic analysis on the voice data to obtain a semantic analysis result, where the result includes a target text word indicating the gazing position of the target object on the media information display interface; and determining the region of interest of the target object in the media information display interface based on that position. Specifically, in response to an audio acquisition instruction triggered by the target object, the voice data of the target object is collected by the audio acquisition device and converted into text data. Semantic analysis is then performed on the text data to obtain text words representing the gazing position of the target object. Each text word is matched against the content of the displayed media information; when the matching succeeds, the text word is taken as the target text word, and the gazing position of the target object on the media information display interface is determined from the position of the target text word in the displayed media information. That position is taken as the region of interest of the target object in the media information display interface.
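The matching step above, which finds a spoken text word inside the displayed content, can be sketched as follows. The function name and the use of a plain substring search are illustrative assumptions; a real system would match against laid-out text with known screen positions.

```python
# Sketch: match candidate text words (extracted from the voice data by
# semantic analysis) against the displayed content. Returns the first
# matching word and its character offset, which a real implementation
# would map to a screen position; names are hypothetical.

def locate_gaze_by_speech(candidate_words, displayed_text):
    for word in candidate_words:
        idx = displayed_text.find(word)
        if idx != -1:
            return word, idx   # target text word and its position
    return None                # no match: gaze position undetermined
```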
As an example, when the media display interface is a conference interface inside an enterprise and the media information is a conference document, the explanation of the conference document is collected by an audio acquisition device, such as a recording device, in response to a user-triggered acquisition instruction. Semantic analysis of the explanation yields the text words representing the gazing position of the target object; for example, if the explanation is "let us look at the abstract part", the word "abstract" can serve as such a text word. The text word is then matched against the content of the displayed conference document; when the matching result shows that the word "abstract" appears in the document, it is taken as the target text word. Its position in the displayed document determines the gazing position of the target object on the conference interface, and thus the region of interest of the user on the conference interface is determined.
As an example, when the media display interface is a song recording interface in a singing application and the media information is song lyrics, the user's singing is recorded by an audio acquisition device, such as a song recording device, in response to a user-triggered singing recording instruction. Semantic analysis of the singing content yields a text word representing the user's gazing position; the text word is matched against the displayed lyrics, and when the matching result shows that the word appears in the lyrics, it is taken as the target text word. The position of the target text word in the displayed lyrics determines the gazing position of the target object on the song recording interface, and thus the region of interest of the user on that interface.
It should be noted that the user may trigger the audio acquisition instruction through a triggering operation on an audio acquisition function item, or by voice through a voice-activated function item, for example by saying "collect voice". The ways of triggering the audio acquisition instruction include, but are not limited to, these two, and the embodiments of the present application are not limited in this regard.
In some embodiments, the convergence point of the target object's binocular lines of sight on the media information display interface may also be obtained; a target image is drawn according to a target length with the convergence point as the image center, and the corresponding area of the target image on the media information display interface is taken as the region of interest. Both the target image and the target length are preset.
As an example, referring to fig. 4, fig. 4 is a schematic diagram of determining a region of interest provided in an embodiment of the present application. Based on fig. 4, when the target image is a rectangle, the target lengths are the length h and width w of the rectangle; a rectangle of length h and width w is drawn with the point P(X, Y) as its center, and the area of the media information display interface corresponding to the drawn rectangle is taken as the region of interest.
As an example, referring to fig. 5, fig. 5 is a schematic diagram of determining a region of interest provided in an embodiment of the present application. Based on fig. 5, when the target image is a circle, the target length is the radius r of the circle; a circle of radius r is drawn with the point P(X, Y) as its center, and the area of the media information display interface corresponding to the drawn circle is taken as the region of interest.
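The two drawing operations of figs. 4 and 5 can be sketched as follows, assuming screen coordinates with the convergence point P(X, Y) as the center. The (left, top, right, bottom) rectangle representation is an assumption of this sketch.

```python
# Sketch: build the region of interest around the binocular convergence
# point, as a rectangle of width w and height h (fig. 4) or as a circle
# of radius r (fig. 5). Names and representations are hypothetical.

def rect_region(center, w, h):
    """Axis-aligned rectangle centred on the convergence point,
    returned as (left, top, right, bottom)."""
    x, y = center
    return (x - w / 2.0, y - h / 2.0, x + w / 2.0, y + h / 2.0)

def in_circle_region(center, r, point):
    """True if `point` lies in the circular region of radius r
    around the convergence point."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    return dx * dx + dy * dy <= r * r
```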
In step 102, a watermark area of the media information is determined based on the region of interest; the watermark area is contained in the display areas of the media information display interface other than the region of interest.
It should be noted that the watermark area may be all or part of the display area other than the region of interest in the media information display interface.
In some embodiments, when the watermark region is the entire display region other than the region of interest in the media information display interface, determining the watermark region of the media information based on the region of interest specifically includes: obtaining the areas of the media information display interface other than the region of interest, and determining those areas as the watermark region. For example, referring to fig. 6, fig. 6 is a schematic diagram of a watermark area of media information determined based on a region of interest according to an embodiment of the present application. Based on fig. 6, after a rectangular region of interest is determined, all areas of the media information display interface except the rectangle are used as the watermark area for watermark rendering.
In other embodiments, when the watermark region is a partial region of the display regions other than the region of interest, determining the watermark region of the media information based on the region of interest specifically includes: obtaining the areas of the media information display interface other than the region of interest; comparing the size of those areas with an area threshold; and, when the comparison result shows that their size is larger than the area threshold, selecting a partial area from them as the watermark region. For example, referring to fig. 7, fig. 7 is a schematic diagram of a watermark area of media information determined based on a region of interest according to an embodiment of the present application. Based on fig. 7, after a rectangular region of interest is determined, the areas of the media information display interface except the rectangle are determined, and a rectangular area is selected from them as the watermark area for watermark rendering.
In actual implementation, when the watermark region is a partial region of the display regions other than the region of interest, several areas may be selected as the watermark region. Specifically, the size of the watermark to be rendered in the watermark region is obtained, and at least one rectangular area matching the watermark to be rendered is selected from the other areas as the watermark region based on that size. For example, referring to fig. 8, fig. 8 is a schematic diagram of a watermark area of media information determined based on a region of interest according to an embodiment of the present application. Based on fig. 8, after a rectangular region of interest is determined, the areas of the media information display interface except the rectangle are determined, and four rectangular areas are selected from them as watermark areas for watermark rendering.
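One way to realize the selection shown in fig. 8 is to split the remainder of the interface into rectangular strips around the region of interest, sketched below. This strip decomposition is an illustrative choice, not the only partition the embodiment permits.

```python
# Sketch: split the display interface minus the region of interest into
# up to four rectangular strips (above, below, left, right of the ROI),
# which can then serve as candidate watermark areas. Rectangles are
# (left, top, right, bottom); empty strips are dropped.

def watermark_strips(interface, roi):
    il, it, ir, ib = interface
    rl, rt, rr, rb = roi
    strips = [
        (il, it, ir, rt),  # strip above the region of interest
        (il, rb, ir, ib),  # strip below
        (il, rt, rl, rb),  # strip to the left
        (rr, rt, ir, rb),  # strip to the right
    ]
    return [(l, t, r, b) for l, t, r, b in strips if r > l and b > t]
```

A real implementation would further filter these strips by the size of the watermark to be rendered, as the paragraph above describes.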
It should be noted that, at least one rectangular area adapted to the watermark to be rendered may be selected from other areas as the watermark area, or at least one circular area adapted to the watermark to be rendered may be selected from other areas as the watermark area, and the shape of the selected area includes, but is not limited to, the above two types of areas, and the embodiment of the present application is not limited thereto.
In step 103, watermark rendering is performed on the media information displayed in the watermark area, and the region of interest is detected.
It should be noted that detecting the region of interest specifically means detecting changes in the region of interest, where the changes include, but are not limited to, the size of the region of interest and its position on the media information display interface.
As an example, detecting a change in the size of the region of interest specifically includes: acquiring a new region of interest of the target object for the watermark-rendered media information; obtaining the size of the new region of interest and the size of the original region of interest; taking the difference between the two sizes; comparing the absolute value of the difference with a difference threshold to obtain a comparison result; and determining, based on the comparison result, a detection result indicating whether the region of interest has changed.
As an example, detecting a change in the position of the region of interest specifically includes: acquiring a new region of interest of the target object for the watermark-rendered media information; comparing the new region of interest with the original region of interest to obtain a first coincidence rate between them; comparing the first coincidence rate with a first coincidence rate threshold to obtain a comparison result; and determining, based on the comparison result, a detection result indicating whether the region of interest has changed. Specifically, the position of the new region of interest and the position of the original region of interest are obtained and compared to yield the coincidence rate of the two positions, the coincidence rate is compared with the first coincidence rate threshold, and the detection result indicating whether the region of interest has changed is determined from the comparison result.
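The first coincidence rate and the resulting detection decision can be sketched as follows, taking the coincidence rate as the intersection area of the two regions divided by the area of the original region of interest. This formula is an assumption of the sketch, since the embodiment does not fix one.

```python
# Sketch: coincidence rate between the original region of interest and
# the newly acquired one, and the change decision against the preset
# first coincidence rate threshold. Rectangles: (left, top, right, bottom).

def coincidence_rate(a, b):
    """Intersection area of b with a, divided by the area of a."""
    left = max(a[0], b[0]); top = max(a[1], b[1])
    right = min(a[2], b[2]); bottom = min(a[3], b[3])
    inter = max(0.0, right - left) * max(0.0, bottom - top)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    return inter / area_a if area_a else 0.0

def region_changed(old_roi, new_roi, threshold):
    """Detection result: changed when the rate falls below the threshold."""
    return coincidence_rate(old_roi, new_roi) < threshold
```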
It should be noted that the region of interest may be detected in real time or periodically. For the new region of interest, its acquisition time point is later than the time point at which the original region of interest was determined; that is, the user views the media information corresponding to the new region of interest on the media information display interface after viewing the media information corresponding to the original region of interest.
In step 104, when the detection result indicates that the region of interest has changed, the watermark rendering result of the media information is updated according to the changed region of interest.
It should be noted that the detection result may indicate either that the region of interest has changed or that it has not. When a change in the position of the region of interest is detected, the detection result is determined from the comparison result as follows: when the first coincidence rate is smaller than the first coincidence rate threshold, the detection result indicates that the region of interest has changed; when the first coincidence rate reaches the threshold, the detection result indicates that it has not changed. When a change in the size of the region of interest is detected: when the absolute value of the difference reaches the difference threshold, the detection result indicates that the region of interest has changed; when the absolute value is smaller than the threshold, it indicates that the region of interest has not changed. The first coincidence rate threshold and the difference threshold may be preset; for example, with a first coincidence rate threshold of 90%, a first coincidence rate of 80% yields a detection result indicating a change, while a rate of 95% yields a detection result indicating no change.
Here, when the detection result indicates that the region of interest has not changed, the current watermark rendering result is retained; when the detection result indicates that the region of interest has changed, the watermark rendering result of the media information is updated based on the new region of interest.
In actual implementation, the watermark rendering result of the media information is updated according to the changed or new region of interest as follows: when the detection result indicates that the region of interest has changed, the position of the changed region of interest on the media information display interface is determined; the watermark area is adjusted based on that position to obtain a new watermark area; and watermark rendering is performed on the media information again based on the new watermark area.
For example, referring to fig. 9, fig. 9 is a schematic diagram of media information subjected to watermark rendering according to an embodiment of the present application, based on fig. 9, a dashed box 901 is a region of interest of a user, and surrounding regions are watermark regions, that is, after watermark rendering is performed on the watermark regions, watermark rendering results shown in fig. 9 are presented. When the detection result indicates that the attention area changes, referring to fig. 10, fig. 10 is a schematic diagram of watermark-rendered media information provided in the embodiment of the present application, based on fig. 10, the dashed box 1001 is the changed attention area, after determining the position of the changed attention area, the watermark area is adjusted based on the position of the changed attention area, so as to obtain a new watermark area, and watermark rendering is performed on the media information again based on the new watermark area, so as to present the watermark rendering result as shown in fig. 10.
It should be noted that, the process of adjusting the watermark area based on the position of the changed attention area is the same as the process of determining the watermark area of the media information based on the attention area, which is not described in detail in the embodiment of the present application.
In some embodiments, when the detection result indicates that the region of interest has not changed, the size of the media information display interface may also be detected. If the size of the media information display interface changes, for example when the interface is windowed, the interface is compared with the region of interest after the region of interest is detected, so as to obtain a second coincidence rate between the media information display interface and the region of interest. The second coincidence rate is compared with a second coincidence rate threshold: when the second coincidence rate reaches the threshold, watermark rendering is performed on the media information displayed on the whole media information display interface; when it is smaller than the threshold, a target watermark area is determined based on the region of interest and watermark rendering is performed on the media information displayed in that area.
Specifically, the second coincidence rate may be obtained as the ratio of the size of the region of interest to the size of the media information display interface. The ratio is compared with the second coincidence rate threshold: when the ratio reaches the threshold, watermark rendering is performed on the media information displayed on the whole media information display interface; when the ratio is smaller than the threshold, a target watermark area is determined based on the region of interest and watermark rendering is performed on the media information displayed in that area. It should be noted that the second coincidence rate threshold may be preset; for example, with a threshold of 50%, a ratio of 0.7, i.e., a second coincidence rate of 70%, leads to watermark rendering of the media information displayed on the whole interface, while a ratio of 0.3, i.e., a second coincidence rate of 30%, leads to determining a target watermark area based on the region of interest and performing watermark rendering there.
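A minimal sketch of the second coincidence rate decision follows, assuming the rate is the ratio of the region-of-interest area to the interface area as described above; function names are hypothetical.

```python
# Sketch: second coincidence rate as ROI area / interface area, and the
# decision whether to watermark the whole interface or only a target
# watermark area. Rectangles are (left, top, right, bottom).

def second_coincidence_rate(roi, interface):
    def area(r):
        return (r[2] - r[0]) * (r[3] - r[1])
    return area(roi) / area(interface)

def render_whole_interface(roi, interface, threshold):
    """True: watermark the whole interface; False: watermark only a
    target area derived from the region of interest."""
    return second_coincidence_rate(roi, interface) >= threshold
```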
In actual implementation, when the second coincidence rate reaches the second coincidence rate threshold, the second coincidence rate between the media information display interface and the region of interest can be reduced by shrinking the region of interest until the rate falls below the threshold. Adjusting the size of the region of interest specifically includes: obtaining a spare length and the convergence point of the target object's binocular lines of sight on the media information display interface; drawing the target image according to the spare length with the convergence point as the image center; and taking the corresponding area of the target image on the media information display interface as the adjusted region of interest, so that the coincidence rate between the adjusted region of interest and the interface can be obtained for subsequent processing.
It should be noted that there may be multiple spare lengths; they may be tried from largest to smallest until the coincidence rate between the region of interest adjusted with the current spare length and the media information display interface is smaller than the second coincidence rate threshold. When the coincidence rate obtained with every spare length still reaches the second coincidence rate threshold, watermark rendering is performed directly on the media information displayed on the whole media information display interface.
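The largest-to-smallest spare-length search can be sketched as follows. The square region drawn from each spare length and the `None` fallback (meaning the whole interface is watermarked) are assumptions of this sketch.

```python
# Sketch: try preset spare lengths from largest to smallest until the
# square region drawn around the convergence point covers less than
# `threshold` of the interface; return None when no spare length works,
# in which case the whole interface is watermarked.

def shrink_region_of_interest(center, spare_lengths, interface, threshold):
    def area(r):
        return (r[2] - r[0]) * (r[3] - r[1])
    x, y = center
    for s in sorted(spare_lengths, reverse=True):
        roi = (x - s / 2.0, y - s / 2.0, x + s / 2.0, y + s / 2.0)
        if area(roi) / area(interface) < threshold:
            return roi
    return None
```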
Next, description will be continued on the watermark rendering method of the media information provided in the embodiment of the present application, referring to fig. 11, fig. 11 is a schematic flow diagram of the watermark rendering method of the media information provided in the embodiment of the present application, and based on fig. 11, the watermark rendering method of the media information provided in the embodiment of the present application is cooperatively implemented by a client and a server.
In step 201, the client side responds to the triggering operation for the media information to display the media information on the media information display interface.
In actual implementation, the client may be a video playing client installed on the terminal, or a live conference client used for conducting live conference broadcasts, and the media information may be a document, a video, a picture, or the like. When the client is a video playing client, the triggering operation for the media information may be the user triggering a video playing function item in the client's human-computer interaction interface, causing the terminal to play the corresponding video; when the client is a live conference client, the triggering operation may be the user triggering a conference start function item in that interface, causing the terminal to start a live conference.
Step 202, when the client displays the media information on the media information display interface, the face image of the target object is acquired through the image acquisition device.
In practical implementation, the image acquisition device may be a camera, and the face image of the target object may be captured by a camera in communication with the terminal. After the camera captures the face image of the target object, it transmits the face image to the terminal, which automatically uploads it to the client.
Step 203, the client sends the collected face image to the server.
Step 204, the server locates the attention area of the target object in the media information display interface based on the received face image.
In step 205, other areas on the media information presentation interface, except for the region of interest, are acquired and determined as watermark areas.
Step 206, the server performs watermark rendering on the media information displayed in the watermark area.
Step 207, the media information after watermark rendering is sent to the client.
Step 208, the client displays the media information after watermark rendering on the media information display interface, and acquires the face image of the target object again when displaying the media information after watermark rendering.
Step 209, the client sends the acquired face image to the server.
Step 210, the server determines a new attention area of the target object to the media information rendered by the watermark based on the received face image acquired again.
Step 211, comparing the new region of interest with the region of interest, and when the comparison result indicates that the region of interest is changed, adjusting the watermark region based on the position of the new region of interest to obtain a new watermark region.
In actual implementation, the new region of interest may be compared with the original region of interest in terms of at least one of position and size. Here, when the comparison result indicates that the coincidence rate between the new region of interest and the original region of interest is smaller than a preset coincidence rate threshold, it is determined that the region of interest has changed.
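One possible way to realize this comparison — a sketch under the assumption that the coincidence rate is an intersection-over-union of the two rectangles, which the embodiments do not mandate — is:

```python
# Illustrative sketch. Rectangles are (x, y, width, height).

def coincidence_rate(a, b):
    """Intersection-over-union of two rectangles as one reading of the
    'first coincidence rate' in step 211."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ow = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    oh = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ow * oh
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def region_changed(new, old, threshold=0.6):
    # The region counts as changed when the overlap falls below the preset
    # threshold (threshold value is hypothetical).
    return coincidence_rate(new, old) < threshold
```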
Step 212, updating the watermark rendering result of the media information based on the new watermark region.
Step 213, the server sends the updated watermark rendering result of the media information to the client.
Step 214, the client displays the watermark rendering result of the updated media information on the media information display interface.
By applying the embodiments of the present application, the region of interest of the target object on the media information is located, the watermark region outside the region of interest is determined, watermark rendering is performed on the media information displayed in the watermark region, and the watermark rendering result of the media information is updated when the region of interest changes. In this way, the content region watched by the target object is avoided in real time and the watermark is rendered at other positions, which ensures both that the watermark is rendered normally and that the watermark information does not interfere with the user's viewing.
In the following, an exemplary application of the embodiments of the present application in a practical application scenario will be described.
In the related art, in order to prevent content leakage during live broadcasts on a company intranet, a personal or enterprise name (i.e., a watermark) is rendered on the player, so that the leakage source can be easily traced if information is leaked. However, with methods that render the watermark in fixed regions, such as at fixed horizontal and vertical intervals, key content is easily blocked, resulting in a poor user experience.
On this basis, the embodiments of the present application provide a watermark rendering method for media information, in which the terminal, when rendering the watermark, dynamically avoids the physical area of the screen (the region of interest) on which the user's eyes are focused, so as not to interfere with the user's viewing.
The solution is described in detail below. Specifically, the solution requires the viewing device to have camera capture capability and an eye tracking system (image acquisition device); the eye tracking system captures the user viewing area (region of interest), and the watermark is rendered based on the captured user viewing area.
In practical implementation, when the video (media information) starts to play, the eye tracking system (image acquisition device) needs to be started. Its operation life cycle is the same as that of the player, and it stops running when the player finishes playing the video. During operation, the camera needs to be started to continuously acquire face data (face images) of the user and identify the eyeballs in the data, so as to obtain the video content area (region of interest) watched by the user. The system runs at a certain acquisition frame rate to prevent an excessive frame rate from degrading device performance: since the response time of the human eye is about 40 milliseconds, an acquisition frame rate of no more than 25 frames per second is sufficient, while too low a frame rate causes noticeable delay. The frame rate may therefore be chosen between 5 and 25 frames per second according to the performance of the device.
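The frame-rate reasoning above can be sketched directly; the 40 ms figure comes from the text, while the helper names are illustrative:

```python
# With a human-eye response time of roughly 40 ms, 1000 / 40 = 25 fps is a
# sensible upper bound on the acquisition frame rate.

EYE_RESPONSE_MS = 40
MAX_FPS = 1000 // EYE_RESPONSE_MS  # 25 fps

def capture_interval_ms(fps):
    """Milliseconds between successive face-image captures for a given rate
    in the 5-25 fps range suggested by the text."""
    if not 5 <= fps <= MAX_FPS:
        raise ValueError("capture rate should stay between 5 and 25 fps")
    return 1000 / fps
```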
In practical implementation, the identified user viewing area is generally represented as a rectangular local area. It may be represented by the complete set of corner coordinates, or by the coordinates of any one point plus the width and height of the rectangle. As shown in fig. 12, which is a schematic diagram of the user viewing area provided in the embodiments of the present application, using the lower-left corner point A with coordinates (X, Y) together with the width and height, point B has coordinates (X, Y+height), point C has coordinates (X+width, Y+height), and point D has coordinates (X+width, Y). Other coordinate representations, such as labeling all four corner coordinates, may also be used; here, the user viewing area is represented by the coordinates of point A plus the width and height of the rectangle.
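The corner relationships described above reduce to simple arithmetic; the sketch below reproduces them (function name is illustrative):

```python
# From the lower-left corner A = (x, y) plus width and height, the remaining
# corners B, C, D of the viewing rectangle follow as in fig. 12.

def corners_from_anchor(x, y, width, height):
    a = (x, y)                    # lower-left  (A)
    b = (x, y + height)           # upper-left  (B)
    c = (x + width, y + height)   # upper-right (C)
    d = (x + width, y)            # lower-right (D)
    return a, b, c, d
```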
In actual implementation, after the camera and the eye tracking system acquire the user viewing area, the system needs to transmit the data to the rendering system in a timely manner. If the data is consistent with the last acquisition, watermark redrawing is not triggered; otherwise, watermark redrawing needs to be triggered. The operation of the system is independent of the state of the player; that is, even if the player is in a paused state, the watermark still needs to be redrawn whenever it is inconsistent with the last acquisition, so that the content remains protected by the watermark even while the video is paused.
It should be noted that, for ease of maintenance, the original watermark coordinate computing system should not be changed; instead, logic for judging legal watermark positions should be added. After the original watermark coordinates are computed, it is judged which of the coordinates fall within the user viewing area. If all of the coordinates fall within the user viewing area, coordinate recalculation needs to be triggered to obtain a new set of coordinates; if only some of the watermarks fall within the user viewing area, the coordinates within the user viewing area are not rendered, while the watermarks at the other coordinates are rendered normally.
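A minimal sketch of this legality check, assuming points and an `(x, y, width, height)` viewing rectangle (the `recompute` callback stands in for whatever recalculation the original coordinate system performs; all names are hypothetical):

```python
# Keep the original coordinate computation untouched; filter its output.

def point_in_rect(p, rect):
    px, py = p
    x, y, w, h = rect
    return x <= px <= x + w and y <= py <= y + h

def legal_watermark_positions(positions, viewing_area, recompute):
    inside = [p for p in positions if point_in_rect(p, viewing_area)]
    if len(inside) == len(positions):
        # Every watermark would land in the viewing area: trigger a full
        # coordinate recalculation to obtain a new set of coordinates.
        return recompute()
    # Render only the watermarks that fall outside the viewing area.
    return [p for p in positions if p not in inside]
```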
Next, referring to fig. 13, fig. 13 is a schematic flow chart of watermark rendering of media information provided in an embodiment of the present application. Based on fig. 13, the watermark rendering method of media information provided in the embodiment of the present application may be implemented by executing the following steps.
Step 1: after the user clicks a video to play it, the client video play scheduling module sends a request for video information to the video background.
Step 2: after receiving the user request, the video background transmits the video information requested by the user to the client. This information at least includes a video playing address and watermark information; the watermark information may be a picture or text — a network address in the case of a picture, or a piece of text in the case of text information.
Step 3: the acquired video playing address is transmitted to the player core, and the acquired watermark information is transmitted to the watermark rendering module.
Step 4: after receiving the playing address, the player core starts the player, reads data from the server, decodes and renders the images and audio, and, once everything is ready, notifies the video play scheduling module to play the video.
Step 5: after the video starts to play, the eye tracking system is notified that the camera needs to be started and that image data (face images) should be collected at the preset frequency.
Step 6: the eye tracking system computes on the acquired camera data (face images) to obtain the user's viewing area. The area information corresponds to the content of the screen and is a rectangular area, i.e., coordinate values corresponding to the screen. It may be 4 specific coordinate values (the coordinates of the 4 corners of the rectangle), or one coordinate value plus a length and a width: for example, given the lower-left corner (x, y), a length of width, and a width of height, the other three corners can be calculated as the upper-left corner (x, y+height), the lower-right corner (x+width, y), and the upper-right corner (x+width, y+height). It should be noted that step 6 is a repeated process: camera acquisition and computation are performed once at regular intervals (which may be defined by the program and is generally not less than 1 second), so as to ensure that the acquired information is the latest user information, until video viewing ends.
Step 7: the data computed in step 6 is transmitted to the watermark rendering module.
Step 8: calculation is performed according to the transmitted user viewing area data. The calculation rule for watermark rendering needs to avoid this area — that is, the watermark cannot be rendered inside it — while ensuring that the watermark can be rendered outside the area, so that the user's normal viewing is guaranteed. This step is also a repeated process: watermark rendering is performed continuously, and redrawing is performed each time according to the transmitted user viewing area information.
By applying the embodiments of the present application, the region of interest of the target object on the media information is located, the watermark region outside the region of interest is determined, watermark rendering is performed on the media information displayed in the watermark region, and the watermark rendering result of the media information is updated when the region of interest changes. In this way, the content region watched by the target object is avoided in real time and the watermark is rendered at other positions, which ensures both that the watermark is rendered normally and that the watermark information does not interfere with the user's viewing.
The description below continues with an exemplary structure of the watermark rendering device 455 for media information provided in the embodiments of the present application, implemented as software modules. In some embodiments, as shown in fig. 2, the software modules of the watermark rendering device 455 for media information stored in the memory 440 may include:
the positioning module 4551 is configured to, when the terminal displays media information on a media information display interface, position a region of interest of a target object in the media information display interface;
a determining module 4552 configured to determine a watermark region of the media information based on the region of interest; the watermark area comprises other display areas except the concerned area in the media information display interface;
A rendering module 4553, configured to render a watermark on the media information displayed in the watermark region, and detect the region of interest;
and the updating module 4554 is configured to update a watermark rendering result of the media information according to the changed attention area when the detection result characterizes the change of the attention area.
In some embodiments, the positioning module 4551 is further configured to obtain a convergence point of the binocular vision line of the target object in the media information display interface; and drawing a target image according to the target length by taking the convergence point as the center point of the image, and taking the corresponding region of the target image in the media information display interface as the region of interest.
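As a hedged sketch of this positioning step (assuming a square target image and `(x, y, width, height)` rectangles; the embodiments do not fix either choice), the region of interest can be drawn around the convergence point and clipped to the display interface:

```python
# Draw a square of side `target_length` centred on the binocular convergence
# point, clipped to the bounds of the media information display interface.

def region_of_interest(convergence, target_length, interface):
    cx, cy = convergence
    ix, iy, iw, ih = interface
    half = target_length / 2
    left = max(ix, cx - half)
    bottom = max(iy, cy - half)
    right = min(ix + iw, cx + half)
    top = min(iy + ih, cy + half)
    return (left, bottom, right - left, top - bottom)
```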
In some embodiments, the rendering module 4553 is further configured to obtain a new region of interest of the target object for the media information after watermark rendering; comparing the new attention area with the attention area to obtain a first coincidence rate between the new attention area and the attention area; comparing the first coincidence rate with a first coincidence rate threshold value to obtain a comparison result; based on the comparison result, a detection result indicating whether the region of interest has changed is determined.
In some embodiments, the updating module 4554 is further configured to determine, when the detection result indicates that the region of interest has changed, a location of the changed region of interest in the media information display interface; adjusting the watermark area based on the changed position of the concerned area to obtain a new watermark area; and re-performing watermark rendering on the media information based on the new watermark region.
In some embodiments, when the detection result indicates that the region of interest is unchanged and the size of the media information display interface is changed, the apparatus further includes a first comparison module, where the first comparison module is configured to compare the media information display interface with the region of interest to obtain a second coincidence rate between the media information display interface and the region of interest; and comparing the second coincidence rate with a second coincidence rate threshold value, and performing watermark rendering on the media information displayed by the media information display interface when the second coincidence rate reaches the second coincidence rate threshold value.
In some embodiments, the first comparison module is further configured to determine a target watermark region based on the region of interest when the second coincidence rate is less than a second coincidence rate threshold, and watermark render the media information presented in the target watermark region.
In some embodiments, the determining module 4552 is further configured to obtain other regions of the media information presentation interface other than the region of interest; and determining the other area as the watermark area.
In some embodiments, the determining module 4552 is further configured to obtain other regions of the media information presentation interface other than the region of interest; comparing the area of the other areas with an area threshold; and when the comparison result shows that the area size of the other areas is larger than the area threshold, selecting part of the areas from the other areas as watermark areas.
In some embodiments, the determining module 4552 is further configured to obtain a size of the watermark to be rendered in the watermark region; and selecting at least one rectangular area matched with the watermark to be rendered from the other areas as the watermark area based on the size.
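A minimal illustration of this size-based selection (anchoring the watermark at the candidate area's corner is one simple placement choice, not one prescribed by the embodiments):

```python
# Pick a rectangle matching the watermark's size inside a candidate area,
# if one fits. Both rectangles are (x, y, width, height).

def fit_watermark(candidate, wm_width, wm_height):
    x, y, w, h = candidate
    if wm_width <= w and wm_height <= h:
        return (x, y, wm_width, wm_height)
    return None  # the watermark does not fit in this candidate area
```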
In some embodiments, the positioning module 4551 is further configured to acquire, by an image acquisition device, a face image of the target object; performing line-of-sight analysis on the face image to obtain a line-of-sight analysis result for indicating the line of sight of the target object; and based on the sight analysis result, determining the area in the media information display interface where the target object gazes as the attention area.
In some embodiments, the positioning module 4551 is further configured to perform feature extraction on the face image to obtain a line-of-sight vector and a head pose vector of the target object; based on the head posture vector, adjusting the sight line vector to obtain a target sight line vector for indicating the sight line of the target object, and taking the target sight line vector as the sight line analysis result; positioning a region where eyes of the target object look according to the sight line analysis result; and when the area is positioned on the media information display interface, determining the corresponding area in the media information display interface as the concerned area.
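The embodiments do not specify how the head pose vector adjusts the sight vector; as one heavily simplified reading, the sketch below rotates a 2-D sight vector by a head yaw angle (a stand-in for the full 3-D adjustment):

```python
import math

# Hypothetical 2-D simplification: adjust the sight vector by rotating it
# through the head's yaw angle to obtain the target sight vector.

def adjust_sight_vector(sight, head_yaw_rad):
    sx, sy = sight
    c, s = math.cos(head_yaw_rad), math.sin(head_yaw_rad)
    return (c * sx - s * sy, s * sx + c * sy)
```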
In some embodiments, the positioning module 4551 is further configured to collect, by an audio collection device, voice data of the target object; perform semantic analysis on the voice data to obtain a semantic analysis result, where the semantic analysis result includes a target text word used for indicating the gazing position of the target object on the media information display interface; and determine the region of interest of the target object in the media information display interface based on the position on the media information display interface indicated by the target text word in the semantic analysis result.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the electronic device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the electronic device executes the watermark rendering method of the media information according to the embodiment of the application.
The embodiments of the present application provide a computer readable storage medium having stored therein executable instructions that, when executed by a processor, cause the processor to perform a watermark rendering method of media information provided by the embodiments of the present application, for example, a watermark rendering method of media information as shown in fig. 3.
In some embodiments, the computer readable storage medium may be an FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disc, or CD-ROM, or any of various devices including one of, or any combination of, the above memories.
In some embodiments, the executable instructions may be in the form of programs, software modules, scripts, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and they may be deployed in any form, including as stand-alone programs or as modules, components, subroutines, or other units suitable for use in a computing environment.
As an example, the executable instructions may, but need not, correspond to files in a file system, and may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a hypertext markup language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
As an example, executable instructions may be deployed to be executed on one electronic device or on multiple electronic devices located at one site or, alternatively, on multiple electronic devices distributed across multiple sites and interconnected by a communication network.
In summary, the embodiment of the application has the following beneficial effects:
watermark rendering at other positions is performed while the content area watched by the target object is avoided in real time, so that normal rendering of the watermark is ensured and the watermark information does not interfere with the user's viewing.
The foregoing is merely exemplary embodiments of the present application and is not intended to limit the scope of the present application. Any modifications, equivalent substitutions, improvements, etc. that are within the spirit and scope of the present application are intended to be included within the scope of the present application.

Claims (15)

1. A method of watermark rendering of media information, the method comprising:
when a terminal displays media information on a media information display interface, positioning a focus area of a target object in the media information display interface;
determining a watermark region of the media information based on the region of interest; the watermark area comprises other display areas except the concerned area in the media information display interface;
watermark rendering is carried out on the media information displayed in the watermark area, and the concerned area is detected;
when the detection result represents that the concerned area changes, updating the watermark rendering result of the media information according to the changed concerned area.
2. The method of claim 1, wherein locating a region of interest of a target object in the media information presentation interface comprises:
acquiring convergence points of binocular vision of the target object in the media information display interface;
and drawing a target image according to the target length by taking the convergence point as the center point of the image, and taking the corresponding region of the target image in the media information display interface as the region of interest.
3. The method of claim 1, wherein the detecting the region of interest comprises:
acquiring a new attention area of the target object to the media information rendered by the watermark;
comparing the new attention area with the attention area to obtain a first coincidence rate between the new attention area and the attention area;
comparing the first coincidence rate with a first coincidence rate threshold value to obtain a comparison result;
based on the comparison result, a detection result indicating whether the region of interest has changed is determined.
4. The method of claim 1, wherein updating the watermark rendering result of the media information according to the changed region of interest when the detection result characterizes the change of the region of interest, comprises:
when the detection result represents that the concerned area changes, determining the position of the concerned area after the change in the media information display interface;
adjusting the watermark area based on the changed position of the concerned area to obtain a new watermark area;
and re-performing watermark rendering on the media information based on the new watermark region.
5. The method of claim 1, wherein when the detection result indicates that the region of interest has not changed and the size of the media information presentation interface has changed, the method further comprises, after detecting the region of interest:
comparing the media information display interface with the concerned area to obtain a second coincidence rate between the media information display interface and the concerned area;
and comparing the second coincidence rate with a second coincidence rate threshold value, and performing watermark rendering on the media information displayed by the media information display interface when the second coincidence rate reaches the second coincidence rate threshold value.
6. The method of claim 5, wherein the method further comprises:
and when the second coincidence rate is smaller than a second coincidence rate threshold value, determining a target watermark area based on the concerned area, and performing watermark rendering on the media information displayed in the target watermark area.
7. The method of claim 1, wherein the determining a watermark region of the media information based on the region of interest comprises:
acquiring other areas except the concerned area on the media information display interface;
and determining the other areas as the watermark area.
8. The method of claim 1, wherein the determining a watermark region of the media information based on the region of interest comprises:
acquiring other areas except the concerned area on the media information display interface;
comparing the area of the other areas with an area threshold;
and when the comparison result shows that the area size of the other areas is larger than the area threshold, selecting part of the areas from the other areas as watermark areas.
9. The method of claim 8, wherein selecting a partial region among the other regions as a watermark region comprises:
acquiring the size of the watermark to be rendered in the watermark region;
and selecting at least one rectangular area matched with the watermark to be rendered from the other areas as the watermark area based on the size.
10. The method of claim 1, wherein locating a region of interest of a target object in the media information presentation interface comprises:
acquiring a face image of the target object through image acquisition equipment;
performing line-of-sight analysis on the face image to obtain a line-of-sight analysis result for indicating the line of sight of the target object;
and based on the sight analysis result, determining the area in the media information display interface where the target object gazes as the attention area.
11. The method of claim 10, wherein the performing line-of-sight analysis on the face image to obtain a line-of-sight analysis result comprises:
extracting features of the face image to obtain a sight line vector and a head posture vector of the target object;
based on the head posture vector, adjusting the sight line vector to obtain a target sight line vector for indicating the sight line of the target object, and taking the target sight line vector as the sight line analysis result;
the determining, based on the line of sight analysis result, that the region in the media information presentation interface where the target object gazes is the region of interest includes:
positioning a region where eyes of the target object look according to the sight line analysis result;
and when the area is positioned on the media information display interface, determining the corresponding area in the media information display interface as the concerned area.
12. The method of claim 1, wherein locating a region of interest of a target object in the media information presentation interface comprises:
collecting voice data of the target object through an audio collection device;
carrying out semantic analysis on the voice data to obtain a semantic analysis result; the semantic analysis result comprises a target text word, wherein the target text word is used for indicating the gazing position of the target object on the media information display interface;
and determining a region of interest of the target object in the media information display interface based on the position of the target object on the media information display interface indicated by the target text word in the semantic analysis result.
13. A watermark rendering apparatus for media information, the apparatus comprising:
the positioning module is used for positioning a concerned area of a target object in the media information display interface when the terminal displays the media information on the media information display interface;
a determining module, configured to determine a watermark area of the media information based on the attention area; the watermark area comprises other display areas except the concerned area in the media information display interface;
The rendering module is used for performing watermark rendering on the media information displayed in the watermark area and detecting the concerned area;
and the updating module is used for updating the watermark rendering result of the media information according to the changed attention area when the detection result represents the change of the attention area.
14. An electronic device, comprising:
a memory for storing executable instructions;
a processor for implementing a watermark rendering method for media information according to any one of claims 1 to 12 when executing executable instructions stored in said memory.
15. A computer readable storage medium storing executable instructions for causing a processor to perform the watermark rendering method of media information according to any one of claims 1-12.
CN202210884952.9A 2022-07-26 2022-07-26 Watermark rendering method and device for media information, electronic equipment and storage medium Pending CN117494080A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210884952.9A CN117494080A (en) 2022-07-26 2022-07-26 Watermark rendering method and device for media information, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN117494080A true CN117494080A (en) 2024-02-02

Family

ID=89673149



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination