CN105611380A - Video image processing method and device - Google Patents
Video image processing method and device
- Publication number: CN105611380A
- Application number: CN201510981180.0A
- Authority: CN (China)
- Prior art keywords: video image, data, region, occluded
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/431—Generation of visual interfaces for content selection or interaction; Content or additional data rendering
- H04N21/4318—Generation of visual interfaces for content selection or interaction; Content or additional data rendering by altering the content in the rendering process, e.g. blanking, blurring or masking an image region
Abstract
The invention relates to a video image processing method and device, and belongs to the field of image processing. The method comprises the following steps: acquiring data of a video image; detecting an object to be occluded in the video image; when the object to be occluded in the video image is not detected, adopting a set region in the video image as the region to be occluded; blurring the data of the region to be occluded; and outputting the data of the blurred video image. The device comprises an acquisition module, a detection module, a determination module, a processing module and an output module. According to the video image processing method and device provided by the invention, when the object to be occluded in the video image cannot be detected, a set region in the video image is adopted as the region to be occluded and the data in that region are blurred, so the object to be occluded within the region is covered; this guarantees that the object to be occluded is never exposed during the whole course of video playback.
Description
Technical Field
The present disclosure relates to the field of image processing, and in particular, to a method and an apparatus for processing a video image.
Background
With the popularization of smart televisions and set-top boxes, users can watch videos on a smart television more conveniently. In China, the supervision department does not allow frames that violate the regulations, such as the station logos (Logo) of video websites, to be displayed on a smart television and seen by users. To block such a part of the picture, that content must first be detected in the video image and then covered; however, for some frames in a video the content that needs to be blocked may not be detected, so the content remains exposed when those frames are played.
Disclosure of Invention
To overcome the problems in the related art, the present disclosure provides a video image processing method and apparatus.
In one aspect, a video image processing method is provided, and the method includes:
acquiring data of a video image;
detecting an object to be occluded in the video image by using the data of the video image;
when the object to be occluded in the video image is not detected, adopting a set region in the video image as the region to be occluded;
blurring the data of the region to be occluded;
and outputting the data of the blurred video image.
In an implementation manner of the embodiment of the present disclosure, the method further includes:
and when the object to be occluded in the video image is detected, taking the region where the object to be occluded is located as the region to be occluded.
In another implementation manner of the embodiment of the present disclosure, the object to be occluded includes a station logo of a video website or a station logo of a television station.
Further, the adopting of a set region in the video image as the region to be occluded includes:
adopting a rectangular region in the upper left corner or the upper right corner of the video image as the region to be occluded, wherein the ratio of the width of the rectangular region to the width of the video image ranges from one quarter to one third, and the ratio of the height of the rectangular region to the height of the video image ranges from one quarter to one third.
In an implementation manner of the embodiment of the present disclosure, the blurring of the data of the region to be occluded includes:
blurring the data of the region to be occluded by using a Gaussian blur algorithm; or,
replacing the data of every row of pixels in the region to be occluded with the data of one row of pixels in the region to be occluded.
In another aspect, there is provided a video image processing apparatus, the apparatus comprising:
the acquisition module is used for acquiring data of the video image;
the detection module is used for detecting the object to be occluded in the video image by using the data acquired by the acquisition module;
the determination module is used for adopting a set region in the video image as the region to be occluded when the detection module does not detect the object to be occluded in the video image;
the processing module is used for blurring the data of the region to be occluded determined by the determination module;
and the output module is used for outputting the data of the video image blurred by the processing module.
In an implementation manner of the embodiment of the present disclosure, the determination module is further configured to, when the object to be occluded in the video image is detected, take the region where the object to be occluded is located as the region to be occluded.
In another implementation manner of the embodiment of the present disclosure, the object to be occluded includes a station logo of a video website or a station logo of a television station.
Further, the determination module is configured to use a rectangular region in the upper left corner or the upper right corner of the video image as the region to be occluded, wherein the ratio of the width of the rectangular region to the width of the video image ranges from one quarter to one third, and the ratio of the height of the rectangular region to the height of the video image ranges from one quarter to one third.
In an implementation manner of the embodiment of the present disclosure, the processing module is configured to blur the data of the region to be occluded by using a Gaussian blur algorithm; or,
to replace the data of every row of pixels in the region to be occluded with the data of one row of pixels in the region to be occluded.
In yet another aspect, there is provided a video image processing apparatus, the apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
acquiring data of a video image;
detecting an object to be occluded in the video image by using the data of the video image;
when the object to be occluded in the video image is not detected, adopting a set region in the video image as the region to be occluded;
blurring the data of the region to be occluded;
and outputting the data of the blurred video image.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
when the object to be occluded in the video image cannot be detected, a set region in the video image is used as the region to be occluded and the data in that region are blurred, so the object to be occluded within the region is covered and cannot be exposed at any point during video playback.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a schematic diagram of an application scenario of a video image processing method according to an exemplary embodiment;
FIG. 2 is a flow diagram illustrating a video image processing method according to an exemplary embodiment;
FIG. 3 is a flow diagram illustrating another method of video image processing according to an exemplary embodiment;
FIG. 3a is a pictorial display diagram of a video image shown in accordance with an exemplary embodiment;
FIG. 4 is a block diagram illustrating a video image processing apparatus according to an exemplary embodiment;
FIG. 5 is a block diagram illustrating another video image processing apparatus according to an exemplary embodiment;
FIG. 6 is a block diagram illustrating another video image processing apparatus according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. The following description refers to the accompanying drawings, in which the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations set forth in the following description of exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with certain aspects of the invention, as detailed in the appended claims.
Fig. 1 is a schematic diagram of an application scenario of a video image processing method according to an exemplary embodiment. As shown in fig. 1, the television 10 is connected to the server 30 through the set-top box 20. The user 50 may send a control instruction, such as a program playing instruction, to the television 10 through a control device such as a remote controller 40. After receiving the program playing instruction, the television 10 forwards it to the server 30 through the set-top box 20; the server 30 then sends the video image data corresponding to the instruction to the set-top box 20, which forwards the data to the television 10 for playing.
Fig. 2 is a flow diagram illustrating a video image processing method according to an exemplary embodiment. The method can be applied to a set-top box, a television or a server, and comprises the following steps as shown in fig. 2.
In step 201, data of a video image is acquired.
The data of the video image is usually YUV format data; the YUV format includes, but is not limited to, the YUV444 interleaved format, the YVYU format, YUV420P, the YUYV format, and the like. The YUV format data may be stored byte-aligned, for example 8-byte or 16-byte aligned.
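As a rough illustration of how chroma subsampling and byte alignment shape such a buffer, here is a minimal sketch; the 16-byte alignment and the frame size in the example are illustrative assumptions, not values fixed by this disclosure:

```python
def yuv420p_plane_sizes(width, height, align=16):
    """Per-plane byte sizes for YUV420P with row strides padded to `align` bytes.

    In YUV420P the Y plane is full resolution while U and V are subsampled
    by 2 both horizontally and vertically; rows are often padded so that
    each one starts on an 8- or 16-byte boundary.
    """
    def aligned(n):
        return (n + align - 1) // align * align  # round n up to a multiple of align

    y_stride, c_stride = aligned(width), aligned(width // 2)
    return {"Y": y_stride * height,
            "U": c_stride * (height // 2),
            "V": c_stride * (height // 2)}

# e.g. a 1920x1080 frame with 16-byte-aligned rows
print(yuv420p_plane_sizes(1920, 1080))  # {'Y': 2073600, 'U': 518400, 'V': 518400}
```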
In step 202, data of the video image is used to detect an object to be occluded in the video image.
The object to be occluded includes, but is not limited to, a station logo of a video website, a station logo of a television station, a trademark of a product, and the like.
In step 203, when the object to be occluded in the video image is not detected, a set region in the video image is adopted as the region to be occluded.
The set region is determined according to the position where the object to be occluded usually appears in the video image, so that the object to be occluded falls within the set region.
In step 204, the data of the determined region to be occluded is subjected to blurring processing.
Optionally, this step 204 may include:
blurring the data of the region to be occluded by using a Gaussian blur algorithm; or,
replacing the data of every row of pixels in the region to be occluded with the data of one row of pixels in the region to be occluded.
In step 205, the blurred video image data is output.
When the method of this embodiment is applied to a server, this step 205 may include: sending the blurred video image data to a set-top box;
when the method of this embodiment is applied to a set-top box, this step 205 may include: sending the blurred video image data to a television;
when the method of this embodiment is applied to a television, this step 205 may include: sending the blurred video image data to the rendering module of the television.
According to the embodiments of the present disclosure, when the object to be occluded in the video image cannot be detected, a set region in the video image is used as the region to be occluded and the data in that region are blurred, so the object to be occluded within the region is covered and cannot be exposed at any point during video playback.
Fig. 3 is a flow chart illustrating another video image processing method according to an exemplary embodiment. The method is applied to a set-top box, a television or a server; in this embodiment, the disclosure is described by taking as an example that the content to be blocked is the station logo of a video website or a television station. As shown in fig. 3, the method includes the following steps.
In step 301, data of a video image is acquired.
The data of the video image is usually YUV format data; the YUV format includes the YUV444 interleaved format, the YVYU format, YUV420P, the YUYV format, and the like. The YUV format data may be stored byte-aligned, for example 8-byte or 16-byte aligned.
In step 302, the data of the video image is used to detect the station logo in the video image; when the station logo is detected, step 303a is executed, and when the station logo is not detected, step 303b is executed.
A station logo detection algorithm can be used to detect and locate the station logo in the video image. It may be a station logo detection algorithm based on multiple frames of video images or one based on a single frame image.
The station logo detection algorithm based on multiple frames of video images may include the following steps:
extracting key frames of the video;
taking the first extracted key frame as a reference frame, performing inter-frame difference processing between it and the key frames after it, recording the difference value for each pixel, and computing the cumulative sum of the difference values for each pixel;
and determining the pixels whose cumulative sum is smaller than a set threshold as pixels of the station logo.
It should be noted that, to reduce the amount of computation, the key frame images may first be converted to grayscale images before the inter-frame difference processing is performed.
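A minimal sketch of this accumulation, assuming the key frames have already been decoded and converted to equally sized grayscale NumPy arrays and that the threshold is chosen by the caller:

```python
import numpy as np

def static_logo_mask(key_frames, threshold):
    """Return a boolean mask of candidate station-logo pixels.

    The first key frame is the reference; the absolute inter-frame
    difference against it is accumulated per pixel, and pixels whose
    cumulative sum stays below `threshold` barely change across frames
    and are treated as static, i.e. as likely logo pixels.
    """
    reference = key_frames[0].astype(np.int32)
    cumulative = np.zeros_like(reference)
    for frame in key_frames[1:]:
        cumulative += np.abs(frame.astype(np.int32) - reference)
    return cumulative < threshold
```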
When the station logo detection algorithm based on multiple frames of video images is adopted, the station logo cannot yet be detected for at least the first frames of the video, and step 303b is executed for those frames.
The multi-frame algorithm exploits the spatio-temporal invariance of the station logo, i.e. the position, color, size and so on of the logo are fixed across the frames, so other objects to be occluded with these characteristics can also be located with an algorithm based on multiple frames of video images.
The station logo detection algorithm based on a single frame image may include the following steps:
obtaining a station logo template;
sliding a detection window over the single frame video image to obtain a plurality of images to be detected;
respectively calculating the similarity between each obtained image to be detected and the station logo template;
when there is an image to be detected whose similarity is greater than a set value, a station logo is detected in the video image; when there is no image to be detected whose similarity is greater than the set value, no station logo is detected in the video image.
The calculation of the similarity between an image to be detected and the station logo template may include:
determining a first feature value of the station logo template;
determining a second feature value of the image to be detected;
and calculating the distance (i.e., the similarity) between the station logo template and the image to be detected using the first feature value and the second feature value. The distance includes, but is not limited to, the Euclidean distance, the cosine distance, etc.
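As an illustrative sketch of the sliding-window comparison, the following uses raw pixel intensities as the feature values and cosine similarity as the distance; the stride and threshold are assumptions, not values given by this disclosure:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two flattened feature vectors."""
    a, b = a.ravel().astype(np.float64), b.ravel().astype(np.float64)
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def match_logo_template(frame_gray, template, stride=4, threshold=0.95):
    """Slide a detection window over a single grayscale frame and compare
    each candidate patch with the logo template; return the top-left
    corner of the best match above `threshold`, or None if no patch is
    similar enough (i.e. no logo is detected in this frame)."""
    th, tw = template.shape
    best_pos, best_sim = None, threshold
    for y in range(0, frame_gray.shape[0] - th + 1, stride):
        for x in range(0, frame_gray.shape[1] - tw + 1, stride):
            sim = cosine_similarity(frame_gray[y:y + th, x:x + tw], template)
            if sim > best_sim:
                best_pos, best_sim = (x, y), sim
    return best_pos
```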
The station logo detection algorithm based on a single frame video image searches the video image for the station logo and can also be used to search for other objects to be occluded, such as the trademark of a product.
It should be noted that the station logo detection algorithms in this embodiment are only examples; other existing station logo detection algorithms may also be used, which is not limited by this disclosure.
In step 303a, the region where the object to be occluded is located is taken as the region to be occluded.
For the station logo detection algorithm based on multiple frames of video images in step 302, this step 303a may include: determining the region to be occluded in the video image according to the coordinates of the detected station logo pixels.
In one implementation manner, determining the region to be occluded in the video image according to the coordinates of the detected station logo pixels may include:
determining the top-left vertex coordinates of the region to be occluded and the height and width of the region from the minimum abscissa, the maximum abscissa, the minimum ordinate and the maximum ordinate among the coordinates of the station logo pixels;
and determining the region to be occluded in the video image from the top-left vertex coordinates of the region and its width and height.
In this implementation, the minimum abscissa and the maximum ordinate among the coordinates of the station logo pixels form the top-left vertex of the region to be occluded, the difference between the maximum abscissa and the minimum abscissa gives the width of the region, and the difference between the maximum ordinate and the minimum ordinate gives its height.
In another implementation manner, determining the region to be occluded in the video image according to the coordinates of the detected station logo pixels may include:
determining the coordinates of the four vertices of the region to be occluded from the minimum abscissa, the maximum abscissa, the minimum ordinate and the maximum ordinate among the coordinates of the station logo pixels;
and determining the region to be occluded in the video image from the coordinates of the four vertices.
The coordinates of the four vertices of the region to be occluded can be determined from the minimum abscissa, the maximum abscissa, the minimum ordinate and the maximum ordinate among the coordinates of the station logo pixels in the following two ways.
In the first way, the minimum abscissa and the maximum ordinate among the coordinates of the station logo pixels are taken as the top-left vertex of the region to be occluded, the minimum abscissa and the minimum ordinate as the bottom-left vertex, the maximum abscissa and the maximum ordinate as the top-right vertex, and the maximum abscissa and the minimum ordinate as the bottom-right vertex.
In the second way, the minimum abscissa among the coordinates of the station logo pixels is first decreased by a first set value to obtain a new minimum abscissa; the maximum abscissa is increased by a second set value to obtain a new maximum abscissa; the minimum ordinate is decreased by a third set value to obtain a new minimum ordinate; and the maximum ordinate is increased by a fourth set value to obtain a new maximum ordinate. The first, second, third and fourth set values can be set according to actual needs, for example 1 to 5 pixels, preferably 1 to 2 pixels.
Then, the new minimum abscissa and the new maximum ordinate are taken as the top-left vertex of the region to be occluded, the new minimum abscissa and the new minimum ordinate as the bottom-left vertex, the new maximum abscissa and the new maximum ordinate as the top-right vertex, and the new maximum abscissa and the new minimum ordinate as the bottom-right vertex.
The region to be occluded determined in the second way is larger than that determined in the first way, leaving a margin around the detected logo.
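A sketch of both representations, with the second way's padding included; note it assumes the common image convention in which the ordinate grows downward (so the top-left vertex is the minimum abscissa and minimum ordinate), whereas the text above states the vertices with the ordinate growing upward. The pad of 2 pixels is one of the preferred values mentioned above:

```python
def region_from_logo_pixels(xs, ys, pad=2, frame_w=None, frame_h=None):
    """Derive the region to be occluded from the coordinates of the
    detected station-logo pixels (`xs`, `ys` are parallel sequences).

    Expands the tight bounding box by `pad` pixels on every side (the
    second way above); pass the frame size to clamp the result.
    Returns (x, y, w, h): top-left vertex plus width and height.
    """
    x0, x1 = min(xs) - pad, max(xs) + pad
    y0, y1 = min(ys) - pad, max(ys) + pad
    if frame_w is not None:
        x0, x1 = max(x0, 0), min(x1, frame_w - 1)
    if frame_h is not None:
        y0, y1 = max(y0, 0), min(y1, frame_h - 1)
    return x0, y0, x1 - x0 + 1, y1 - y0 + 1
```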
For the station logo detection algorithm based on a single frame video image of step 302, this step 303a may include:
taking the region corresponding to the image to be detected whose similarity is greater than the set value as the region to be occluded.
In step 303b, a set region in the video image is used as the region to be occluded.
The set region is determined according to the position where the object to be occluded usually appears in the video image, so that the object to be occluded falls within the set region.
In this embodiment, the object to be occluded is a station logo, which is usually located in the upper left corner or the upper right corner of the video image, so step 303b may include:
using a rectangular region in the upper left corner or the upper right corner of the video image as the region to be occluded, wherein the ratio of the width of the rectangular region to the width of the video image ranges from one quarter to one third, and the ratio of the height of the rectangular region to the height of the video image ranges from one quarter to one third.
Preferably, the width of the rectangular region may be one quarter of the width of the video image, and the height of the rectangular region may likewise be one quarter of the height of the video image.
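A minimal sketch of this fallback, assuming an (x, y, w, h) rectangle with the origin at the frame's top-left corner and the preferred one-quarter ratio as the default:

```python
def default_logo_region(frame_w, frame_h, corner="top_left", frac=0.25):
    """Fallback region to be occluded when no logo is detected: a
    rectangle covering `frac` (one quarter to one third) of the frame's
    width and height in the chosen top corner."""
    w, h = int(frame_w * frac), int(frame_h * frac)
    x = 0 if corner == "top_left" else frame_w - w
    return x, 0, w, h
```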
In other embodiments, the object to be occluded may be the trademark of a product. For example, some beauty-category program videos include cosmetic products whose trademarks need to be blocked; such products are placed in a fixed area of the table most of the time and their positions do not change across the frames, so the positions of their trademarks can be determined from the positions of the products, and the set region can be preset according to the positions of the trademarks.
The determination of the region to be occluded in the video image is thus realized through steps 302, 303a and 303b.
In step 304, the data of the region to be occluded is blurred.
In a first implementation manner of this step 304, a Gaussian blur algorithm may be used to blur the data of the region to be occluded. The Gaussian blur algorithm can be one-dimensional or two-dimensional.
When a one-dimensional Gaussian blur algorithm is used to blur the data of the region to be occluded, a one-dimensional Gaussian pass in a first direction is generally performed first, followed by a one-dimensional Gaussian pass in a second direction perpendicular to the first; for example, the first direction is the transverse direction of the video image and the second direction the longitudinal direction, or the first direction is the longitudinal direction and the second direction the transverse direction.
Here, the transverse direction is the left-right direction of the video image as seen by a user watching the video normally (i.e. the width direction of the video image, the arrow x direction in fig. 3a), generally the horizontal direction for a television; the longitudinal direction is the up-down direction of the video image as seen by the user (i.e. the height direction, the arrow y direction in fig. 3a), generally the vertical direction for a television.
Blurring the region to be occluded with a one-dimensional Gaussian pass in the first direction followed by one in the second direction has the same effect as a single two-dimensional Gaussian pass, but the two-dimensional algorithm is more complex; the one-dimensional passes blur the data of the region to be occluded in less time and more efficiently, so the one-dimensional Gaussian blur algorithm is preferably used in practice.
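A sketch of the separable two-pass blur on a single channel; the kernel radius and sigma are illustrative assumptions, and zero padding at the region borders keeps the sketch short:

```python
import numpy as np

def gaussian_kernel_1d(radius=5, sigma=2.0):
    """Normalized one-dimensional Gaussian kernel of length 2*radius+1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x * x) / (2.0 * sigma * sigma))
    return k / k.sum()

def blur_region_separable(channel, x, y, w, h, radius=5, sigma=2.0):
    """Blur the (x, y, w, h) region of one 2-D channel with two 1-D
    Gaussian passes, first along rows (first direction), then along
    columns (second direction) -- equivalent to one 2-D Gaussian pass
    but roughly O(2r) instead of O(r^2) work per pixel."""
    k = gaussian_kernel_1d(radius, sigma)
    region = channel[y:y + h, x:x + w].astype(np.float64)
    region = np.apply_along_axis(np.convolve, 1, region, k, mode="same")  # horizontal pass
    region = np.apply_along_axis(np.convolve, 0, region, k, mode="same")  # vertical pass
    channel[y:y + h, x:x + w] = region.round().clip(0, 255).astype(channel.dtype)
    return channel
```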
In practice, this step 304 may include:
converting the data of the pixels of the region to be occluded from YUV format data into RGB888 format data;
blurring the data of the R, G and B channels with a Gaussian blur algorithm;
and converting the blurred RGB888 format data back into YUV format data.
Optionally, if the data of the video image acquired in step 301 is not in the YUV444 interleaved format, converting the pixel data of the region to be occluded from YUV format data into RGB888 format data may include:
converting the pixel data of the region to be occluded from its YUV format into YUV444 interleaved format data;
and converting the YUV444 interleaved format data into RGB888 format data.
In the YUV444 interleaved format, three consecutive bytes correspond to the Y, U and V channels respectively; in the RGB888 format, three consecutive bytes correspond to the R, G and B channels respectively. The Y, U and V channels of the YUV444 interleaved format thus map to the R, G and B channels of the RGB888 format, and the conversion from YUV444 interleaved format data to RGB888 format data is realized through a matrix operation.
Preferably, the conversion from YUV444 interleaved format data to RGB888 format data may be implemented with the Shader language of the Open Graphics Library (OpenGL). Because the YUV444 interleaved format data and the RGB888 format data are both three-channel data, the conversion can be parallelized with a Shader in OpenGL, reducing the processing time for each frame of image.
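The disclosure does not pin down the conversion coefficients; as one common choice, here is a sketch using the full-range BT.601 matrix on the CPU (a GPU shader version would apply the same arithmetic per fragment):

```python
import numpy as np

def yuv444_to_rgb888(yuv):
    """Convert an H x W x 3 interleaved YUV444 uint8 array to RGB888.

    Full-range BT.601 coefficients are assumed here; U and V are
    centered on 128 before the matrix is applied."""
    y = yuv[..., 0].astype(np.float64)
    u = yuv[..., 1].astype(np.float64) - 128.0
    v = yuv[..., 2].astype(np.float64) - 128.0
    r = y + 1.402 * v
    g = y - 0.344136 * u - 0.714136 * v
    b = y + 1.772 * u
    return np.stack([r, g, b], axis=-1).round().clip(0, 255).astype(np.uint8)
```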
In a second implementation manner of step 304, the data of one row of pixels in the region to be occluded may be copied over the data of every row of pixels in the region, yielding the replaced data of the video image.
The row of pixels in the region to be occluded may run horizontally (i.e. a row) or vertically (i.e. a column).
As shown in fig. 3a, in the embodiments of the present disclosure a row runs along the left-right direction of the video image as seen by a user watching the video normally (i.e. the width direction of the video image, the arrow x direction in fig. 3a), generally the horizontal direction for a television; a column runs along the up-down direction of the video image (i.e. the height direction, the arrow y direction in fig. 3a), generally the vertical direction for a television.
In practice, the data of the outermost row of pixels in the region to be occluded is preferably used to replace the data of every row of pixels in the region; or,
the data of the outermost column of pixels in the region to be occluded is used to replace the data of every column of pixels in the region.
Because some station logos contain black parts such as characters, replacing the other rows or columns with the data of a row or column from the middle of the region to be occluded may produce a large black patch, which stands out in the image and harms the display effect. Using the data of the outermost row or column of the region instead avoids this situation and improves the display effect of the image.
More preferably, the data of every row of pixels in the region to be occluded can be replaced with the data of the last row of pixels in the region. The last row is the row of pixels at the lowest position of the region to be occluded along the y direction in fig. 3a; taking the dashed rectangle in fig. 3a as the region to be occluded, the last row is the row of pixels on the lower side 21a of the rectangle. Because the station logo is usually located in the upper half of the image, such as the upper left or upper right corner, the color of the last row of pixels is very likely close to the color of the background image, so replacing the data of every row in the region with the data of the last row raises the probability that the occluded region blends into the background.
Of course, the data of every row of pixels in the region to be occluded may also be replaced with the data of a row of pixels from the middle of the region, or the data of every column of pixels may be replaced with the data of a column of pixels from the middle of the region.
Blurring the image frame by replacing pixel data in this way is simple and efficient; the processing time for each frame of video image basically does not exceed 10 ms, and the normal playing of the video is not affected.
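A minimal sketch of the preferred last-row variant, assuming the frame is an H x W x C NumPy array and the region is given as (x, y, w, h):

```python
def occlude_with_last_row(frame, x, y, w, h):
    """Copy the region's bottom row of pixels over every other row of
    the (x, y, w, h) region in place; NumPy broadcasting spreads the
    single row across the region in one assignment."""
    frame[y:y + h - 1, x:x + w] = frame[y + h - 1, x:x + w]
    return frame
```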
In step 305, the data of the video image after the blurring process is output.
When the method of this embodiment is applied to a server, this step 305 may include: sending the blurred video image data to a set-top box;
when the method of this embodiment is applied to a set-top box, this step 305 may include: sending the blurred video image data to a television;
when the method of this embodiment is applied to a television, this step 305 may include: sending the blurred video image data to the rendering module of the television.
It should be noted that, when the content to be occluded is the station logo of a video website or a television station, the position of the logo usually does not change within a video. So after the region to be occluded is determined through steps 302 and 303a, the parameters of the region (for example, the vertex coordinates of the region described above) may be stored; for the data of each subsequently acquired frame of video image, the region to be occluded is determined directly from the stored parameters and its data extracted, after which steps 304 and 305 blur the data and output the blurred video image data. When the object to be occluded is the trademark of a product, its position is not fixed, so the region to be occluded must be determined separately for each frame image.
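A sketch of this reuse; `detect_logo_region` is a hypothetical stand-in for the detection of steps 302 and 303a/303b, not an API defined by this disclosure:

```python
_cached_region = None  # (x, y, w, h) once the station logo has been located

def region_for_frame(frame):
    """Detect the station logo once and reuse the stored region
    parameters for every later frame, since a station logo's position
    is fixed throughout a video; a movable mark such as a product
    trademark would instead be re-detected for every frame."""
    global _cached_region
    if _cached_region is None:
        _cached_region = detect_logo_region(frame)  # hypothetical detector
    return _cached_region
```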
According to the embodiments of the present disclosure, when the object to be occluded in the video image cannot be detected, a set region in the video image is used as the region to be occluded and the data in that region are blurred, so the object to be occluded within the region is covered and cannot be exposed at any point during video playback.
Fig. 4 is a block diagram illustrating a video image processing apparatus according to an exemplary embodiment. As shown in fig. 4, the apparatus may include: an acquisition module 401, a detection module 402, a determination module 403, a processing module 404 and an output module 405.
The acquisition module 401 is configured to acquire data of a video image. The detection module 402 is configured to detect an object to be occluded in the video image by using the data acquired by the acquisition module 401. The determination module 403 is configured to adopt a set region in the video image as the region to be occluded when the detection module 402 does not detect the object to be occluded in the video image. The processing module 404 is configured to blur the data of the region to be occluded determined by the determination module 403. The output module 405 is configured to output the data of the video image blurred by the processing module 404.
The object to be occluded includes, but is not limited to, a station logo of a video website, a station logo of a television station, or a trademark of a product.
According to the embodiments of the present disclosure, when the object to be occluded in the video image cannot be detected, a set region in the video image is used as the region to be occluded and the data in that region are blurred, so the object to be occluded within the region is covered and cannot be exposed at any point during video playback.
Fig. 5 is a block diagram illustrating a video image processing apparatus according to an exemplary embodiment. As shown in fig. 5, the apparatus may include: an acquisition module 501, a detection module 502, a determination module 503, a processing module 504 and an output module 505.
The acquisition module 501 is configured to acquire data of a video image. The detection module 502 is configured to detect an object to be occluded in the video image by using the data acquired by the acquisition module 501. The determination module 503 is configured to adopt a set region in the video image as the region to be occluded when the detection module 502 does not detect the object to be occluded in the video image. The processing module 504 is configured to blur the data of the region to be occluded determined by the determination module 503. The output module 505 is configured to output the data of the video image blurred by the processing module 504.
The object to be occluded includes, but is not limited to, a station logo of a video website, a station logo of a television station, or a trademark of a product.
The data of the video image acquired by the acquisition module 501 may be YUV format data; the YUV format includes the YUV444 interleaved format, the YVYU format, YUV420P, the YUYV format, and the like. The YUV format data may be stored byte-aligned, for example 8-byte or 16-byte aligned.
The manner in which the detection module 502 detects the object to be occluded is described in detail in step 302 and is not repeated here.
Further, the determination module 503 is configured to use a rectangular region in the upper left corner or the upper right corner of the video image as the region to be occluded, wherein the ratio of the width of the rectangular region to the width of the video image ranges from one quarter to one third, and the ratio of the height of the rectangular region to the height of the video image ranges from one quarter to one third.
Furthermore, the determination module 503 is further configured to, when the detection module 502 detects the object to be occluded in the video image, take the region where the object to be occluded is located as the region to be occluded.
The manner in which the determination module 503 determines the region to be occluded is described in detail in steps 303a and 303b and is not repeated here.
Optionally, the processing module 504 is configured to blur the data of the region to be occluded by using a Gaussian blur algorithm; or,
to replace the data of every row of pixels in the region to be occluded with the data of one row of pixels in the region to be occluded.
The manner in which the processing module 504 blurs the data of the region to be occluded is described in detail in step 304 and is not repeated here.
When the apparatus of this embodiment is applied to a server, the output module 505 is configured to send the blurred video image data to a set-top box;
when the apparatus of this embodiment is applied to a set-top box, the output module 505 is configured to send the blurred video image data to a television;
when the apparatus of this embodiment is applied to a television, the output module 505 is configured to send the blurred video image data to the rendering module of the television.
According to the embodiments of the present disclosure, when the object to be occluded in the video image cannot be detected, a set region in the video image is used as the region to be occluded and the data in that region are blurred, so the object to be occluded within the region is covered and cannot be exposed at any point during video playback.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 6 is a block diagram illustrating a video image processing apparatus 600 according to an exemplary embodiment. For example, the apparatus 600 may be a set-top box, a television, or a server, etc.
Referring to fig. 6, the apparatus 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 generally controls the overall operation of the apparatus 600, such as operations associated with display. The processing component 602 may include one or more processors 620 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 602 may include one or more modules that facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602. The processing component 602 may also include a GPU for processing graphics data, and the like.
The memory 604 is configured to store various types of data to support operations at the apparatus 600. Examples of such data include instructions for any application or method operating on device 600, pictures, videos, and so forth. The memory 604 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power component 606 provides power to the various components of device 600. Power components 606 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for device 600.
The multimedia component 608 includes a screen that provides an output interface between the device 600 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 608 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 600 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 610 is configured to output and/or input audio signals. For example, audio component 610 includes a Microphone (MIC) configured to receive external audio signals when apparatus 600 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in the memory 604 or transmitted via the communication component 616. In some embodiments, audio component 610 further includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 614 includes one or more sensors for providing status assessments of various aspects of the apparatus 600. For example, the sensor component 614 may detect the open/closed state of the device 600 and the relative positioning of components such as its display and keypad; it may also detect a change in position of the device 600 or of one of its components, the presence or absence of user contact with the device 600, the orientation or acceleration/deceleration of the device 600, and a change in its temperature. The sensor component 614 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate communications between the apparatus 600 and other devices in a wired or wireless manner. The apparatus 600 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 616 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 616 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 600 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as the memory 604 comprising instructions, executable by the processor 620 of the apparatus 600 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer readable storage medium, in which instructions, when executed by a processor of a set-top box, television, or server, enable the set-top box, television, or server to perform a video image processing method, the method comprising:
acquiring data of a video image;
detecting an object to be occluded in the video image by using the data of the video image;
when the object to be occluded in the video image is not detected, adopting a set region in the video image as the region to be occluded;
blurring the data of the region to be occluded;
and outputting the data of the blurred video image.
In an implementation manner of the embodiment of the present disclosure, the method further includes:
and when the object to be occluded in the video image is detected, taking the region where the object to be occluded is located as the region to be occluded.
In another implementation manner of the embodiment of the present disclosure, the object to be occluded includes a station logo of a video website or a station logo of a television station.
Further, the adopting of a set region in the video image as the region to be occluded includes:
using a rectangular region in the upper left corner or the upper right corner of the video image as the region to be occluded, wherein the ratio of the width of the rectangular region to the width of the video image ranges from one quarter to one third, and the ratio of the height of the rectangular region to the height of the video image ranges from one quarter to one third.
In an implementation manner of the embodiment of the present disclosure, the blurring of the data of the region to be occluded includes:
blurring the data of the region to be occluded by using a Gaussian blur algorithm; or,
replacing the data of every row of pixels in the region to be occluded with the data of one row of pixels in the region to be occluded.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.
Claims (11)
1. A method for video image processing, the method comprising:
acquiring data of a video image;
detecting an object to be occluded in the video image by using the data of the video image;
when the object to be occluded in the video image is not detected, adopting a set region in the video image as the region to be occluded;
blurring the data of the region to be occluded;
and outputting the data of the blurred video image.
2. The method of claim 1, further comprising:
and when the object to be occluded in the video image is detected, taking the region where the object to be occluded is located as the region to be occluded.
3. The method of claim 2, wherein the object to be occluded comprises a station logo of a video website or a station logo of a television station.
4. The method according to claim 3, wherein the adopting of a set region in the video image as the region to be occluded comprises:
adopting a rectangular region in the upper left corner or the upper right corner of the video image as the region to be occluded, wherein the ratio of the width of the rectangular region to the width of the video image ranges from one quarter to one third, and the ratio of the height of the rectangular region to the height of the video image ranges from one quarter to one third.
5. The method according to any one of claims 1 to 4, wherein the blurring of the data of the region to be occluded comprises:
blurring the data of the region to be occluded by using a Gaussian blur algorithm; or,
replacing the data of every row of pixels in the region to be occluded with the data of one row of pixels in the region to be occluded.
6. A video image processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring data of the video image;
the detection module is used for detecting the object to be shielded in the video image by adopting the data acquired by the acquisition module;
the determining module is used for adopting a set area in the video image as an area to be occluded when the detecting module does not detect the object to be occluded in the video image;
the processing module is used for carrying out fuzzy processing on the data of the region to be shielded determined by the determining module;
and the output module is used for outputting the data of the video image after the fuzzy processing of the processing module.
7. The apparatus according to claim 6, wherein the determining module is further configured to, when an object to be occluded in the video image is detected, regard an area where the object to be occluded is located as the area to be occluded.
8. The apparatus of claim 7, wherein the object to be occluded comprises a station logo of a video website or a station logo of a television station.
9. The apparatus according to any one of claims 6 to 8, wherein the determining module is configured to use a rectangular region in an upper left corner or an upper right corner of the video image as the region to be occluded, a ratio of a width of the rectangular region to a width of the video image ranges from one quarter to one third, and a ratio of a height of the rectangular region to a height of the video image ranges from one quarter to one third.
10. The device according to claims 6-8, wherein the processing module is configured to perform a blurring process on the data of the region to be occluded by using a gaussian blurring algorithm; or,
and replacing the data of each row of pixel points in the area to be shielded by adopting the data of one row of pixel points in the area to be shielded.
11. An apparatus for video image processing, the apparatus comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
acquire data of a video image;
detect an object to be occluded in the video image using the data of the video image;
when the object to be occluded in the video image is not detected, use a set area in the video image as an area to be occluded;
blur the data of the area to be occluded;
and output the data of the video image after the blurring.
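Claims 6 to 11 describe the same acquire-detect-fallback-blur-output pipeline, first as modules and then as processor instructions. A minimal end-to-end sketch wiring together the hypothetical helpers above; the input file, logo template, and display-based output are placeholders:

```python
import cv2

cap = cv2.VideoCapture("input.mp4")        # placeholder video source
logo = cv2.imread("station_logo.png")      # placeholder logo template

while True:
    ok, frame = cap.read()                 # acquiring data of the video image
    if not ok:
        break
    region = detect_logo(frame, logo)      # detecting the object to be occluded
    if region is None:                     # not detected: fall back to the set area
        region = default_occlusion_region(frame)
    frame = blur_region_gaussian(frame, region)  # blurring the region's data
    cv2.imshow("output", frame)            # outputting the processed image
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```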
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510981180.0A CN105611380A (en) | 2015-12-23 | 2015-12-23 | Video image processing method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105611380A (en) | 2016-05-25 |
Family
ID=55990868
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510981180.0A (Pending) | Video image processing method and device | 2015-12-23 | 2015-12-23 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105611380A (en) |
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8245252B2 (en) * | 2002-09-10 | 2012-08-14 | Caption Tv, Inc. | System, method, and computer program product for selective replacement of objectionable program content with less-objectionable content |
US7599558B2 (en) * | 2005-08-24 | 2009-10-06 | Mavs Lab. Inc. | Logo processing methods and circuits |
CN101635833A (en) * | 2008-07-22 | 2010-01-27 | 深圳市朗驰欣创科技有限公司 | Method, device and system for video monitoring |
CN101571955A (en) * | 2009-06-02 | 2009-11-04 | 山东大学 | Precise real-time detection method for micro-variation moving targets |
CN102096927A (en) * | 2011-01-26 | 2011-06-15 | 北京林业大学 | Target tracking method of independent forestry robot |
CN102281384A (en) * | 2011-07-11 | 2011-12-14 | 成都索贝数码科技股份有限公司 | Method and device for obscuring abandoned image regions, and method and system for processing video |
CN103731657A (en) * | 2014-01-26 | 2014-04-16 | 冠捷显示科技(厦门)有限公司 | Hole filling processing method of hole-containing image processed by DIBR (Depth Image Based Rendering) algorithm |
CN104270672A (en) * | 2014-09-12 | 2015-01-07 | 无锡天脉聚源传媒科技有限公司 | Video processing method and device |
CN104918107A (en) * | 2015-05-29 | 2015-09-16 | 小米科技有限责任公司 | Video file identification processing method and device |
CN105025361A (en) * | 2015-07-29 | 2015-11-04 | 西安交通大学 | Real-time station caption eliminating method |
Non-Patent Citations (1)
Title |
---|
沈晶 (Shen Jing) et al.: "Visual C++数字图像处理典型案例详解" (Detailed Typical Cases of Digital Image Processing in Visual C++), 31 July 2012 *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106358069A (en) * | 2016-10-31 | 2017-01-25 | 维沃移动通信有限公司 | Video data processing method and mobile terminal |
CN106792153A (en) * | 2016-12-01 | 2017-05-31 | 腾讯科技(深圳)有限公司 | A kind of video labeling processing method and processing device |
CN107911753A (en) * | 2017-11-28 | 2018-04-13 | 百度在线网络技术(北京)有限公司 | Method and apparatus for adding digital watermarking in video |
CN107911753B (en) * | 2017-11-28 | 2021-01-22 | 百度在线网络技术(北京)有限公司 | Method and device for adding digital watermark in video |
US10915980B2 (en) | 2017-11-28 | 2021-02-09 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and apparatus for adding digital watermark to video |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105611373B (en) | Method of video image processing and device | |
US10645332B2 (en) | Subtitle displaying method and apparatus | |
US10157326B2 (en) | Method and device for character area identification | |
US9674395B2 (en) | Methods and apparatuses for generating photograph | |
EP3163504A1 (en) | Method, device and computer-readable medium for region extraction | |
CN104918107B (en) | The identification processing method and device of video file | |
US9894298B1 (en) | Low light image processing | |
US10650502B2 (en) | Image processing method and apparatus, and storage medium | |
WO2022033485A1 (en) | Video processing method and electronic device | |
CN105095881A (en) | Method, apparatus and terminal for face identification | |
US10621730B2 (en) | Missing feet recovery of a human object from an image sequence based on ground plane detection | |
WO2014030405A1 (en) | Display device, display method, television receiver, and display control device | |
CN110166795B (en) | Video screenshot method and device | |
CN105611386A (en) | Video picture processing method and device | |
US11574415B2 (en) | Method and apparatus for determining an icon position | |
WO2021057359A1 (en) | Image processing method, electronic device, and readable storage medium | |
US10951816B2 (en) | Method and apparatus for processing image, electronic device and storage medium | |
CN105611380A (en) | Video image processing method and device | |
US11699276B2 (en) | Character recognition method and apparatus, electronic device, and storage medium | |
CN112752110B (en) | Video presentation method and device, computing device and storage medium | |
KR102372711B1 (en) | Image photographing apparatus and control method thereof | |
CN112511890A (en) | Video image processing method and device and electronic equipment | |
CN108108685B (en) | Method and device for carrying out face recognition processing | |
CN110619257B (en) | Text region determining method and device | |
CN106780307B (en) | Garment color changing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20160525 |