Face video retrieval method and system
Technical Field
The invention relates to the field of video retrieval, in particular to a face video retrieval method and a face video retrieval system.
Background
With the rapid development of multimedia technology and computer network technology, video is becoming one of the mainstream carriers of information dissemination. The problem people face is no longer a lack of video content, but how to quickly and effectively find the needed content within a vast amount of video information. In the field of social public security, the video monitoring system has become an important component for maintaining social security and strengthening social management, and face video retrieval is an urgent need in public security monitoring systems. Among the most popular video search technologies at present, whether video content retrieval is based on the non-compressed domain or on the compressed domain, the common design does not exploit the characteristics of face retrieval, which limits the efficiency of existing face video retrieval technology.
Disclosure of Invention
The embodiment of the invention aims to provide a face video retrieval method, and aims to solve the problem of low efficiency of the existing face video retrieval technology.
The embodiment of the invention is realized in such a way that a face video retrieval method comprises the following steps:
step A: judging whether the judgment parameter par_t of the current frame pic_t of the current search video is 1; if par_t is 1, entering step B; otherwise, entering step E;
step B: searching the current frame by using a first video search mode;
step C: if a next frame of the current search video exists, making t equal to t + 1, setting the next frame as the current frame of the current search video, and then entering step D; otherwise, ending; t represents the frame number of the search video sequence, and the initial value of t is 1;
step D: if no sbk_t(i, j) equals 1, entering step E; otherwise, entering step G; sbk_t(i, j) denotes the identification parameter of bk_t(i, j), and bk_t(i, j) denotes the code block in the ith row and jth column of pic_t;
step E: if the current frame pic_t of the current search video is an intra-predicted frame, letting tp_t = bkh × bkw; otherwise, calculating tp_t = sum(sign(bk_t(i, j) | Condition 2) | 1 ≤ i ≤ bkh and 1 ≤ j ≤ bkw);
step F: if tp_t = 0, setting all sbk_t(i, j) = 0 and then entering step C; otherwise, if tp_t ≥ 0.9 × bkh × bkw, entering step B; otherwise, entering step G; bkw and bkh respectively represent the number of columns and rows, in units of blocks, after a frame of the image is divided into blocks;
step G: searching the current frame by using a second video search mode, and then entering the step C;
the first video search mode comprises the steps of:
decoding a current frame of a current search video to obtain a decoded image;
all decoded blocks of the decoded image are processed as follows: if bk_t(i, j) uses a sub-block prediction mode, entering a subdivision judgment mode; otherwise, entering a rough judgment mode;
unifying the resolutions of the current search area and the search target, and then scaling both to the same size at the unified resolution;
firstly, extracting image features from the search area of the current decoded image; then comparing the features with the search target and matching, thereby completing the search of the current frame of the current search video;
according to the matching result of the current frame of the current search video, marking the identification parameter of each decoded block of the current frame;
wherein sbk_t(i, j) = sign(bk_t(i, j) | Condition 3); Condition 3 represents: bk_t(i, j) matches the target.
Another objective of an embodiment of the present invention is to provide a face video retrieval system, where the system includes:
a first judgment processing module for judging whether the judgment parameter par_t of the current frame pic_t of the current search video is 1; if so, entering a first video search device; otherwise, entering a scene switching parameter calculation module;
wherein par_t represents the determination parameter of pic_t; pic_t represents the t-th frame of the current search video, t represents the frame number of the search video sequence, and the initial value of t is 1; Condition 1 represents: t = 1, or pic_t is an intra-predicted frame, or tp_t ≥ 0.9 × bkh × bkw; tp_t is the scene switching parameter, tp_t = sum(sign(bk_t(i, j) | Condition 2) | 1 ≤ i ≤ bkh and 1 ≤ j ≤ bkw); sum(variable | condition) means summing the variable over all cases that satisfy the condition; Condition 2 represents: bk_t(i, j) is an intra-prediction block or contains at least one intra-prediction sub-block; bk_t(i, j) denotes the decoded block in the ith row and jth column of pic_t; bkw and bkh respectively represent the number of columns and rows, in units of blocks, after a frame of the image is divided into blocks;
first video search means for searching for a current frame using a first video search mode;
the second judgment processing module is used for judging whether a next frame of the current frame of the current search video exists; if so, t is made equal to t + 1, the next frame is set as the current frame of the current search video, and then the third judgment processing module is entered; otherwise, the process ends;
a third judgment processing module for judging whether any sbk_t(i, j) equals 1; if no sbk_t(i, j) equals 1, entering the scene switching parameter calculation module; otherwise, entering the second video search device;
a scene switching parameter calculation module for setting tp_t = bkh × bkw if the current frame pic_t of the current search video is an intra-predicted frame, and otherwise calculating tp_t = sum(sign(bk_t(i, j) | Condition 2) | 1 ≤ i ≤ bkh and 1 ≤ j ≤ bkw);
a fourth judgment processing module for setting all sbk_t(i, j) = 0 and entering the second judgment processing module if tp_t = 0; otherwise, entering the first video search device if tp_t ≥ 0.9 × bkh × bkw, and entering the second video search device otherwise;
the second video searching device is used for searching the current frame by using a second video searching mode and then entering a second judgment processing module;
the first video search device includes:
the decoded image acquisition module is used for decoding the current frame of the current search video to acquire a decoded image;
a prediction mode judging module for judging: if bk_t(i, j) uses a sub-block prediction mode, entering the subdivision judgment device; otherwise, entering the rough judgment device;
the first size unifying module is connected with the prediction mode judging module and is used for unifying the resolution of the current search area and the search target and then scaling the current search area and the search target to the same size according to the unified resolution;
the first target image search module is used for extracting image features from the search area of the current decoded image, then comparing the features with the search target and matching, thereby completing the search of the current frame of the current search video;
the first identification parameter identification module is used for identifying the identification parameter of each decoded block of the current frame of the current search video according to the matching result of the current frame;
wherein sbk_t(i, j) = sign(bk_t(i, j) | Condition 3), sbk_t(i, j) represents the identification parameter of bk_t(i, j); Condition 3 represents: bk_t(i, j) matches the target.
Advantageous Effects
The invention provides a face video retrieval method, which determines a search area of a key frame through information of a non-compressed domain, and then acquires a tracking search area through motion and prediction information of a compressed domain, so that the data volume and the operation amount of video search are reduced, and the timeliness of the video search is improved; in addition, the method also aims at the characteristics of face retrieval, and reduces the calculation amount by reducing the search area; through preprocessing, the accuracy of searching is improved.
Drawings
FIG. 1 is a flow chart of a face video retrieval method according to a preferred embodiment of the present invention;
FIG. 2 is a flowchart of the method of Step1 in FIG. 1;
FIG. 3 is a block diagram of a face video retrieval system in accordance with a preferred embodiment of the present invention;
FIG. 4 is a block diagram of the first video search apparatus of FIG. 3;
FIG. 5 is a structural diagram of the subdivision judgment device in FIG. 4;
FIG. 6 is a structural diagram of the rough judgment device in FIG. 4;
fig. 7 is a structural diagram of the second video search apparatus in fig. 3.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and examples, and for convenience of description, only parts related to the examples of the present invention are shown. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The embodiment of the invention provides a face video retrieval method and a face video retrieval system, wherein the method determines a search area of a key frame through information of a non-compressed domain, and then acquires a tracking search area through motion and prediction information of a compressed domain, so that the data volume and the operation amount of video search are reduced, and the timeliness of video search is improved; in addition, the method also aims at the characteristics of face retrieval, and reduces the calculation amount by reducing the search area; through preprocessing, the accuracy of searching is improved.
Example one
FIG. 1 is a flow chart of a face video retrieval method according to a preferred embodiment of the present invention; the method comprises the following steps:
Step 0: if the judgment parameter par_t is 1, Step 1 is entered; otherwise, Step 4 is entered.
Wherein par_t represents the determination parameter of pic_t; pic_t represents the t-th frame of the current search video (namely the current frame of the current search video), t represents the frame number of the search video sequence, and the initial value of t is 1. Condition 1 represents: t = 1, or pic_t is an intra-predicted frame, or tp_t ≥ 0.9 × bkh × bkw. tp_t is the scene switching parameter, tp_t = sum(sign(bk_t(i, j) | Condition 2) | 1 ≤ i ≤ bkh and 1 ≤ j ≤ bkw); sum(variable | condition) means summing the variable over all cases that satisfy the condition. Condition 2 represents: bk_t(i, j) is an intra-prediction block or contains at least one intra-prediction sub-block. bk_t(i, j) denotes the decoded block in the ith row and jth column of pic_t (the size of a block is, for example, 16×16 in standards such as H.264, or 64×64 in HEVC; when a block is further divided, the resulting smaller blocks are called sub-blocks). bkw and bkh respectively represent the number of columns and rows, in units of blocks, after a frame of the image is divided into blocks.
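As a concrete illustration of these definitions, the sketch below computes bkw and bkh for a frame, the scene switching parameter tp_t, and the judgment parameter par_t. The boolean matrix `cond2` marking which blocks satisfy Condition 2 is a hypothetical input layout, not something prescribed by the method.

```python
def block_grid(width, height, block_size):
    """bkw, bkh: columns and rows of blocks after partitioning a frame
    (edge blocks may be partial, hence the ceiling division)."""
    return -(-width // block_size), -(-height // block_size)

def scene_switch_param(is_intra_frame, cond2, bkh, bkw):
    """tp_t = sum(sign(bk_t(i, j) | Condition 2) | 1 <= i <= bkh and 1 <= j <= bkw).

    cond2[i][j] is True when bk_t(i, j) is an intra-prediction block or
    contains at least one intra-prediction sub-block (Condition 2)."""
    if is_intra_frame:          # every block of an intra-predicted frame counts
        return bkh * bkw
    return sum(1 for i in range(bkh) for j in range(bkw) if cond2[i][j])

def judgment_param(t, is_intra_frame, tp, bkh, bkw):
    """par_t is 1 when Condition 1 holds: t == 1, pic_t is an
    intra-predicted frame, or tp_t >= 0.9 * bkh * bkw."""
    return 1 if (t == 1 or is_intra_frame or tp >= 0.9 * bkh * bkw) else 0
```

For a hypothetical 1920×1080 frame with 16×16 blocks this gives bkw = 120 and bkh = 68 (the bottom row of blocks is partial).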
step 1: the current frame is searched using a first video search mode.
First video search mode (fig. 2 is the method flow diagram of Step1 in fig. 1):
step 11: and decoding the current frame of the current search video to obtain a decoded image.
Step 12: according to the characteristics of face recognition, a search area is defined for the decoded image; namely, all decoded blocks of the decoded image are processed as follows: if bkt(i, j) the prediction mode is a subblock prediction mode, namely, if the block is further divided, entering a subdivision judgment mode; otherwise, go intoEnter the rough point decision mode. A subdivision judgment mode:
Step A1: taking each pixel point in the block as a skin color judgment point, judging whether the judgment point is skin color, and if so, adding 1 to the count of skin color pixel points in the block.
Step A2: if the count of skin color pixel points in the block is greater than a fourteenth threshold, drawing the block into the face video search area; otherwise, drawing the block into the non-face video search area. The upper limit of the fourteenth threshold is the total number of pixel points in the block, and an optional lower limit is half of that total.
Rough judgment mode:
Step B1: taking the mean of the pixel points in the block as the skin color judgment point; namely, taking the mean of the corresponding components of all pixel points in the block as the value of each color model component.
Step B2: judging whether the judgment point is skin color; if so, drawing the block into the face video search area; otherwise, drawing the block into the non-face video search area.
In the subdivision judgment mode and the rough judgment mode, a skin color judgment point is judged to be skin color if the following six requirements are satisfied simultaneously:
Requirement 1: Thres1 < b - g < Thres2; Requirement 2: Thres3 < r - g < Thres4 × Wr; Requirement 3: Gup < g < Gdown; Requirement 4: Thres5 < Wr; Requirement 5: Thres6 < Co < Thres7; Requirement 6: Thres8 < energyUV < Thres9 && U × Thres10 < V && U × Thres11 > V, or Thres12 < energyUV < Thres13.
Wherein Thresjj, jj ∈ [1,13 ]]The first threshold value to the thirteenth threshold value are respectively set according to the actual situation; based on the normalized RGB model, the RGB model,
obtaining normalized RGB color components r, g and b; color balance parameter Wr ═ (r-1/3)
2+(g-1/3)
2(ii) a Constructing a green component upper bound model Gup ═ a
upr
2+b
upr+c
upWherein a is
up,b
up,c
upAs a model parameter, Gdown ═ a
downr
2+b
downr+c
down(ii) a Wherein a is
down,b
down,c
downIs a model parameter; model-based YUV model
Obtaining color energy
Y is a brightness component, and U, V represents two chrominance components of the YUV model respectively; based on YCoCg model
Obtaining Co, wherein Co is a color component value of a YCgCo model;
In the first video search mode, the search area comprises the face video search area and the non-face video search area; the skin color judgment method may be any one disclosed in the art.
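The six requirements above can be collected into a single predicate, sketched below. All thirteen thresholds and the model parameters a_up … c_down are application-tuned values chosen here only for illustration, and since the text does not give a formula for the color energy energyUV, it is treated as an input:

```python
def is_skin_point(r, g, b, U, V, Co, energy_uv, thres,
                  a_up, b_up, c_up, a_down, b_down, c_down):
    """Evaluate Requirements 1-6 for one skin color judgment point.

    r, g, b: normalized RGB components; U, V: YUV chrominance components;
    Co: YCoCg color component; energy_uv: color energy (formula not given
    in the text, so supplied by the caller); thres: sequence holding
    Thres1..Thres13 at indices 1..13."""
    wr = (r - 1/3) ** 2 + (g - 1/3) ** 2            # color balance parameter
    gup = a_up * r ** 2 + b_up * r + c_up           # green component bound models
    gdown = a_down * r ** 2 + b_down * r + c_down
    req1 = thres[1] < b - g < thres[2]
    req2 = thres[3] < r - g < thres[4] * wr
    req3 = gup < g < gdown
    req4 = thres[5] < wr
    req5 = thres[6] < Co < thres[7]
    req6 = ((thres[8] < energy_uv < thres[9] and U * thres[10] < V < U * thres[11])
            or thres[12] < energy_uv < thres[13])
    return req1 and req2 and req3 and req4 and req5 and req6
```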
Step 13: the resolution of the current search area and the search target are unified, and then the current search area and the search target are scaled to the same size with the unified resolution.
Step 14: firstly, extracting image characteristics from a search area of a current decoding image; and then comparing with a search target, matching and finishing the search of the current frame of the current search video.
The image features are extracted and compared with the search target, and the matching method can be any method disclosed in the field of corresponding video search, and is not repeated herein.
Step 15: and identifying the identification parameters of each decoding block of the current frame of the current search video according to the matching result of the current frame of the current search video.
Wherein, sbkt(i,j)=sign(bkt(i, j) | Condition 3), sbkt(i, j) represents bkt(ii) the identification parameter of (i, j); condition 3 represents: bkt(i, j) matching the target.
Step 2: if the next frame of the current search video exists, making t equal to t +1, setting the next frame of the current search video as the current frame of the current search video, and then entering Step 3; otherwise, ending.
Step 3: if not sbktIf (i, j) is 1, go to Step 4; otherwise, go to Step 6.
Step 4: if pictFor intra-predicted frames, let tptBkh × bkw; otherwise, calculate tpt=sum(sign(bkt(i, j) | Condition 2) |1 ≦ i ≦ bkh and 1 ≦ j ≦ bkw).
Step 5: if tptFirst, all sbk are set to 0t(i, j) ═ 0, then proceed to Step 2; otherwise, if tptEntering Step1 when the pressure is not less than 0.9 × bkh × bkw; otherwise, Step6 is entered.
Step 6: using the second video search mode, the current frame is searched and then Step2 is entered.
Second video search mode:
Step 61: if bk_t(i, j) is an intra-prediction block, the block is decoded and then delimited as a search region; otherwise, if spbk_t(i, j) = 1, sbk_t(i, j) is set to 1, namely the current block matches the target; otherwise, sbk_t(i, j) is set to 0, namely the current block does not match the target.
Wherein spbk_t(i, j) denotes the identification parameter of the reference block of bk_t(i, j).
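Step 61's per-block decision — decode and search intra blocks, propagate the reference block's identification parameter spbk_t(i, j) otherwise — can be sketched as:

```python
def second_mode_block(is_intra_block, spbk):
    """Returns ('search', None) when the block must be decoded and delimited
    as a search region, or ('inherit', sbk) where sbk copies the reference
    block's identification parameter spbk_t(i, j)."""
    if is_intra_block:
        return ('search', None)
    return ('inherit', 1 if spbk == 1 else 0)
```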
Step 62: the current search area is preprocessed, i.e. the resolutions of the current search area and the search target are unified, and then the current search area and the search target are scaled to the same size with the unified resolution.
Step 63: firstly, extracting image characteristics of a search area, then comparing the image characteristics with a search target, matching and finishing the search of the current frame of the current search video.
The image features are extracted and compared with the search target, and the matching method can be any method disclosed in the field of corresponding video search, and is not repeated herein.
Step 64: and identifying the identification parameters of the decoding blocks according to the matching results of the decoding blocks in the search area.
Example two
FIG. 3 is a block diagram of a face video retrieval system in accordance with a preferred embodiment of the present invention; the system comprises:
a first judgment processing module for judging the pic of the current frame of the current search videotIs determined by the judgment parameter partIf the video is 1, entering a first video searching device if the video is 1, and otherwise entering a scene switching parameter calculation module;
wherein par_t represents the determination parameter of pic_t; pic_t represents the t-th frame of the current search video (namely the current frame of the current search video), t represents the frame number of the search video sequence, and the initial value of t is 1; Condition 1 represents: t = 1, or pic_t is an intra-predicted frame, or tp_t ≥ 0.9 × bkh × bkw; tp_t is the scene switching parameter, tp_t = sum(sign(bk_t(i, j) | Condition 2) | 1 ≤ i ≤ bkh and 1 ≤ j ≤ bkw); sum(variable | condition) means summing the variable over all cases that satisfy the condition; Condition 2 represents: bk_t(i, j) is an intra-prediction block or contains at least one intra-prediction sub-block; bk_t(i, j) denotes the decoded block in the ith row and jth column of pic_t (the size of a block is, for example, 16×16 in standards such as H.264, or 64×64 in HEVC; when a block is further divided, the resulting smaller blocks are called sub-blocks); bkw and bkh respectively represent the number of columns and rows, in units of blocks, after a frame of the image is divided into blocks;
first video search means for searching for a current frame using a first video search mode;
and the second judgment processing module is used for judging whether a next frame of the current frame of the current search video exists; if so, t is made equal to t + 1, the next frame is set as the current frame of the current search video, and then the third judgment processing module is entered; otherwise, the process ends.
A third judgment processing module for judging whether any sbk_t(i, j) equals 1; if no sbk_t(i, j) equals 1, entering the scene switching parameter calculation module; otherwise, entering the second video search device;
a scene switching parameter calculation module for setting tp_t = bkh × bkw if the current frame pic_t of the current search video is an intra-predicted frame, and otherwise calculating tp_t = sum(sign(bk_t(i, j) | Condition 2) | 1 ≤ i ≤ bkh and 1 ≤ j ≤ bkw).
A fourth judgment processing module for setting all sbk_t(i, j) = 0 and entering the second judgment processing module if tp_t = 0; otherwise, entering the first video search device if tp_t ≥ 0.9 × bkh × bkw, and entering the second video search device otherwise.
The second video searching device is used for searching the current frame by using a second video searching mode and then entering a second judgment processing module;
further, fig. 4 is a structural diagram of the first video search apparatus in fig. 3, the first video search apparatus comprising:
the decoded image acquisition module is used for decoding the current frame of the current search video to acquire a decoded image;
a prediction mode judging module for judging: if bk_t(i, j) uses a sub-block prediction mode, entering the subdivision judgment device; otherwise, entering the rough judgment device.
The first size unifying module is connected with the prediction mode judging module and is used for unifying the resolution of the current search area and the search target and then scaling the current search area and the search target to the same size according to the unified resolution;
the first target image searching module is used for extracting image characteristics from a searching area of a current decoding image; and then comparing with a search target, matching and finishing the search of the current frame of the current search video.
And the first identification parameter identification module is used for identifying the identification parameters of each decoding block of the current frame of the current search video according to the matching result of the current frame of the current search video.
Wherein sbk_t(i, j) = sign(bk_t(i, j) | Condition 3), sbk_t(i, j) represents the identification parameter of bk_t(i, j); Condition 3 represents: bk_t(i, j) matches the target.
Further, fig. 5 is a structural view of the subdivision determination device in fig. 4;
the subdivision judging device comprises a block skin color pixel point counting module and a first face video searching area dividing module,
the block skin color pixel counting module is used for taking each pixel in the block as a skin color judgment point, judging the skin color of the skin color judgment point, and if the skin color judgment point is skin color, adding 1 to the number of the block skin color pixels;
and the first face video search area dividing module is connected with the block skin color pixel point counting module and is used for drawing the block into the face video search area if the number of skin color pixel points in the block is greater than a fourteenth threshold, and otherwise drawing the block into the non-face video search area.
The upper limit of the fourteenth threshold is the total number of pixel points in the block, and an optional lower limit is half of that total.
FIG. 6 is a structural view of the rough judgment means in FIG. 4;
the rough-dividing judging device comprises a block color model component value calculating module and a second human face video searching area dividing module,
the block color model component value calculation module is used for taking the mean of the pixel points in the block as the skin color judgment point, namely taking the mean of the corresponding components of all pixel points in the block as the value of each color model component;
and the second face video search area dividing module is connected with the block color model component value calculation module and is used for judging whether the skin color judgment point is skin color; if so, the block is drawn into the face video search area; otherwise, the block is drawn into the non-face video search area.
In the subdivision judgment mode and the rough judgment mode, a skin color judgment point is judged to be skin color if the following six requirements are satisfied simultaneously:
Requirement 1: Thres1 < b - g < Thres2; Requirement 2: Thres3 < r - g < Thres4 × Wr; Requirement 3: Gup < g < Gdown; Requirement 4: Thres5 < Wr; Requirement 5: Thres6 < Co < Thres7; Requirement 6: Thres8 < energyUV < Thres9 && U × Thres10 < V && U × Thres11 > V, or Thres12 < energyUV < Thres13.
Wherein Thres1 to Thres13 are the first to thirteenth thresholds, respectively, each set according to the actual situation. Based on the normalized RGB model, the normalized color components r, g and b are obtained. The color balance parameter is Wr = (r - 1/3)² + (g - 1/3)². A green component upper bound model Gup = a_up·r² + b_up·r + c_up is constructed, wherein a_up, b_up and c_up are model parameters; likewise Gdown = a_down·r² + b_down·r + c_down, wherein a_down, b_down and c_down are model parameters. Based on the YUV model, the color energy energyUV is obtained, wherein Y is the luminance component and U and V respectively represent the two chrominance components of the YUV model. Based on the YCoCg model, Co is obtained, wherein Co is a color component value of the YCoCg model. The skin color judgment method may be any one disclosed in the art.
Further, fig. 7 is a structural diagram of a second video search apparatus in fig. 3, the second video search apparatus comprising:
a second search area delimiting module for: if bk_t(i, j) is an intra-prediction block, decoding the block and then delimiting it as a search region; otherwise, if spbk_t(i, j) = 1, setting sbk_t(i, j) = 1, namely the current block matches the target, and otherwise setting sbk_t(i, j) = 0, namely the current block does not match the target; wherein spbk_t(i, j) denotes the identification parameter of the reference block of bk_t(i, j).
And the second size unifying module is connected with the second searching area demarcating module and is used for preprocessing the current searching area, namely unifying the resolution of the current searching area and the searching target, and then zooming the current searching area and the searching target to the same size with the unified resolution.
And the second target image searching module is used for firstly extracting image characteristics from a searching area, then comparing the image characteristics with a searching target, matching and finishing the searching of the current frame of the current searching video.
The image features are extracted and compared with the search target, and the matching method can be any method disclosed in the field of corresponding video search, and is not repeated herein.
And the second identification parameter identification module is used for identifying the identification parameters of the decoding blocks according to the matching results of the decoding blocks in the search area.
It will be understood by those skilled in the art that all or part of the steps in the method according to the above embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, such as ROM, RAM, magnetic disk, optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.