CN107509045A - Image processing method and apparatus, electronic device and computer-readable storage medium - Google Patents

Image processing method and apparatus, electronic device and computer-readable storage medium

Info

Publication number
CN107509045A
CN107509045A
Authority
CN
China
Prior art keywords
image
scene
video
background
frame
Prior art date
Legal status
Pending
Application number
CN201710811811.3A
Other languages
Chinese (zh)
Inventor
张学勇
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201710811811.3A
Publication of CN107509045A
Priority to PCT/CN2018/105120 (WO2019047984A1)
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265 Mixing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/272 Means for inserting a foreground image in a background image, i.e. inlay, outlay
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Abstract

The invention discloses an image processing method and apparatus, an electronic device and a computer-readable storage medium. The image processing method, for an electronic device, includes: capturing a first scene video of a current user; acquiring multiple depth images of the current user; processing each frame of scene image of the first scene video according to the multiple depth images, to obtain the person-region image in each frame of scene image; selecting, according to the video frame rate of the first scene video, a matching target dynamic background from multiple dynamic backgrounds; and merging each person-region image with the background image of the corresponding frame of the target dynamic background to obtain a merged image. The person region in each frame of scene image is extracted by acquiring multiple depth images of the current user. Because depth acquisition is largely immune to factors such as illumination and the color distribution of the scene, the person region extracted through the depth images is more accurate; in particular, its boundary can be calibrated precisely.

Description

Image processing method and apparatus, electronic device and computer-readable storage medium
Technical field
The present invention relates to the field of image processing technology, and more particularly to an image processing method and apparatus, an electronic device and a computer-readable storage medium.
Background
With the development of image processing technology and mobile terminals, people's demand for video keeps growing: they shoot videos or make video calls anytime and anywhere, but in some scenarios they wish to hide the real background while filming, to suit the needs of a particular shooting scene. Changing the shooting background requires extracting the person image from the originally captured scene and merging the extracted person image with a selected dynamic background.
Existing techniques for merging a person with a virtual background usually extract the person's contour from feature points, but contours extracted this way are not accurate; in particular, the person's boundary cannot be calibrated precisely, which degrades the image fusion.
Summary of the invention
Embodiments of the present invention provide an image processing method, an image processing apparatus, an electronic device and a computer-readable storage medium.
The image processing method of embodiments of the present invention is used for an electronic device and includes:
capturing a first scene video of a current user;
acquiring multiple depth images of the current user;
processing each frame of scene image of the first scene video according to the multiple depth images, to obtain the person region of the current user in each frame of scene image and the corresponding person-region image;
selecting a matching target dynamic background from multiple dynamic backgrounds according to the video frame rate used by the first scene video; and
merging each person-region image with the background image of the corresponding frame of the target dynamic background to obtain a merged image.
The image processing apparatus of embodiments of the present invention is used for an electronic device and includes a visible-light camera, a depth image acquisition assembly and a processor.
The visible-light camera is used to capture a first scene video of a current user.
The depth image acquisition assembly is used to acquire multiple depth images of the current user.
The processor is used to:
process each frame of scene image of the first scene video according to the multiple depth images, to obtain the person region of the current user in each frame of scene image and the corresponding person-region image;
select a matching target dynamic background from multiple dynamic backgrounds according to the video frame rate used by the first scene video; and
merge each person-region image with the background image of the corresponding frame of the target dynamic background to obtain a merged image.
The electronic device of embodiments of the present invention includes one or more processors, a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the one or more processors, and include instructions for performing the image processing method described above.
The computer-readable storage medium of embodiments of the present invention includes a computer program used in combination with an electronic device capable of imaging; the computer program can be executed by a processor to carry out the image processing method described above.
The image processing method, image processing apparatus, electronic device and computer-readable storage medium of embodiments of the present invention extract the person region in each frame of scene image by acquiring multiple depth images of the current user. Because depth acquisition is largely immune to factors such as illumination and the color distribution of the scene, the person regions extracted through the depth images are more accurate; in particular, the boundary of each person region can be calibrated precisely. Moreover, since the frame rates of the first scene video and the target dynamic background match, each of the more accurate person-region images can be merged well with the background image of the corresponding frame of the target dynamic background.
Additional aspects and advantages of the present invention will be set forth in part in the following description, will in part become apparent from it, or will be learned through practice of the present invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of embodiments taken with the accompanying drawings, in which:
Fig. 1 is a block diagram of the image processing apparatus provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the electronic device provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of the image processing method provided by Embodiment 1 of the present invention;
Fig. 4 is a schematic flowchart of the image processing method provided by Embodiment 2 of the present invention;
Fig. 5(a) to Fig. 5(e) are schematic scene diagrams of structured-light measurement according to an embodiment of the present invention;
Fig. 6(a) and Fig. 6(b) are schematic scene diagrams of structured-light measurement according to an embodiment of the present invention;
Fig. 7 is a schematic flowchart of the image processing method provided by Embodiment 3 of the present invention;
Fig. 8 is a schematic flowchart of the image processing method provided by Embodiment 4 of the present invention;
Fig. 9A is a schematic flowchart of the image processing method provided by Embodiment 5 of the present invention;
Fig. 9B is a schematic flowchart of the image processing method provided by Embodiment 6 of the present invention;
Fig. 10 is a block diagram of an electronic device provided by an embodiment of the present invention; and
Fig. 11 is a block diagram of another electronic device provided by an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below; examples of the embodiments are shown in the drawings, in which identical or similar reference numerals denote, throughout, identical or similar elements or elements with identical or similar functions. The embodiments described below with reference to the drawings are exemplary, intended to explain the present invention, and are not to be construed as limiting it.
The image processing method and apparatus, electronic device and computer-readable storage medium of the embodiments of the present invention are described below with reference to the drawings.
When shooting a video or making a video call, a user may change the dynamic background according to their own taste, meeting the user's demands on the video capture scene. In the related art, however, techniques for merging a person with a virtual background usually extract the person's contour in each frame of scene image of the video from feature points; contours extracted from feature points are not accurate, so the person's boundary cannot be calibrated precisely and the image fusion suffers.
For this problem, embodiments of the present invention propose an image processing method; the image processing method of the embodiments can be realized by the image processing apparatus of the embodiments, and the image processing apparatus is used for an electronic device.
Fig. 1 is a block diagram of the image processing apparatus provided by an embodiment of the present invention, and Fig. 2 is a schematic structural diagram of the electronic device provided by an embodiment of the present invention.
As shown in Fig. 1 and Fig. 2, in the embodiment of the present invention the image processing apparatus 100 is used for an electronic device 1000; that is to say, the electronic device 1000 includes the image processing apparatus 100.
The image processing apparatus 100 includes a visible-light camera 11, a depth image acquisition assembly 12 and a processor 20.
The visible-light camera 11 is used to capture the video the user shoots.
The depth image acquisition assembly 12 is used to collect multiple depth images of the user.
The processor 20 is used to process the collected data.
As one possible implementation, the depth image acquisition assembly 12 may further include a structured-light projector 121 and a structured-light camera 122.
The structured-light projector 121 is used to project structured light onto the current user.
The structured-light camera 122 captures the structured-light image modulated by the user.
The electronic device 1000 may include a mobile phone, tablet computer, notebook computer, smart bracelet, smart watch, smart helmet, smart glasses, and so on.
Fig. 3 is a schematic flowchart of the image processing method provided by an embodiment of the present invention. The image processing method of this embodiment is realized on the image processing apparatus and electronic device of the above embodiments and includes:
Step S301: capture the first scene video of the current user.
For convenience, the frame rate of the first scene video is referred to as the video frame rate, and the frame rate of a dynamic background as the background frame rate.
Specifically, an application is installed on the electronic device; when the user opens it, the visible-light camera captures the user's current video scene, referred to as the first scene video, which comprises multiple frames of scene images.
Step S302: acquire multiple depth images of the current user.
Structured light is light of a specific pattern projected onto the surface of an object; because the surface is uneven, its variations and possible gaps modulate the incident light before it is reflected. In this embodiment the structured light is generated by the structured-light projector, which emits it toward the user; since the user's body surface is not flat, the body distorts the structured light when reflecting it. Further, the structured-light camera on the electronic device captures the structured-light image modulated by the current user, the phase information corresponding to each pixel of that image is demodulated and converted into depth information, and from the depth information one of the multiple depth images is obtained; in this way all the depth images are acquired. There may be one depth image for each frame of the video, and when the person is motionless across several frames of scene images, those frames may share a single depth image. The depth image collected by the structured-light camera and the scene image collected by the visible-light camera correspond to the same frame and cover the same scene area, so every pixel of the scene image can be matched in the depth image with the depth information of that pixel.
Step S303: process each frame of scene image of the first scene video according to the multiple depth images, to obtain the person region of the current user in each frame of scene image and the corresponding person-region image.
Specifically, for each frame of scene image the face region in the image is identified; from the depth image corresponding to that frame, the depth information of the face region is obtained; the depth range of the person region is determined from that information; and the person region connected to the face region and falling within that depth range is then determined. The person-region image is thus obtained, and can be determined accurately.
Step S304: select a matching target dynamic background from multiple dynamic backgrounds according to the video frame rate used by the first scene video.
Multiple dynamic backgrounds, each with its own frame rate, are stored in the memory unit of the electronic device or on a cloud server. Specifically, given the frame rate used by the first scene video and a preset threshold, a target dynamic background is chosen from the multiple dynamic backgrounds such that the difference between its frame rate and that of the first scene video is below the preset threshold.
In general, dynamic backgrounds and videos use one of a few standard frame rates, e.g. 30 fps, 60 fps or 120 fps. However, since non-standard frame rates may occur, an exactly matching dynamic background may not be found among the stored standard-rate backgrounds. For that case a non-zero threshold can be set so that a background with an approximately equal frame rate is found. For example, if the first scene video uses 30 fps and the threshold is 2 frames, the frame rate of the target dynamic background ranges from 29 to 31 fps. The background whose frame-rate difference from the first scene video is smallest is chosen preferentially, such as the 30 fps background in this example; if several backgrounds differ from the first scene video by less than the threshold, any one of them may be chosen, such as the 29 fps or 31 fps background here.
Note that when several target dynamic backgrounds match, one possible implementation presents them to the user, who selects one according to preference; in another, the processor of the electronic device picks one at random.
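For illustration only, a minimal Python sketch of this selection rule follows; the function name and the (fps, background) pairing are hypothetical, since the patent describes the logic but gives no implementation:

```python
def choose_target_background(video_fps, backgrounds, threshold=2):
    """Pick the stored dynamic background whose frame rate differs least from
    the captured video's frame rate, rejecting differences >= threshold.
    `backgrounds` is assumed to be a list of (fps, background) pairs."""
    candidates = [(abs(fps - video_fps), background)
                  for fps, background in backgrounds
                  if abs(fps - video_fps) < threshold]
    if not candidates:
        return None   # no match; see the frame extraction/insertion variants below
    # The smallest difference wins; among equal differences the text allows a
    # user choice or a random pick, so any tie-break is acceptable here.
    return min(candidates, key=lambda c: c[0])[1]
```

With a 30 fps video and a threshold of 2, the 29, 30 and 31 fps backgrounds all qualify and the 30 fps one is returned, matching the example above.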
Step S305: merge each person-region image with the background image of the corresponding frame of the target dynamic background to obtain a merged image.
Specifically, the frame rates of the first scene video and the target dynamic background match, and the person-region images correspond one-to-one with the frames of the first video, so each person-region image can be merged with the corresponding frame of the selected target dynamic background. Because the edges of every acquired person-region image are comparatively clear and accurate, the merged images obtained after fusion look good.
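A sketch of the per-frame fusion, under the assumption that the segmentation yields a binary person mask per frame; the patent specifies the merge but not its arithmetic, and mask compositing is one natural reading:

```python
import numpy as np

def merge_frame(person_rgb, person_mask, background_rgb):
    """Composite one person-region image onto the background image of the
    corresponding frame. `person_mask` is the 0/1 matte from the depth-based
    segmentation; all three arrays share height and width."""
    m = person_mask.astype(np.float32)[..., None]   # HxWx1, broadcast over color
    merged = m * person_rgb + (1.0 - m) * background_rgb
    return merged.astype(np.uint8)

# Frame i of the scene video pairs with frame i of the matched dynamic
# background, so the merged video is built frame by frame:
# merged = [merge_frame(p, m, b) for (p, m), b in zip(person_frames, bg_frames)]
```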
It should be explained that when the frame-rate difference between the first scene video and the target background video is not 0 but below a very small threshold, e.g. 1 or 2 frames, the difference is small enough that after fusion the human eye does not easily notice the few frames left unmerged.
In the image processing method of this embodiment, the first scene video of the current user is captured, multiple depth images of the current user are acquired, each frame of scene image of the first scene video is processed according to the depth images to obtain the person-region image of each frame, a matching target dynamic background is chosen from multiple dynamic backgrounds according to the video frame rate of the first scene video, and each person-region image is merged with the background image of the corresponding frame of the target dynamic background to obtain a merged image. Existing methods of segmenting person from background rely mainly on the similarity and discontinuity of neighboring pixel values, which is easily disturbed by environmental factors such as ambient illumination. The image processing method of this embodiment instead extracts the person region of each frame of the scene video from the current user's depth images. Because depth acquisition is largely immune to factors such as illumination and the color distribution of the scene, the person region extracted through the depth images is more accurate, and in particular its boundary can be calibrated precisely. Further, acquiring a target dynamic background whose frame rate matches the first scene video shot by the user allows the more accurate person-region images to be merged well with the target dynamic background.
On the basis of the above embodiment, the present invention proposes a possible implementation of the image processing method that further illustrates how the depth images are obtained. Fig. 4 is a schematic flowchart of the image processing method provided by Embodiment 2 of the present invention. As shown in Fig. 4, on the basis of the previous embodiment, step S302 may comprise the following steps:
Step S3021: project structured light onto the current user.
Specifically, the application installed on the electronic device invokes the structured-light generator, i.e. the structured-light projector. After the projector casts structured light of a certain pattern onto the face and body of the current user, a structured-light image modulated by the current user forms on those surfaces. The structured-light pattern may be laser stripes, Gray codes, sinusoidal fringes, non-uniform speckle, and so on.
Step S3022: capture the structured-light image modulated by the current user.
Specifically, the structured-light camera in the depth image acquisition assembly captures the structured-light image modulated by the user.
Step S3023: demodulate the phase information corresponding to each pixel of the structured-light image.
Specifically, compared with unmodulated structured light, the phase of the modulated structured light has changed: the structured light shown in the structured-light image is distorted, and the changed phase information characterizes the depth of the object. The structured-light camera therefore demodulates the phase information corresponding to each pixel of the structured-light image.
Step S3024: convert the phase information into depth information, and generate one of the multiple depth images from the depth information.
Specifically, the phase of the image is a function of spatial position; through the phase information of each pixel, the functional relation between phase and position yields the depth information of each pixel, and from the positions and depths of all pixels one of the multiple depth images is generated.
In turn, a depth image can be generated from the depth information corresponding to each frame of scene image, which yields the multiple depth images of the current user.
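The flow of steps S3021 to S3024 can be summarized in a short sketch; `demodulate_phase` and the calibrated `phase_to_depth` mapping are stand-ins here, since both depend on the projected pattern and on the calibration described below:

```python
def depth_image_for_frame(structure_light_image, calibration):
    """One depth image per frame of scene image: demodulate the per-pixel
    phase of the captured structured-light image, then convert phase to
    depth through the calibrated phase-position relation."""
    phase = demodulate_phase(structure_light_image)    # hypothetical helper
    return calibration.phase_to_depth(phase)           # hypothetical mapping

# depth_images = [depth_image_for_frame(img, calib) for img in captured_images]
```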
In this embodiment the depth images are thus acquired frame by frame through structured light. As in Embodiment 1, extracting the person region of each frame from these depth images, rather than from pixel-value similarity and discontinuity that ambient illumination easily disturbs, makes the extracted region, and in particular its boundary, more accurate; and matching the target dynamic background's frame rate to the first scene video lets the accurate person-region images merge well with it.
To give those skilled in the art a clearer picture of how the depth images of the current user's face and body are collected from structured light, the concrete principle is illustrated below with a widely used grating projection (fringe projection) technique, which belongs to surface structured light in the broad sense.
As shown in Fig. 5(a), when surface structured light is used for projection, sinusoidal fringes are first generated by computer programming and projected onto the measured object by the structured-light projector; the structured-light camera then photographs how much the fringes bend after being modulated by the object, and the curved fringes are demodulated to obtain the phase, which is converted into depth information to obtain the depth image. To avoid error or error coupling, the depth image acquisition assembly must be calibrated before structured-light depth acquisition; calibration covers geometric parameters (for example, the relative position of the structured-light camera and the structured-light projector), the intrinsic parameters of the structured-light camera, the intrinsic parameters of the structured-light projector, and so on.
Specifically, in the first step, sinusoidal fringes are produced by computer programming. Since the phase must later be obtained from the distorted fringes, for instance by the four-step phase-shifting method, four fringe patterns with phase differences of π/2 are generated here; the structured-light projector projects the four patterns time-sequentially onto the measured object (the mask shown in Fig. 5(a)), and the structured-light camera collects the image on the left of Fig. 5(b) while reading the fringes of the reference plane shown on its right.
In the second step, phase recovery is performed. The structured-light camera computes the modulated phase from the four collected fringe patterns (the structured-light images), obtaining at this point a wrapped phase map. Since the result of the four-step phase-shifting algorithm comes from an arctangent, the phase of the modulated structured light is confined to [-π, π]: whenever the modulated phase exceeds that interval, it wraps around again. The resulting principal phase values are shown in Fig. 5(c).
During phase recovery, de-jump processing is required to unwrap the wrapped phase into a continuous phase. As shown in Fig. 5(d), the left side is the modulated continuous phase map and the right side is the reference continuous phase map.
In the third step, the phase difference (i.e. the phase information) is obtained by subtracting the reference continuous phase from the modulated continuous phase. This phase difference characterizes the depth of the measured object relative to the reference plane; substituting it into the phase-to-depth conversion formula (whose parameters are obtained by calibration) yields the three-dimensional model of the measured object shown in Fig. 5(e).
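Written out, the four-step phase-shifting computation of the second and third steps looks roughly like this (a sketch: `unwrap_2d` stands for the de-jump processing, and `k` abbreviates the full calibrated phase-to-depth formula):

```python
import numpy as np

def four_step_phase(i1, i2, i3, i4):
    """Wrapped phase from four fringe images shifted by pi/2 each:
    phi = arctan((I4 - I2) / (I1 - I3)), confined to [-pi, pi]."""
    return np.arctan2(i4 - i2, i1 - i3)

def depth_from_fringes(measured, reference, k):
    """Subtract the reference continuous phase from the modulated continuous
    phase; the difference characterizes depth relative to the reference plane."""
    phi_obj = unwrap_2d(four_step_phase(*measured))    # hypothetical unwrapper
    phi_ref = unwrap_2d(four_step_phase(*reference))
    return k * (phi_obj - phi_ref)                     # simplified conversion
```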
It should be appreciated that in practical applications, depending on the concrete scenario, the structured light employed in embodiments of the present invention may be any other pattern besides the above grating.
As a possible implementation, the present invention may also use speckle structured light to collect the depth information of the current user.
Specifically, the speckle method obtains depth information using a diffractive element that is essentially a flat plate carrying a relief diffraction structure with a particular phase distribution; its cross-section has a stepped relief structure of two or more levels. The substrate of the diffractive element is roughly 1 micron thick, the heights of the steps are non-uniform and may range from 0.7 to 0.9 micron. The structure shown in Fig. 6(a) is a local diffraction structure of the collimating beam-splitting element of this embodiment, and Fig. 6(b) is a cross-sectional side view along section A-A, with both axes in microns. The speckle pattern generated by speckle structured light is highly random, and the pattern changes with distance. Therefore, before depth information can be obtained with speckle structured light, the speckle patterns in space must first be calibrated: for example, within 0 to 4 meters of the structured-light camera a reference plane is taken every 1 centimeter, so 400 speckle images are saved after calibration; the smaller the calibration spacing, the higher the precision of the acquired depth information. Then the structured-light projector projects the speckle light onto the measured object (i.e. the current user), and the height differences of the object's surface alter the speckle pattern projected onto it. After the structured-light camera photographs the speckle pattern projected on the object (i.e. the structured-light image), that pattern is cross-correlated one by one with the 400 speckle images saved during calibration, giving 400 correlation images. The position of the measured object in space shows up as peaks in these correlation images; superimposing the peaks and interpolating yields the depth information of the measured object.
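As a toy illustration of the correlation step, the following sketch scores one pixel's local speckle window against each calibrated reference plane and reads depth at the peak; peak superposition and interpolation are omitted, and the window size is an assumption:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def depth_at(captured, reference_stack, plane_distances, y, x, win=15):
    """Correlate the window around (y, x) with the same window in each of the
    (e.g. 400) reference speckle images; the calibrated distance of the
    best-matching plane is taken as the depth."""
    h = win // 2
    patch = captured[y - h:y + h + 1, x - h:x + h + 1].astype(np.float32)
    scores = [ncc(patch, ref[y - h:y + h + 1, x - h:x + h + 1].astype(np.float32))
              for ref in reference_stack]
    return plane_distances[int(np.argmax(scores))]
```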
An ordinary diffractive element diffracts a beam into many diffracted beams whose intensities differ widely, so the risk of injury to human eyes is high, and even re-diffracting the diffracted light gives beams of low uniformity; projecting onto the measured object with an ordinary diffractive element therefore works poorly. This embodiment uses a collimating beam-splitting element instead: the element not only collimates non-collimated light but also splits it, so the non-collimated light reflected by the mirror emerges as multiple collimated beams at different angles, and the emergent collimated beams have approximately equal cross-sectional areas and approximately equal energy flux, which makes projection with the scattered beams after diffraction work better. Meanwhile the laser output is dispersed across the beams, further reducing the risk to human eyes; and compared with other uniformly arranged structured light, speckle structured light consumes less power for the same collection effect.
Fig. 7 is a schematic flowchart of the image processing method provided by Embodiment 3 of the present invention. The method can be realized by the processor in the image processing apparatus: the scene image and the depth image are processed to extract the person region of the current user in the scene image and obtain the person-region image. As shown in Fig. 7, on the basis of the above embodiments, step S303 may comprise the following steps:
Step S3031: for each frame of scene image, identify the face region in the scene image.
Specifically, for each frame of scene image the processor in the electronic device may use a trained deep-learning model to identify the face region in the scene image.
Step S3032: obtain, from the depth image corresponding to the scene image, the depth information corresponding to the face region.
Specifically, the depth information of the face region can be determined from the correspondence between the scene image and the depth image. Since the face region includes features such as the nose, eyes, ears and lips, the depth data of those features differ in the depth image: for example, with the face turned toward the depth image acquisition assembly, the depth data corresponding to the nose may be small in the captured depth image while that corresponding to the ears may be large. The depth information of the face region may therefore be a single value or a range of values; when it is a single value, it may be obtained by averaging the depth data of the face region or by taking their median.
Step S3033: determine the depth range of the person region from the depth information of the face region.
Specifically, because the person region includes the face region, i.e. the person region lies in the same depth range as the face region, the processor, having determined the depth information of the face region, can set the depth range of the person region accordingly.
Step S3034: determine, according to the depth range of the person region, the person region that is connected to the face region and falls within the depth range, so as to obtain the person-region image.
Specifically, the person region connected to the face region and falling within the depth range is extracted according to that range, which yields the person-region image.
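One way steps S3031 to S3034 could look in code, under loudly stated assumptions: the patent names a trained deep-learning face detector, for which a stock OpenCV cascade stands in here, and the depth margin around the face is an invented parameter:

```python
import cv2
import numpy as np

FACE_CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def person_mask(scene_bgr, depth, margin=0.5):
    """Detect the face, read its depth from the registered depth image,
    derive a depth range for the person, and keep the connected region
    around the face that falls inside that range."""
    gray = cv2.cvtColor(scene_bgr, cv2.COLOR_BGR2GRAY)
    faces = FACE_CASCADE.detectMultiScale(gray, 1.1, 5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    face_depth = np.median(depth[y:y + h, x:x + w])   # single value, per the text
    in_range = ((depth >= face_depth - margin) &
                (depth <= face_depth + margin)).astype(np.uint8)
    # Keep only the component connected to the face region.
    _, labels = cv2.connectedComponents(in_range)
    face_label = labels[y + h // 2, x + w // 2]
    return (labels == face_label).astype(np.uint8)
```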
In this way the person-region image can be extracted from the scene image according to the depth information. Since the acquisition of depth information is unaffected by factors such as illumination and color temperature in the environment, the extracted person-region image is more accurate.
This embodiment details the depth-based segmentation of step S303: unlike segmentation by pixel-value similarity and discontinuity, which ambient illumination easily disturbs, it calibrates the boundary of the person region precisely, and the frame-rate-matched target dynamic background then merges well with the accurate person-region images.
On the basis of the above embodiments, Fig. 8 is a schematic flowchart of the image processing method provided by Embodiment 4 of the present invention; its steps can be realized by the processor in the image processing apparatus. As shown in Fig. 8, as a possible implementation, the image processing method further comprises the following steps:
Step S801: process the scene image to obtain a whole-field edge image of the scene image.
Specifically, the processor first performs edge extraction on the scene image to obtain the whole-field edge image, whose edge lines include those of the current user and of the background objects in the scene where the current user is located. The edge extraction may be carried out with the Canny operator, the core of whose algorithm comprises the following steps: first, convolve the scene image with a 2D Gaussian filter template to remove noise; then obtain the gradient magnitude of each pixel's gray value with a differential operator, compute the gradient direction of the gray value from the magnitudes, and locate each pixel's neighbors along its gradient direction; then traverse every pixel, and if a pixel's gray value is not the maximum compared with the gray values of the two adjacent pixels along its gradient direction, conclude that the pixel is not an edge point. In this way the pixels at edge positions in the scene image are determined, and the whole-field edge image is obtained after edge extraction.
Step S802: correct the person-region image according to the whole-field edge image.
Specifically, after obtaining the whole-field edge image, the processor uses it to correct the person-region image. It will be appreciated that the person-region image is obtained by merging all pixels of the scene image that are connected to the face region and fall within the set depth range; in some scenes, other objects may also be connected to the face region and fall within that depth range. To make the extracted person-region image more accurate, the whole-field edge image can therefore be used to correct it.
Further, the processor may apply a second correction to the corrected person-region image, for example dilating it to enlarge the person region and preserve the edge details of the person-region image.
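A sketch of steps S801 and S802 plus the second correction; the Canny thresholds, the rule for trimming the mask at edges, and the kernel sizes are all assumptions, since the description leaves them open:

```python
import cv2
import numpy as np

def refine_person_mask(scene_bgr, mask):
    """Whole-field Canny edges, a crude edge-based trim of the depth-derived
    mask, then a small dilation to preserve edge detail."""
    gray = cv2.cvtColor(scene_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)   # 2D Gaussian smoothing first
    edges = cv2.Canny(gray, 50, 150)           # whole-field edge image
    corrected = mask.copy()
    corrected[edges > 0] = 0                   # cut the mask along strong edges
    kernel = np.ones((3, 3), np.uint8)
    corrected = cv2.morphologyEx(corrected, cv2.MORPH_OPEN, kernel)
    return cv2.dilate(corrected, kernel, iterations=1)   # second correction
```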
Once the processor has the person-region image, it can merge it with the target dynamic background to obtain the merged image. As one possible implementation, the target dynamic background may be selected at random by the processor or chosen by the current user. The merged image obtained after fusion can be shown on the display screen of the electronic device or printed by a printer connected to it.
In some application scenarios, for example when the current user is on a video call with someone and wishes to hide the current background, the image processing method of this embodiment can be used to fuse the person-region image of the current user with the background image of the corresponding frame of the target dynamic background and then show the merged result to the other party. Since the current user is in a live video call, the visible-light camera must capture the user's scene images in real time, the depth image acquisition assembly must collect the corresponding depth images in real time, and the processor must process the collected scene images and depth images promptly, so that the other party sees a smooth video composed of the successive merged frames.
This embodiment adds edge-based correction while keeping the benefits of the foregoing ones: the depth-based extraction, undisturbed by illumination or the color distribution of the scene, calibrates the person-region boundary precisely, and the frame-rate-matched target dynamic background merges well with the corrected person-region images.
The target dynamic backgrounds are pre-stored on a cloud server or in the phone's memory, so their background frame rates do not necessarily match the video frame rate of the first scene video. When the video frame rate of the first scene video and the background frame rate of the target dynamic background do not match, the first scene video and/or the target dynamic background must be processed so that their frame rates match afterwards. The embodiments of the present invention therefore propose further possible implementations of the image processing method.
Fig. 9A is a schematic flowchart of the image processing method provided by Embodiment 5 of the present invention, in which frame extraction brings the first scene video and the target dynamic background to matching frame rates. As shown in Fig. 9A, the method comprises:
Step S901: capture the first scene video of the current user.
Step S902: acquire multiple depth images of the current user.
Step S903: process each frame of scene image of the first scene video according to the multiple depth images, to obtain the person region of the current user in each frame of scene image and the corresponding person-region image.
Steps S901 to S903 correspond to steps S301 to S303 of the embodiment of Fig. 3 and are not repeated here.
Step S904: judge whether the video frame rate used by the first scene video fails to match the background frame rates of the multiple dynamic backgrounds; if so, perform step S906, otherwise perform step S905.
Specifically, the first scene video the user shoots does not necessarily stand in a matching relation to the frame rates of the multiple pre-stored dynamic backgrounds. If the video frame rate used by the first scene video matches none of the background frame rates, a first dynamic background is chosen from the multiple dynamic backgrounds; if a dynamic background matching the frame rate of the first scene video does exist among them, that target dynamic background is chosen from the multiple dynamic backgrounds.
Step S905: choose the matching target dynamic background from the multiple dynamic backgrounds.
Specifically, a dynamic background is chosen from the multiple dynamic backgrounds whose background frame rate differs from the video frame rate of the first scene video by less than a threshold, the threshold being a small integer. For example, with a threshold of 2 and a video frame rate of 30 fps for the first scene video, a dynamic background matching the first scene video may run at 29, 30 or 31 fps; the 30 fps background, whose frame-rate difference is 0, is chosen preferentially as the matching dynamic background.
Step S906: choose a first dynamic background from the multiple dynamic backgrounds.
Specifically, if no dynamic background matching the first scene video exists among the dynamic backgrounds, a first dynamic background is chosen from the multiple dynamic backgrounds such that its background frame rate is closest to the video frame rate, or stands in an integer multiple relation to the video frame rate.
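A sketch of this fallback choice, assuming integer frame rates and the same hypothetical (fps, background) pairing as before; how ties between "closest" and "multiple" candidates break is not specified in the text, so multiples are simply ranked first here:

```python
def choose_first_background(video_fps, backgrounds):
    """Pick the background whose frame rate is nearest the video's, preferring
    one in an integer multiple relation with it."""
    def rank(fps):
        is_multiple = fps % video_fps == 0 or video_fps % fps == 0
        return (0 if is_multiple else 1, abs(fps - video_fps))
    return min(backgrounds, key=lambda item: rank(item[0]))[1]
```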
Step S907: judge whether the video frame rate of the first scene video is greater than the background frame rate of the first dynamic background; if so, perform step S909, otherwise perform step S908.
Step S908: perform frame extraction on the first dynamic background to obtain a second scene video and a second dynamic background whose frame rates match each other.
Specifically, if the video frame rate of the first scene video is less than the background frame rate of the first dynamic background, frames are extracted from the first dynamic background so that the video frame rate of the first scene video matches the background frame rate of the second dynamic background obtained by extraction, the first scene video serving as the second scene video.
As to the method of frame extraction: for example, compute the frame-rate difference between the first dynamic background and the first scene video, say 7 frames; then, for the first dynamic background, randomly pick and delete 7 frames from each second's background images to obtain the second dynamic background, so that the video frame rate of the second scene video fully matches the background frame rate of the second dynamic background.
Step S909: perform frame extraction on the first scene video to obtain a second scene video and a second dynamic background whose frame rates match each other.
Specifically, if the video frame rate of the first scene video is greater than the background frame rate of the first dynamic background, frames are extracted from the first scene video so that the video frame rate of the second scene video obtained by extraction matches the background frame rate of the first dynamic background, the first dynamic background serving as the second dynamic background.
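Both branches reduce to thinning the faster stream; a sketch, assuming integer frame rates and the random per-second deletion described above:

```python
import random

def drop_frames(frames, src_fps, dst_fps):
    """Frame extraction: delete src_fps - dst_fps randomly chosen frames out
    of every second of the faster stream so the two rates match."""
    excess = src_fps - dst_fps
    out = []
    for s in range(0, len(frames), src_fps):        # one second at a time
        second = frames[s:s + src_fps]
        keep = sorted(random.sample(range(len(second)),
                                    max(len(second) - excess, 0)))
        out.extend(second[i] for i in keep)
    return out

# Step S908 thins the background; step S909 thins the scene video, e.g.:
# background2 = drop_frames(background1, 37, 30)    # the 7-frame example above
```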
Step S910: merge each person-region image with the background image of the corresponding frame of the resulting dynamic background to obtain a merged image.
Specifically, the resulting second scene video and second dynamic background have matching frame rates, so each person-region image of the second scene video corresponds to a frame of the second dynamic background and can be merged with the corresponding frame of the selected background; since the edges of every acquired person-region image are comparatively clear and accurate, the merged images obtained after fusion look good.
This embodiment keeps the accuracy benefits of depth-based person extraction while using frame extraction to bring the frame rates of the first scene video and the target dynamic background into match, so the precisely extracted person-region images fuse well with the pre-stored target dynamic background.
Fig. 9B is a schematic flowchart of the image processing method provided by Embodiment 6 of the present invention, in which supplementary frames inserted between adjacent frames bring the first scene video and the target dynamic background to matching frame rates. As shown in Fig. 9B, the method comprises:
Step S911: capture the first scene video of the current user.
Step S912: acquire multiple depth images of the current user.
Step S913: process each frame of scene image of the first scene video according to the multiple depth images, to obtain the person region of the current user in each frame of scene image and the corresponding person-region image.
Steps S911 to S913 correspond to steps S301 to S303 of the embodiment of Fig. 3 and are not repeated here.
Step S914: judge whether the video frame rate used by the first scene video matches the background frame rate of any of the multiple dynamic backgrounds; if so, perform step S915, otherwise perform step S916.
Step S915: choose the matching target dynamic background from the multiple dynamic backgrounds.
Step S916: choose a first dynamic background from the multiple dynamic backgrounds.
Steps S914 to S916 correspond to steps S904 to S906 of the previous embodiment and are not repeated here.
Step S917: judge whether the video frame rate of the first scene video is greater than the background frame rate of the first dynamic background; if so, perform step S919, otherwise perform step S918.
Step S918: insert supplementary frames between adjacent frames of the first scene video to obtain a second scene video and a second dynamic background whose frame rates match each other.
Specifically, if the video frame rate of the first scene video is less than the background frame rate of the first dynamic background, supplementary frames are inserted between adjacent frames of the first scene video so that the video frame rate of the second scene video obtained by insertion matches the background frame rate of the first dynamic background, the first dynamic background serving as the second dynamic background.
The value of each pixel of a supplementary frame inserted into the first scene video is determined from the values of the corresponding pixels of the adjacent frames; that is, an inserted supplementary frame is an identical copy of the adjacent scene picture.
Step S919: insert supplementary frames between adjacent frames of the first dynamic background to obtain a second scene video and a second dynamic background whose frame rates match each other.
Specifically, if the video frame rate of the first scene video is greater than the background frame rate of the first dynamic background, supplementary frames are inserted between adjacent frames of the first dynamic background so that the video frame rate of the first scene video matches the background frame rate of the second dynamic background obtained by insertion, the first scene video serving as the second scene video.
The value of each pixel of a supplementary frame inserted into the first dynamic background is determined from the values of the corresponding pixels of the adjacent frames; that is, the supplementary frame is identical to the adjacent background frame.
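Both branches reduce to padding the slower stream with copies of adjacent frames; a sketch, again assuming integer rates, with insertion positions spread evenly since the text only requires supplements between adjacent frames:

```python
def insert_frames(frames, src_fps, dst_fps):
    """Frame insertion: add dst_fps - src_fps supplementary frames per second,
    each an identical copy of its neighbour."""
    deficit = dst_fps - src_fps
    out = []
    for s in range(0, len(frames), src_fps):        # one second at a time
        second = list(frames[s:s + src_fps])
        step = max(len(second) // max(deficit, 1), 1)
        inserted, i = 0, step - 1
        while inserted < deficit and i < len(second):
            second.insert(i, second[i])             # duplicate the adjacent frame
            inserted += 1
            i += step + 1
        out.extend(second)
    return out

# Step S918 pads the scene video; step S919 pads the background, e.g.:
# video2 = insert_frames(video1, 30, 37)
```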
Step S920, everyone object area image is corresponded to obtained dynamic background frame background image merge with Obtain merging image.
Specifically, the second scene video and the frame per second of the second dynamic background obtained matches, personage in the second scene video Each two field picture has corresponding relation in area image and the second dynamic background, and personage's area image can be with the target dynamic of selection The corresponding two field picture of background is merged, because everyone its edge of object area image acquired is all than more visible standard True, the merging image effect obtained after fusion is preferable.
In the image processing method of the embodiments of the present invention, a first scene video of the current user is captured and a plurality of depth images of the current user are acquired; each frame of scene image of the first scene video is processed according to the depth images to obtain the person region image in each frame of scene image; a matching target dynamic background is selected from a plurality of dynamic backgrounds according to the video frame rate of the first scene video; and each person region image is merged with the background image of the corresponding frame of the target dynamic background to obtain a merged image. Existing methods for segmenting a person from the background mainly rely on the similarity and discontinuity of adjacent pixels in terms of pixel values, and such segmentation is easily affected by environmental factors such as ambient lighting. The image processing method of the embodiments of the present invention instead extracts the person region in each frame of scene image of the scene video by acquiring depth images of the current user. Since the acquisition of a depth image is insensitive to factors such as illumination and the color distribution of the scene, the person region extracted from the depth image is more accurate; in particular, the boundary of the person region can be calibrated accurately. Further, by inserting supplement frames between adjacent frames, the method matches the frame rates of the scene video and the target dynamic background, so that the accurately extracted person region images can be better fused with the predetermined target dynamic background.
To implement the above embodiments, the present invention further provides an electronic device. Figure 10 is a module diagram of an electronic device provided by an embodiment of the present invention.
Referring to Figure 3 and Figure 10, the electronic device 1000 includes an image processing apparatus 100. The image processing apparatus 100 can be implemented in hardware and/or software. The image processing apparatus 100 includes an imaging device 10 and a processor 20.
The imaging device 10 includes a visible-light camera 11 and a depth image acquisition assembly 12.
Specifically, the visible-light camera 11 includes an image sensor 111 and a lens 112; the number of lenses 112 can be one or more. The visible-light camera 11 is used to capture the color information of the current user to obtain the scene video images, wherein the image sensor 111 includes a color filter array (such as a Bayer filter array). In the process of obtaining the scene video images, each imaging pixel in the image sensor 111 senses the light intensity and wavelength information of the photographed scene to generate a set of raw image data; the image sensor 111 sends this raw image data to the processor 20, and the processor 20 obtains a color scene image after performing operations such as denoising and interpolation on the raw image data. The processor 20 can process each image pixel of the raw image data one by one in various formats; for example, each image pixel may have a bit depth of 8, 10, 12 or 14 bits, and the processor 20 can process each image pixel with the same or different bit depths.
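As an illustration of this raw-to-color conversion, the following sketch uses OpenCV; the specific Bayer pattern and the denoising routine are assumptions made for the example, not the pipeline specified by this disclosure:

    import cv2

    def raw_bayer_to_color(raw_frame):
        # raw_frame: single-channel 8-bit Bayer mosaic from the sensor.
        # Interpolate the color filter array (demosaicing) to recover a
        # full-color image, then apply a simple denoising pass,
        # mirroring the denoising/interpolation operations performed by
        # the processor 20.
        bgr = cv2.cvtColor(raw_frame, cv2.COLOR_BayerBG2BGR)
        return cv2.fastNlMeansDenoisingColored(bgr)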
The depth image acquisition assembly 12 includes a structured light projector 121 and a structured light camera 122, and can be used to capture the depth information of the current user to obtain the depth images. The structured light projector 121 projects structured light onto the current user, where the structured light pattern can be laser stripes, Gray codes, sinusoidal fringes, a randomly arranged speckle pattern, or the like. The structured light camera 122 includes an image sensor 1221 and a lens 1222; the number of lenses 1222 can be one or more. The image sensor 1221 captures the structured light image projected by the structured light projector 121 onto the current user. The structured light image can be sent by the depth image acquisition assembly 12 to the processor 20 for processing such as demodulation, phase recovery, and phase information calculation, so as to obtain the depth information of the current user.
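A minimal sketch of one common demodulation scheme, four-step phase shifting followed by a linear phase-to-depth mapping, is shown below. The four-step scheme and the `scale` constant are assumptions of the example; the disclosure does not fix a particular demodulation algorithm beyond naming sinusoidal fringes as one admissible pattern:

    import numpy as np

    def phase_to_depth(fringe_images, reference_phase, scale):
        # Four-step phase-shifting demodulation: the four captures are
        # the same sinusoidal fringe pattern shifted by 90 degrees each.
        i1, i2, i3, i4 = [img.astype(np.float32) for img in fringe_images]
        wrapped_phase = np.arctan2(i4 - i2, i1 - i3)
        # Map the deviation from a reference plane's phase to depth. The
        # scale factor lumps together the triangulation constants
        # (baseline, fringe period, focal length) and is assumed to be
        # known from calibration.
        return scale * (wrapped_phase - reference_phase)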
In some embodiments, the functions of the visible-light camera 11 and the structured light camera 122 can be implemented by a single camera; in other words, the imaging device 10 includes only one camera and one structured light projector 121, and this camera can capture both scene images and structured light images.
In addition to structured light, the depth images of the current user can also be obtained by depth acquisition methods such as binocular vision or time of flight (Time of Flight, TOF).
The processor 20 is further used to fuse the person region images, extracted from the scene images and the depth images, with the corresponding frame images of the matching target dynamic background selected from the plurality of dynamic backgrounds. When extracting a person region image, the processor 20 can combine the depth information in the depth image to extract a two-dimensional person region image from the scene image, or it can build a three-dimensional representation of the person region from the depth information in the depth image and fill the three-dimensional person region with color according to the color information in the scene image, so as to obtain a three-dimensional, colored person region image. Accordingly, when fusing each person region image with the corresponding frame image of the target dynamic background, either the two-dimensional person region image or the three-dimensional colored person region image can be merged with the two-dimensional background image in the target dynamic background to obtain the merged image.
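For the two-dimensional case, the depth-guided extraction (also described in claim 4 below) can be sketched as follows. The face detector, the use of the median face depth, and the `margin` value are assumptions of the example, and the connectivity check with the face region that claim 4 additionally requires is omitted for brevity:

    import numpy as np

    def extract_person_mask(depth_image, face_box, margin=0.3):
        # Determine the person's depth range from the detected face
        # region, then keep every pixel whose depth falls within it.
        y0, y1, x0, x1 = face_box  # face region from an external detector
        face_depth = np.median(depth_image[y0:y1, x0:x1])
        # margin (in the depth image's units) widens the face depth into
        # a range covering the whole body; 0.3 is an assumed value.
        return (np.abs(depth_image - face_depth) < margin).astype(np.float32)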
In addition, the image processing apparatus 100 includes an image memory 30. The image memory 30 can be embedded in the electronic device 1000 or be a memory independent of the electronic device 1000, and it may include a direct memory access (DMA) feature. The raw image data collected by the visible-light camera 11 or the structured-light image data collected by the depth image acquisition assembly 12 can be transferred to the image memory 30 for storage or caching. The processor 20 can read the raw image data from the image memory 30 and process it to obtain the scene images, and can also read the structured-light image data from the image memory 30 and process it to obtain the depth images. In addition, the scene images and depth images can also be stored in the image memory 30 for the processor 20 to call at any time; for example, the processor 20 calls a scene image and a depth image to perform person region extraction, and fuses the extracted person region image with the target dynamic background image to obtain a merged image. The target dynamic background image and the merged image may also be stored in the image memory 30.
The image processing apparatus 100 may also include a display 50. The display 50 can obtain the merged image directly from the processor 20 or from the image memory 30. The display 50 shows the merged image for the user to watch, or passes it on for further processing by a graphics engine or a graphics processing unit (GPU). The image processing apparatus 100 also includes an encoder/decoder 60, which can encode and decode image data such as the scene images, depth images and merged images; the encoded image data can be stored in the image memory 30 and decompressed by the decoder before being shown on the display 50. The encoder/decoder 60 can be implemented by a central processing unit (CPU), a GPU or a coprocessor; in other words, the encoder/decoder 60 can be any one or more of a CPU, a GPU and a coprocessor.
The image processing apparatus 100 also includes a control logic 40. When the imaging device 10 is imaging, the processor 20 analyzes the data obtained by the imaging device to determine image statistics for one or more control parameters of the imaging device 10 (for example, the exposure time). The processor 20 sends the image statistics to the control logic 40, and the control logic 40 controls the imaging device 10 to image with the determined control parameters. The control logic 40 may include a processor and/or a microcontroller that executes one or more routines (such as firmware), and the one or more routines can determine the control parameters of the imaging device 10 according to the received image statistics.
To implement the above embodiments, Figure 11 is a module diagram of another electronic device provided by an embodiment of the present invention. As shown in Figure 11, the electronic device 1000 of an embodiment of the present invention includes one or more processors 200, a memory 300 and one or more programs 310. The one or more programs 310 are stored in the memory 300 and are configured to be executed by the one or more processors 200. The programs 310 include instructions for executing the image processing method of any of the above embodiments.
For example, the program 310 includes instructions for executing the image processing method described in the following steps:
Step S301: capture a first scene video of the current user.
Step S302: acquire a plurality of depth images of the current user.
Step S303: process each frame of scene image of the first scene video according to the plurality of depth images, so as to obtain the person region of the current user in each frame of scene image and the corresponding person region image.
Step S304: select a matching target dynamic background from a plurality of dynamic backgrounds according to the video frame rate of the first scene video.
Step S305: merge each person region image with the background image of the corresponding frame of the target dynamic background to obtain a merged image.
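Putting the pieces together, steps S301 to S305 can be sketched end to end by reusing the illustrative helpers from the earlier sketches; `detect_face` stands in for any face detector and, like the other helper names, is an assumption of the example rather than part of this disclosure:

    def image_processing_method(scene_frames, video_fps, depth_images, backgrounds):
        # scene_frames and depth_images are assumed to be captured
        # already (steps S301 and S302), one depth image per scene frame.
        target, _ = choose_dynamic_background(video_fps, backgrounds)     # S304
        merged = []
        for frame, depth, bg in zip(scene_frames, depth_images, target.frames):
            mask = extract_person_mask(depth, detect_face(frame))         # S303
            merged.append(merge_person_with_background(frame, mask, bg))  # S305
        return merged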
For another example program 310 also includes being used for the instruction for performing the image processing method described in following steps:
Step S3023, phase information corresponding to each pixel in demodulation structure light image;
Step S3024, phase information is converted into depth information, and generated according to depth information in multiple depth images One depth image.
The computer-readable recording medium of embodiment of the present invention includes being combined with the electronic installation 1000 that can be imaged making Computer program.Computer program can be performed by processor 200 to complete at the image of above-mentioned any one embodiment Reason method.
For example, the computer program can be executed by the processor 200 to complete the image processing method described in the following steps:
Step S301: capture a first scene video of the current user.
Step S302: acquire a plurality of depth images of the current user.
Step S303: process each frame of scene image of the first scene video according to the plurality of depth images, so as to obtain the person region of the current user in each frame of scene image and the corresponding person region image.
Step S304: select a matching target dynamic background from a plurality of dynamic backgrounds according to the video frame rate of the first scene video.
Step S305: merge each person region image with the background image of the corresponding frame of the target dynamic background to obtain a merged image.
As another example, the computer program can also be executed by the processor 200 to complete the image processing method described in the following steps:
Step S3023: demodulate the phase information corresponding to each pixel in the structured light image;
Step S3024: convert the phase information into depth information, and generate one of the plurality of depth images according to the depth information.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, where there is no mutual contradiction, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those different embodiments or examples, with one another.
In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of the technical features concerned. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means at least two, such as two or three, unless otherwise specifically defined.
Any process or method description in a flow chart, or otherwise described herein, can be understood as representing a module, fragment or portion of code that includes one or more executable instructions for implementing the steps of a specific logical function or process, and the scope of the preferred embodiments of the present invention includes other implementations, in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order according to the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flow chart or otherwise described herein can, for example, be considered an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" can be any means that can contain, store, communicate, propagate or transmit a program for use by, or in combination with, an instruction execution system, apparatus or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection portion (electronic device) with one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium could even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or otherwise suitably processing it if necessary, and then stored in a computer memory.
It should be understood that each part of the present invention can be realized in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods can be realized with software or firmware that is stored in a memory and executed by a suitable instruction execution system. If realized in hardware, as in another embodiment, any one of the following techniques known in the art, or a combination of them, can be used: a discrete logic circuit having logic gate circuits for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gate circuits, a programmable gate array (PGA), a field programmable gate array (FPGA), and so on.
Those of ordinary skill in the art can understand that all or part of the steps carried by the above embodiment methods can be completed by instructing the relevant hardware through a program; the program can be stored in a computer-readable storage medium and, when executed, includes one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present invention can be integrated in one processing module, or each unit can exist alone physically, or two or more units can be integrated in one module. The above integrated module can be realized in the form of hardware or in the form of a software functional module. The integrated module, if realized in the form of a software functional module and sold or used as an independent product, can also be stored in a computer-readable storage medium.
The storage medium mentioned above can be a read-only memory, a magnetic disk, an optical disc, or the like. Although the embodiments of the present invention have been shown and described above, it can be understood that the above embodiments are exemplary and cannot be construed as limitations on the present invention; those of ordinary skill in the art can make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.

Claims (13)

1. An image processing method for an electronic device, characterized in that the image processing method comprises:
capturing a first scene video of a current user;
acquiring a plurality of depth images of the current user;
processing each frame of scene image of the first scene video according to the plurality of depth images to obtain a person region of the current user in each frame of scene image and a corresponding person region image;
selecting a matching target dynamic background from a plurality of dynamic backgrounds according to a video frame rate of the first scene video; and
merging each person region image with a background image of a corresponding frame of the target dynamic background to obtain a merged image.
2. The image processing method according to claim 1, characterized in that the step of acquiring the plurality of depth images of the current user comprises:
projecting structured light onto the current user;
capturing a structured light image modulated by the current user; and
demodulating phase information corresponding to each pixel of the structured light image to obtain one of the plurality of depth images.
3. The image processing method according to claim 2, characterized in that the step of demodulating the phase information corresponding to each pixel of the structured light image to obtain one of the plurality of depth images comprises:
demodulating the phase information corresponding to each pixel in the structured light image;
converting the phase information into depth information; and
generating the depth image according to the depth information.
4. The image processing method according to claim 1, characterized in that the step of processing each frame of scene image of the first scene video according to the plurality of depth images to obtain the person region of the current user in each frame of scene image and the corresponding person region image comprises:
for each frame of scene image, identifying a face region in the scene image;
obtaining depth information corresponding to the face region from the depth image corresponding to the scene image;
determining a depth range of the person region according to the depth information of the face region; and
determining, according to the depth range of the person region, the person region that is connected with the face region and falls within the depth range, so as to obtain the person region image.
5. The image processing method according to claim 4, characterized in that the image processing method further comprises:
processing the scene image to obtain a full-field edge image of the scene image; and
correcting the person region image according to the full-field edge image.
6. The image processing method according to claim 1, characterized in that the step of selecting the matching target dynamic background from the plurality of dynamic backgrounds according to the video frame rate of the first scene video comprises:
selecting the target dynamic background from the plurality of dynamic backgrounds according to the background frame rates of the plurality of dynamic backgrounds, wherein the difference between the background frame rate of the target dynamic background and the video frame rate of the first scene video is less than a threshold.
7. The image processing method according to claim 1, characterized in that the method further comprises:
if the video frame rate of the first scene video does not match the background frame rates of the plurality of dynamic backgrounds, selecting a first dynamic background from the plurality of dynamic backgrounds, wherein the background frame rate of the first dynamic background is closest to the video frame rate, or the background frame rate of the first dynamic background has a multiple relationship with the video frame rate; and
processing the first scene video and/or the first dynamic background to obtain a second scene video and a second dynamic background that are matched with each other, wherein the video frame rate of the second scene video matches the background frame rate of the second dynamic background.
8. The image processing method according to claim 7, characterized in that the step of processing the first scene video and/or the first dynamic background comprises:
if the video frame rate of the first scene video is greater than the background frame rate of the first dynamic background, performing frame extraction on the first scene video, so that the video frame rate of the resulting second scene video matches the background frame rate of the first dynamic background, and using the first dynamic background as the second dynamic background; and
if the video frame rate of the first scene video is less than the background frame rate of the first dynamic background, performing frame extraction on the first dynamic background, so that the video frame rate of the first scene video matches the background frame rate of the resulting second dynamic background, and using the first scene video as the second scene video.
9. The image processing method according to claim 7, characterized in that the step of processing the first scene video and/or the first dynamic background comprises:
if the video frame rate of the first scene video is greater than the background frame rate of the first dynamic background, inserting supplement frames between adjacent frames of the first dynamic background, so that the video frame rate of the first scene video matches the background frame rate of the resulting second dynamic background, and using the first scene video as the second scene video; and
if the video frame rate of the first scene video is less than the background frame rate of the first dynamic background, inserting supplement frames between adjacent frames of the first scene video, so that the video frame rate of the resulting second scene video matches the background frame rate of the first dynamic background, and using the first dynamic background as the second dynamic background.
10. The image processing method according to claim 9, characterized in that the value of each pixel in an inserted supplement frame is determined according to the values of the corresponding pixels in the adjacent frames.
11. An image processing apparatus for an electronic device, characterized in that the image processing apparatus comprises:
a visible-light camera, the visible-light camera being used to capture a first scene video of a current user;
a depth image acquisition assembly, the depth image acquisition assembly being used to acquire a plurality of depth images of the current user; and
a processor, the processor being used to:
process each frame of scene image of the first scene video according to the plurality of depth images to obtain a person region of the current user in each frame of scene image and a corresponding person region image;
select a matching target dynamic background from a plurality of dynamic backgrounds according to a video frame rate of the first scene video; and
merge each person region image with a background image of a corresponding frame of the target dynamic background to obtain a merged image.
12. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for executing the image processing method according to any one of claims 1-10.
13. A computer-readable storage medium, characterized in that it comprises a computer program used in combination with an electronic device capable of imaging, the computer program being executable by a processor to complete the image processing method according to any one of claims 1-10.
CN201710811811.3A 2017-09-11 2017-09-11 Image processing method and device, electronic installation and computer-readable recording medium Pending CN107509045A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710811811.3A CN107509045A (en) 2017-09-11 2017-09-11 Image processing method and device, electronic installation and computer-readable recording medium
PCT/CN2018/105120 WO2019047984A1 (en) 2017-09-11 2018-09-11 Method and device for image processing, electronic device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN107509045A true CN107509045A (en) 2017-12-22

Family

ID=60696294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710811811.3A Pending CN107509045A (en) 2017-09-11 2017-09-11 Image processing method and device, electronic installation and computer-readable recording medium

Country Status (1)

Country Link
CN (1) CN107509045A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105872448A (en) * 2016-05-31 2016-08-17 宇龙计算机通信科技(深圳)有限公司 Display method and device of video images in video calls
CN206322194U (en) * 2016-10-24 2017-07-11 杭州非白三维科技有限公司 A kind of anti-fraud face identification system based on 3-D scanning
CN107025635A (en) * 2017-03-09 2017-08-08 广东欧珀移动通信有限公司 Processing method, processing unit and the electronic installation of image saturation based on the depth of field
CN106997603A (en) * 2017-05-19 2017-08-01 深圳奥比中光科技有限公司 Depth camera based on VCSEL array light source

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019047984A1 (en) * 2017-09-11 2019-03-14 Oppo广东移动通信有限公司 Method and device for image processing, electronic device, and computer-readable storage medium
CN113129312A (en) * 2018-10-15 2021-07-16 华为技术有限公司 Image processing method, device and equipment
CN113129312B (en) * 2018-10-15 2022-10-28 华为技术有限公司 Image processing method, device and equipment
CN109218588A (en) * 2018-10-31 2019-01-15 Oppo广东移动通信有限公司 Image acquiring method, image acquiring device, structure optical assembly and electronic device
CN111292276A (en) * 2018-12-07 2020-06-16 北京字节跳动网络技术有限公司 Image processing method and device
CN111144498B (en) * 2019-12-26 2023-09-01 深圳集智数字科技有限公司 Image recognition method and device
CN111144498A (en) * 2019-12-26 2020-05-12 深圳集智数字科技有限公司 Image identification method and device
CN111369469A (en) * 2020-03-10 2020-07-03 北京爱笔科技有限公司 Image processing method and device and electronic equipment
CN111369469B (en) * 2020-03-10 2024-01-12 北京爱笔科技有限公司 Image processing method and device and electronic equipment
CN111766606A (en) * 2020-06-19 2020-10-13 Oppo广东移动通信有限公司 Image processing method, device and equipment of TOF depth image and storage medium
CN112702615A (en) * 2020-11-27 2021-04-23 深圳市创成微电子有限公司 Network live broadcast audio and video processing method and system
CN112702615B (en) * 2020-11-27 2023-08-08 深圳市创成微电子有限公司 Network direct broadcast audio and video processing method and system
CN116308530A (en) * 2023-05-16 2023-06-23 飞狐信息技术(天津)有限公司 Advertisement implantation method, advertisement implantation device, advertisement implantation equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN107509045A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107610077A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107707831A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107734267A (en) Image processing method and device
CN107707835A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107807806A (en) Display parameters method of adjustment, device and electronic installation
CN107644440A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107707838A (en) Image processing method and device
CN107610080A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107705278A (en) The adding method and terminal device of dynamic effect
CN107610078A (en) Image processing method and device
CN107734264A (en) Image processing method and device
CN107590793A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107610076A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107527335A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107613228A (en) The adding method and terminal device of virtual dress ornament
CN107454336A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107705243A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107613223A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107705277A (en) Image processing method and device
CN107592491A (en) Video communication background display methods and device
CN107707833A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107705276A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107529020A (en) Image processing method and device, electronic installation and computer-readable recording medium
CN107680034A (en) Image processing method and device, electronic installation and computer-readable recording medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171222