CN116781857B - Video conference background processing system and method - Google Patents


Info

Publication number
CN116781857B
CN116781857B (application CN202311076156.3A)
Authority
CN
China
Prior art keywords
conference
acquisition
pixel point
background
time
Prior art date
Legal status
Active
Application number
CN202311076156.3A
Other languages
Chinese (zh)
Other versions
CN116781857A (en)
Inventor
宋淑萍
李洁
王帅
常建龙
齐明龙
秦伟
王超英
王月梅
宋亚静
潘翔
王更生
Current Assignee
Shijiazhuang Changchuan Electric Technology Co ltd
Original Assignee
Shijiazhuang Changchuan Electric Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shijiazhuang Changchuan Electric Technology Co ltd
Priority to CN202311076156.3A
Publication of CN116781857A
Application granted
Publication of CN116781857B
Legal status: Active

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 - Television systems
    • H04N 7/14 - Systems for two-way working
    • H04N 7/15 - Conference systems
    • H04N 5/00 - Details of television systems
    • H04N 5/222 - Studio circuitry; studio devices; studio equipment
    • H04N 5/262 - Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; cameras specially adapted for the electronic generation of special effects
    • H04N 5/272 - Means for inserting a foreground image in a background image, i.e. inlay, outlay

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention relates to a video conference background processing system and method, belonging to the technical field of video conferences. The system comprises: an intelligent identification device, which uses an AI identification model to intelligently identify, based on identification information from past acquired pictures, whether certain pixel points in the conference acquisition picture corresponding to the conference moment following the current conference moment are background edge pixel points; and an advance configuration device, connected with the intelligent identification device, which performs advance assignment of those pixel points in the conference acquisition picture corresponding to the following conference moment based on the intelligent identification result. By using an artificial intelligence model to determine in advance whether certain pixel points of the conference picture at a future moment are background edge pixel points and assigning them ahead of time, the invention reduces the number of pixel points that must be processed when the future moment arrives and improves the display of the background edge portion of the conference picture.

Description

Video conference background processing system and method
Technical Field
The invention relates to the technical field of video conferences, in particular to a video conference background processing system and method.
Background
Video conferencing refers to a meeting in which people at two or more sites talk face to face through communication devices and a network. Depending on the number of sites involved, video conferences can be divided into point-to-point conferences and multipoint conferences. Individuals in daily life have no particular requirements on conversation confidentiality, conference quality, or conference scale, and can simply use video chat software. Business video conferences held by government authorities, enterprises, and institutions, however, require a stable and secure network, reliable conference quality, a formal conference environment, and similar conditions, and a dedicated background wall, such as a solid-color background wall, is generally set up; such business video conferences are video conferences in the usual sense.
For example, Chinese patent publication CN115086592A proposes a video conference background processing method, which includes: acquiring image data and voice data of a video conference; determining the background of the video conference from the image data; determining feature data from the voice data; determining, from the feature data, the conference background corresponding to that feature data; and replacing the background of the video conference with that conference background, so that the video background can be replaced quickly and accurately, the replaced background better matches the type of video conference, and user experience is improved. That publication also provides a video conference background processing apparatus, a medium, and electronic equipment.
As another example, Chinese patent publication CN103997616A proposes a method, an apparatus, and a conference terminal for processing video conference pictures, where the method includes: acquiring background picture information of a video conference; segmenting the instant image information of the video conference system during the conference; obtaining, from the segmented instant image information, foreground picture information to be centered in the picture; and superimposing the foreground picture information onto the background picture information to generate picture-centered video image information. The instant images of the participants are segmented to obtain the foreground picture, which is superimposed onto the background picture to generate a video image in which the participant is centered, improving the user experience of the video conference system.
However, the above prior art is limited to defining and optimizing the background picture, and does not consider that, during a video conference, the continuous playing of each video frame is a data processing task with a huge data volume and extremely high real-time requirements. If a video frame is processed only after it is collected and then displayed, the efficiency of data replacement is low and the display speed cannot meet the real-time requirement; moreover, delayed data replacement causes flickering and degradation at the background edge, reducing the playing efficiency and the effect of the whole video data stream of the video conference.
Disclosure of Invention
To solve the above technical problems in the prior art, the invention provides a video conference background processing system and method. For each background edge pixel point in the currently acquired conference picture, the invention intelligently judges, based on identification information indicating whether the pixel points at related positions in several past conference pictures belong to background edge pixel points, whether the pixel point at the same position in the conference picture at a future moment belongs to a background edge pixel point. It thereby determines, for each background edge pixel point of the currently acquired picture, whether the corresponding pixel point at the same position at the future moment belongs to a background edge pixel point and, based on that determination, replaces the pixel values of part of the pixel points of the conference picture corresponding to the future moment with the background color in advance, before the future moment arrives. This reduces the number of pixel points to be processed at the future moment, improves the display speed of the video conference picture, and avoids flicker and degradation of the background edge.
According to a first aspect of the present invention, there is provided a video conference background processing system, the system comprising:
the video acquisition device is used for acquiring each frame of conference acquisition picture corresponding to each conference moment in the video conference, wherein the conference moments are equally spaced, the time length between two adjacent conference moments is a multiple of the acquisition interval of the video acquisition device, and a conference host is positioned in front of a conference background wall during the video conference;
the edge detection device is connected with the video acquisition device and is used for acquiring the conference acquisition picture corresponding to the current conference moment and identifying each background edge pixel point in that picture based on the color imaging characteristics of the conference background wall;
the information capturing device is connected with the edge detection device and is used for executing the following processing for each background edge pixel point in the conference acquisition picture corresponding to the current conference moment: acquiring a preset number of conference acquisition pictures preceding the current conference moment as past acquisition pictures, acquiring the coordinate value of the background edge pixel point in the conference acquisition picture corresponding to the current conference moment as a reference coordinate value, and taking the edge identifiers corresponding to the pixel points covered by a pixel point window centered on the reference coordinate value in each frame of past acquisition picture as the judgment reference data corresponding to that frame, so as to output the judgment reference data corresponding to each frame of past acquisition picture;
the intelligent identification device is connected with the information capturing device and is used for intelligently identifying, by adopting an AI identification model, the edge identifier of the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference moment of the current conference moment, based on the judgment reference data corresponding to each frame of past acquisition picture, the number of pixel points covered by the pixel point window centered on the reference coordinate value, and the horizontal resolution and vertical resolution of the picture acquired by the video acquisition device;
the advance configuration device is connected with the intelligent identification device and is used for: when the received edge identifier indicates a background edge pixel point, configuring in advance, at a time point between the current conference moment and the next conference moment of the current conference moment, the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference moment of the current conference moment as a background edge pixel point; when the received edge identifier indicates a non-background edge pixel point, configuring that pixel point in advance, at such a time point, as a non-background edge pixel point; and traversing each background edge pixel point in the conference acquisition picture corresponding to the current conference moment, so as to complete the advance configuration of the pixel points at the reference coordinate values respectively corresponding to each background edge pixel point in the conference acquisition picture corresponding to the next conference moment of the current conference moment;
and the replacement display device is respectively connected with the video acquisition device and the advance configuration device and is used for: when the next conference moment of the current conference moment arrives, acquiring from the video acquisition device the conference acquisition picture corresponding to that moment as the actual acquisition picture, determining the edge identifier corresponding to each pixel point in the actual acquisition picture, judging, based on those edge identifiers, the error rate of the background edge pixel points configured in advance for the conference acquisition picture corresponding to the next conference moment of the current conference moment, and, when the error rate is lower than or equal to a preset ratio limit, displaying that conference acquisition picture based on the advance configuration result of the pixel points at each reference coordinate value.
According to a second aspect of the present invention, there is provided a video conference background processing method, the method comprising:
acquiring, by a video acquisition device, each frame of conference acquisition picture corresponding to each conference moment in the video conference, wherein the conference moments are equally spaced, the time length between two adjacent conference moments is a multiple of the acquisition interval of the video acquisition device, and a conference host is positioned in front of a conference background wall during the video conference;
acquiring a conference acquisition picture corresponding to the current conference moment, and identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall;
the following processing is executed for each background edge pixel point in the conference acquisition picture corresponding to the current conference time: acquiring a preset number of conference acquisition pictures before a current conference time as previous acquisition pictures of each frame, acquiring coordinate values of background edge pixel points in the conference acquisition pictures corresponding to the current conference time as reference coordinate values, and taking edge identifiers corresponding to pixel points covered in a pixel point window taking the reference coordinate values as the center in each previous acquisition picture of each frame as judging reference data corresponding to the previous acquisition picture of the frame so as to output judging reference data corresponding to the previous acquisition pictures of each frame;
intelligently identifying, by adopting an AI identification model, the edge identifier of the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference moment of the current conference moment, based on the judgment reference data corresponding to each frame of past acquisition picture, the number of pixel points covered by the pixel point window centered on the reference coordinate value, and the horizontal resolution and vertical resolution of the picture acquired by the video acquisition device;
when the received edge mark is represented as a background edge pixel point, the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference time of the current conference time is configured as a background edge pixel point in advance at the time point between the current conference time and the next conference time of the current conference time, and when the received edge mark is represented as a non-background edge pixel point, the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference time of the current conference time is configured as a non-background edge pixel point in advance at the time point between the current conference time and the next conference time of the current conference time, and each background edge pixel point in the conference acquisition picture corresponding to the current conference time is traversed to complete the advanced configuration of the pixel point at each reference coordinate value corresponding to each background edge pixel point in the conference acquisition picture corresponding to the next conference time of the current conference time;
when the next conference moment of the current conference moment arrives, acquiring from the video acquisition device the conference acquisition picture corresponding to that moment as the actual acquisition picture, determining the edge identifier corresponding to each pixel point in the actual acquisition picture, judging, based on those edge identifiers, the error rate of the background edge pixel points configured in advance for the conference acquisition picture corresponding to the next conference moment of the current conference moment, and, when the error rate is lower than or equal to a preset ratio limit, displaying that conference acquisition picture based on the advance configuration result of the pixel points at each reference coordinate value.
It can be seen that the present invention has at least the following four key inventive points:
Inventive point A: in a video conference picture display scenario with a high display speed requirement and a large amount of display data, an artificial intelligence model is adopted to intelligently judge whether the pixel point at a given pixel position in the video conference picture at a future moment belongs to the background edge pixel points, based on several pieces of judgment information indicating whether the pixel points at that position in each previously acquired frame of the video conference picture belong to the background edge pixel points;
Inventive point B: intelligent judgment is performed one by one for each background edge pixel point in the video conference picture at the moment before the future moment, and, based on the result, the pixel values of part of the pixel points in the video conference picture at the future moment are configured in advance before the future moment arrives; this reduces the number of pixel points whose pixel values must be configured when the future moment arrives, improves the display speed of the video conference picture, avoids display reaction delay, and reduces the probability of edge flicker when the background wall in the video conference picture is displayed;
Inventive point C: the artificial intelligence model is a deep neural network that has undergone multiple rounds of training, and the number of training rounds is positively correlated with the horizontal resolution of the acquired picture and with its vertical resolution, where the horizontal resolution is the total number of pixel columns of the acquired picture and the vertical resolution is the total number of pixel rows of the acquired picture;
Inventive point D: each constituent pixel point of the imaging region forming the conference background wall in the conference acquisition picture corresponding to the current conference moment is acquired, and the inclination grade of the conference background wall is determined based on the standard deviation of the imaging depth values corresponding to those constituent pixel points, where a larger standard deviation yields a higher inclination grade, so that intelligent identification of the state of the background wall is completed during the video conference.
Drawings
Embodiments of the present invention will be described below with reference to the accompanying drawings, in which:
fig. 1 is a technical flow diagram of a video conference background processing system and method according to the present invention.
Fig. 2 is a block diagram of structural components of the video conference background processing system shown in embodiment 1 according to the present invention.
Fig. 3 is a block diagram of structural components of the video conference background processing system shown in embodiment 2 according to the present invention.
Fig. 4 is a block diagram of structural components of a video conference background processing system shown in embodiment 3 according to the present invention.
Fig. 5 is a block diagram of structural components of a video conference background processing system shown in embodiment 4 according to the present invention.
Fig. 6 is a block diagram of structural components of the video conference background processing system shown in embodiment 5 according to the present invention.
Detailed Description
As shown in fig. 1, a technical flowchart of a video conference background processing system and method according to the present invention is provided.
As shown in fig. 1, the specific technical process of the present invention is as follows:
First: for the video conference picture at the current moment, acquire the positions corresponding to each background edge pixel point;
as shown in fig. 1, two types of pixel points exist in the video conference picture at the current moment, namely background edge pixel points and non-background edge pixel points, where the background edge pixel points belong to the imaging area corresponding to the background wall and the pixel values corresponding to the background edge pixel points are all the same, for example, all solid blue pixel values;
specifically, these positions are the items of positioning data required for each intelligent judgment in the subsequent traversal;
Second: for each position, acquire the edge information of the plurality of positions associated with that position in each previous frame of the video conference picture, thereby screening out reliable basic data for intelligently judging whether the pixel point at that position in the video conference picture at the future moment is a background edge pixel point;
for example, in a previous video conference picture, the edge identifiers that correspond to the pixel points covered by a pixel point window centered on the position, and that indicate whether those pixel points belong to the background edge pixel points, are used as the basic data contributed by that previous frame to the subsequent intelligent judgment for the position;
Third: for each position, the screened basic data corresponding to each frame of the video conference picture, the number of pixel points covered by the pixel point window, and the horizontal and vertical resolution of the acquired picture are fed to an AI identification model to intelligently identify the edge identifier of the pixel point at that position in the conference acquisition picture corresponding to the future moment, thereby completing the intelligent judgment of whether the pixel point at that position at the future moment belongs to the background edge pixel points;
for example, to ensure the validity and stability of the intelligent identification result, the AI identification model adopted is a deep neural network that has undergone multiple rounds of training, and the number of training rounds is positively correlated with the horizontal resolution of the acquired picture and with its vertical resolution, where the horizontal resolution is the total number of pixel columns of the acquired picture and the vertical resolution is the total number of pixel rows;
Fourth: the edge identifiers obtained from the intelligent judgments for each position are used to configure, in advance, the pixel values of the corresponding pixel points in the conference acquisition picture for the future moment, so that the amount of data that must be configured when the future moment arrives is reduced, the real-time performance of the pixel point data is improved, flicker of the background edge during configuration is avoided, and an effective response to high-data-volume, high-frame-rate video conference playing scenarios is achieved;
as shown in fig. 1, this technical process is generally implemented by the advance configuration device and the replacement display device in fig. 1;
specifically, the advance configuration device uses the edge identifiers obtained from the intelligent judgments for each position to configure, in advance, the pixel values of the corresponding pixel points in the conference acquisition picture for the future moment, and the replacement display device executes the replacement display operation based on the advance configuration result when the future moment arrives;
Finally: when the future moment arrives, the result of the pixel value configuration performed in advance for the pixel points at each position can be verified against the actually acquired conference acquisition picture for that moment; when the verified error rate does not exceed the limit, the advance configuration result is accepted and displayed accordingly, and conversely, when the verified error rate exceeds the limit, the actually acquired conference acquisition picture is displayed directly, so that real-time display of the data is improved while the accuracy of the displayed data is preserved.
The key points of the invention are: the traversal analysis, based on positioning data, of whether each background edge pixel point still belongs to the background edge pixel points at the future moment; a replacement display mechanism that balances real-time performance and accuracy; the targeted design of the AI identification model; and the screening of customized basic data.
The video conference background processing system and method of the present invention will be described in detail by way of embodiments.
Embodiment 1
Fig. 2 is a block diagram of structural components of the video conference background processing system shown in embodiment 1 according to the present invention.
As shown in fig. 2, the video conference background processing system comprises:
The video acquisition device is used for acquiring each frame of conference acquisition picture corresponding to each conference moment in the video conference, wherein the conference moments are equally spaced, the time length between two adjacent conference moments is a multiple of the acquisition interval of the video acquisition device, and a conference host is positioned in front of a conference background wall during the video conference;
the video acquisition device may be an ultra-clear camera mechanism, arranged in front of a conference scene with a background wall, which captures ultra-clear frames of the conference scene at a set frame rate; the conference acquisition pictures corresponding to the conference moments may then be selected from the captured ultra-clear frames at uniform intervals (a minimal sketch of this frame selection follows the sensor description below);
specifically, the ultra-clear camera shooting mechanism comprises an image sensor, an optical filter, an ultra-clear imaging lens and a flexible circuit board, wherein the optical filter is arranged between the image sensor and the ultra-clear imaging lens, and the image sensor is arranged on the flexible circuit board;
further, the image sensor may be selected from a CCD sensor or a CMOS sensor;
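For illustration only, the following Python sketch shows one way to pick conference acquisition pictures at equally spaced conference moments whose spacing is a multiple of the camera's acquisition interval; the multiple of 3, the frame labels, and the function name are assumptions made for the example, not values taken from this disclosure.

```python
# Sketch: select every k-th captured frame as a conference acquisition picture,
# so the spacing of conference moments is k times the camera's acquisition
# interval. k = 3 is an assumed value for illustration only.
def select_conference_frames(captured_frames, interval_multiple=3):
    """Return (conference_moment_index, frame) pairs at equal intervals."""
    return [
        (i // interval_multiple, frame)
        for i, frame in enumerate(captured_frames)
        if i % interval_multiple == 0
    ]

# Example: 10 captured frames labelled f0..f9 yield conference frames f0, f3, f6, f9.
frames = [f"f{i}" for i in range(10)]
print(select_conference_frames(frames))
```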
the edge detection device is connected with the video acquisition device and is used for acquiring a conference acquisition picture corresponding to the current conference moment and identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall;
Specifically, by acquiring each background edge pixel point in the conference acquisition picture corresponding to the current conference moment, positioning data is provided for the intelligent judgment of whether the pixel point at the corresponding set position in the conference acquisition picture at the subsequent future conference moment belongs to the background edge pixel points, where the positioning data is the positioning coordinate of a given background edge pixel point;
the information capturing device is connected with the edge detection device and is used for executing the following processing for each background edge pixel point in the conference acquisition picture corresponding to the current conference moment: acquiring a preset number of conference acquisition pictures before a current conference time as previous acquisition pictures of each frame, acquiring coordinate values of background edge pixel points in the conference acquisition pictures corresponding to the current conference time as reference coordinate values, and taking edge identifiers corresponding to pixel points covered in a pixel point window taking the reference coordinate values as the center in each previous acquisition picture of each frame as judging reference data corresponding to the previous acquisition picture of the frame so as to output judging reference data corresponding to the previous acquisition pictures of each frame;
for example, the above operation screens out the judgment reference data corresponding to each frame of past acquisition picture, which serves as the basic data for the subsequent intelligent identification;
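As a hedged sketch of how such judgment reference data might be assembled in practice, the Python code below collects, for one reference coordinate, the edge identifiers covered by a window in each past acquisition picture; the NumPy edge maps, the default 3x3 window, and the function name are illustrative assumptions rather than part of the disclosure.

```python
import numpy as np

def gather_reference_data(past_edge_maps, ref_coord, radius=1):
    """For each past acquisition picture, collect the edge identifiers of the
    pixels covered by a window centered on the reference coordinate.

    past_edge_maps: list of 2-D arrays, one per past frame, where each entry
                    is 0b01 for a background edge pixel and 0b00 otherwise.
    ref_coord:      (row, col) of the background edge pixel in the current frame.
    radius:         half-width of the pixel window (radius=1 gives a 3x3 window).
    """
    row, col = ref_coord
    reference_data = []
    for edge_map in past_edge_maps:
        h, w = edge_map.shape
        r0, r1 = max(0, row - radius), min(h, row + radius + 1)
        c0, c1 = max(0, col - radius), min(w, col + radius + 1)
        # Flattened edge identifiers of the covered pixels for this past frame.
        reference_data.append(edge_map[r0:r1, c0:c1].ravel().tolist())
    return reference_data
```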
The intelligent identification device is connected with the information capture device and is used for intelligently identifying the edge identification of the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference moment of the current conference moment by adopting an AI identification model based on each judgment reference data corresponding to each frame of past acquisition picture, the number of the pixel points covered by the pixel point window taking the reference coordinate value as the center and the horizontal resolution and the vertical resolution of the video acquisition picture;
for example, a MATLAB toolbox may be used to simulate the data processing in which the AI identification model identifies the edge identifier of the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference moment of the current conference moment, based on the judgment reference data corresponding to each frame of past acquisition picture, the number of pixel points covered by the pixel point window centered on the reference coordinate value, and the horizontal and vertical resolution of the picture acquired by the video acquisition device;
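The disclosure does not specify the internal structure of the AI identification model; purely as an assumed illustration, the Python sketch below shows how the stated inputs (the judgment reference data, the pixel count of the window, and the horizontal and vertical resolution) could be concatenated into one feature vector and passed to an already trained classifier with a scikit-learn-style predict interface. The function names and the interface are assumptions.

```python
import numpy as np

def build_feature_vector(reference_data, window_pixel_count, h_resolution, v_resolution):
    """Concatenate the inputs named in the text into one feature vector:
    the per-frame judgment reference data, the number of pixels covered by
    the window, and the horizontal/vertical resolution of the captured picture."""
    flat_flags = [flag for frame_flags in reference_data for flag in frame_flags]
    return np.array(flat_flags + [window_pixel_count, h_resolution, v_resolution],
                    dtype=float)

def predict_edge_identifier(model, feature_vector):
    """Return 0b01 (background edge pixel) or 0b00 (non-background edge pixel),
    assuming a fitted classifier whose predict() returns 1 for edge pixels."""
    return 0b01 if model.predict(feature_vector.reshape(1, -1))[0] == 1 else 0b00
```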
the advanced configuration device is connected with the intelligent identification device and is used for configuring the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference moment of the current conference moment into the background edge pixel point in advance at the time point between the current conference moment and the next conference moment of the current conference moment when the received edge mark is represented as the background edge pixel point, and configuring the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference moment of the current conference moment into the non-background edge pixel point in advance at the time point between the current conference moment and the next conference moment of the current conference moment when the received edge mark is represented as the non-background edge pixel point, traversing each background edge pixel point in the conference acquisition picture corresponding to the current conference moment to finish the advanced configuration of each reference coordinate value respectively corresponding to each background pixel point in the conference acquisition picture corresponding to the next conference moment of the current conference moment;
Specifically, the above traversal operation completes, through the respective intelligent judgments based on the positioning data corresponding to each background edge pixel point, the advance judgment of whether each set position in the conference acquisition picture corresponding to the future conference moment belongs to the background edge pixel points, and provides the basis for the subsequent advance configuration of the pixel values of those pixel points;
the device comprises a video acquisition device, a video display device, an advance configuration device, a reference coordinate value acquisition device, a display device and a display device, wherein the video acquisition device is used for acquiring a conference acquisition picture corresponding to the next conference moment of the current conference moment from the video acquisition device as an actual acquisition picture when the next conference moment of the current conference moment arrives, determining each edge mark corresponding to each pixel point in the actual acquisition picture, judging the error rate of a background edge pixel point which is configured in advance for the conference acquisition picture corresponding to the next conference moment of the current conference moment based on each edge mark, and displaying the conference acquisition picture corresponding to the next conference moment of the current conference moment based on the advance configuration result of the pixel point at each reference coordinate value when the error rate is lower than or equal to a preset ratio limit;
The above-described replacement display operation is described as follows: the pixel values of partial pixel points of the conference acquisition picture at the future moment are acquired through the intelligent identification mode, and the advanced pixel values are configured, so that the number of the pixel points of which the pixel values need to be configured for the conference acquisition picture at the future moment is reduced, and meanwhile, the advanced pixel value configuration operation aims at the background edge pixel points, so that the phenomenon that the background wall edge flickers when the conference acquisition picture is displayed at the future moment can be avoided through the advanced configuration operation;
wherein, when the error rate is lower than or equal to the preset ratio limit, displaying the conference acquisition picture corresponding to the next conference moment of the current conference moment based on the advance configuration result of the pixel points at each reference coordinate value comprises: when that conference acquisition picture is displayed, the pixel values of the pixel points at the positions other than the positions corresponding to the reference coordinate values are taken from the pixel values of the pixel points at those positions in the actual acquisition picture;
The replacement display device is further used for directly displaying the actually acquired conference acquisition picture as the conference acquisition picture corresponding to the next conference moment of the current conference moment when the error rate is higher than the preset ratio limit;
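As an assumed illustration of the replacement display decision just described, the Python sketch below computes the error rate of the advance configuration against the actually acquired picture and keeps the pre-configured pixels only when the error rate does not exceed the preset ratio limit; the 5% limit, the dictionary of predictions, and the NumPy edge map are assumptions for the example.

```python
def error_rate_of_advance_configuration(predictions, actual_edge_map):
    """Fraction of pre-configured positions whose edge identifier disagrees
    with the actually captured picture. predictions maps (row, col) to the
    edge identifier configured in advance (0b01 or 0b00)."""
    if not predictions:
        return 0.0
    wrong = sum(1 for (row, col), flag in predictions.items()
                if actual_edge_map[row, col] != flag)
    return wrong / len(predictions)

def choose_display_frame(preconfigured_frame, actual_frame, predictions,
                         actual_edge_map, ratio_limit=0.05):
    """Keep the advance-configured pixels when the error rate is within the
    preset ratio limit; otherwise display the actually captured picture."""
    if error_rate_of_advance_configuration(predictions, actual_edge_map) <= ratio_limit:
        return preconfigured_frame
    return actual_frame
```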
therefore, when the error rate is higher than the preset ratio limit, it indicates that the accuracy of the pixel values configured in advance is insufficient and that the advance configuration differs greatly from the actual data; in that case the actually acquired conference acquisition picture is displayed directly and the pixel values configured in advance are not used, so that the data accuracy of the displayed picture is ensured;
wherein the AI identification model is a deep neural network that has undergone multiple rounds of training, and the number of training rounds is positively correlated with the horizontal resolution of the picture acquired by the video acquisition device and with its vertical resolution;
illustratively, this positive correlation means that the higher the horizontal resolution of the picture acquired by the video acquisition device, the more training rounds are selected, and the higher the vertical resolution of the picture acquired by the video acquisition device, the more training rounds are selected;
the horizontal resolution of the picture acquired by the video acquisition device is the total number of pixel columns of that picture, and the vertical resolution is the total number of pixel rows of that picture;
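The disclosure only requires that the number of training rounds be positively correlated with the two resolutions; the linear heuristic below is an assumed illustration of one way such a correlation could be realized, with the base count and the scale factor chosen arbitrarily for the example.

```python
def training_iterations(h_resolution, v_resolution, base_iterations=1000, scale=0.5):
    """Heuristic: more training rounds for higher horizontal (pixel columns)
    and vertical (pixel rows) resolutions. The linear form and the constants
    are illustrative assumptions; only the positive correlation is required."""
    return int(base_iterations + scale * (h_resolution + v_resolution))

# Example: a 1920x1080 capture uses more rounds than a 1280x720 one.
print(training_iterations(1920, 1080))  # 2500
print(training_iterations(1280, 720))   # 2000
```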
the method for outputting the judging reference data corresponding to the previous acquisition picture of each frame by taking the edge marks corresponding to the pixel points covered in the pixel point window taking the reference coordinate value as the center in the previous acquisition picture of each frame as the judging reference data corresponding to the previous acquisition picture of the frame comprises the following steps: when the edge corresponding to a certain pixel covered in a pixel window taking the reference coordinate value as the center is marked as 0B01, the certain pixel is a background edge pixel in the previous acquisition picture where the certain pixel is located, and when the edge corresponding to a certain pixel covered in a pixel window taking the reference coordinate value as the center is marked as 0B00, the certain pixel is a non-background edge pixel in the previous acquisition picture where the certain pixel is located;
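A tiny sketch of the 0B01 / 0B00 edge identifier convention described above, for illustration only; the constant and function names are assumptions.

```python
BACKGROUND_EDGE = 0b01      # pixel is a background edge pixel in its frame
NON_BACKGROUND_EDGE = 0b00  # pixel is not a background edge pixel

def edge_identifier(is_edge: bool) -> int:
    """Encode whether a pixel is a background edge pixel as its edge identifier."""
    return BACKGROUND_EDGE if is_edge else NON_BACKGROUND_EDGE

def is_background_edge(identifier: int) -> bool:
    """Decode an edge identifier back to a boolean."""
    return identifier == BACKGROUND_EDGE
```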
and acquiring each frame of conference acquisition picture corresponding to each conference moment in the video conference, with the conference moments equally spaced and the time length between two adjacent conference moments being a multiple of the acquisition interval of the video acquisition device, includes: the acquisition interval of the video acquisition device is the interval between two adjacent frames acquired by the video acquisition device.
Embodiment 2
Fig. 3 is a block diagram of structural components of the video conference background processing system shown in embodiment 2 according to the present invention.
As shown in fig. 3, unlike fig. 2, the video conference background processing system further includes:
the grade judging device is connected with the edge detecting device and is used for acquiring each forming pixel point of an imaging area forming a conference background wall body in a conference acquisition picture corresponding to the current conference moment, and determining the inclination grade of the conference background wall body based on the standard deviation of each imaging depth value corresponding to each forming pixel point;
for example, a numerical conversion function may be selected to represent a numerical correspondence between a standard deviation of each imaging depth value corresponding to each constituent pixel point and a determined inclination level of the conference background wall, specifically, the standard deviation of each imaging depth value corresponding to each constituent pixel point is an input parameter of the numerical conversion function, and the determined inclination level of the conference background wall is an output parameter of the numerical conversion function;
wherein determining the inclination grade of the conference background wall based on the standard deviation of the imaging depth values corresponding to the constituent pixel points comprises: the larger the standard deviation of the imaging depth values corresponding to the constituent pixel points, the higher the determined inclination grade of the conference background wall.
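As an assumed illustration of such a numerical conversion function, the Python sketch below maps the standard deviation of the imaging depth values to a tilt grade; only the monotonic relationship comes from the disclosure, while the thresholds, the number of grades, and the example depth values are assumptions.

```python
import statistics

def tilt_grade(depth_values, thresholds=(5.0, 15.0, 30.0)):
    """Map the standard deviation of the imaging depth values of the wall's
    constituent pixels to a tilt grade 0-3. A larger standard deviation yields
    a higher grade; the threshold values are illustrative assumptions."""
    sigma = statistics.pstdev(depth_values)
    return sum(1 for t in thresholds if sigma > t)

# Example: a nearly flat wall (small depth spread) gets grade 0,
# a strongly tilted wall (large depth spread) gets a higher grade.
print(tilt_grade([100.0, 100.2, 99.9, 100.1]))  # 0
print(tilt_grade([80.0, 120.0, 60.0, 140.0]))   # 3
```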
Embodiment 3
Fig. 4 is a block diagram of structural components of a video conference background processing system shown in embodiment 3 according to the present invention.
As shown in fig. 4, unlike fig. 2, the video conference background processing system further includes:
the successive training device is connected with the intelligent identification device and is used for performing multiple training on the deep neural network to obtain the deep neural network after the multiple training is completed and sending the deep neural network to the intelligent identification device as the AI identification model for use;
for example, the multiple rounds of training and the testing of the deep neural network may be carried out in a numerical simulation mode, so as to obtain the deep neural network that has completed the multiple rounds of training.
Embodiment 4
Fig. 5 is a block diagram of structural components of a video conference background processing system shown in embodiment 4 according to the present invention.
As shown in fig. 5, unlike fig. 4, the video conference background processing system further includes:
a model storage device, connected with the successive training device, for storing the AI identification model by storing the model parameters of the AI identification model;
the model memory device may be selected from FLASH memory, MMC memory device, TF memory device, or SD memory device, for example.
Embodiment 5
Fig. 6 is a block diagram of structural components of the video conference background processing system shown in embodiment 5 according to the present invention.
As shown in fig. 6, unlike fig. 2, the video conference background processing system further includes:
the timing service device is respectively connected with the edge detection device, the video acquisition device and the information capture device and is used for respectively providing respective required timing service for the edge detection device, the video acquisition device and the information capture device;
for example, in the timing service device, a quartz oscillator for generating a pulse train and a counter may be built in;
and the counter is connected with the quartz oscillator and is used for counting the pulse periods of the pulse train so as to provide the respective required timing services for the edge detection device, the video acquisition device and the information capturing device.
Next, further description will be continued for the video conference background processing system in various embodiments of the present invention.
In a video conferencing background processing system according to any embodiment of the present invention:
The identification of each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall comprises the following steps: the color imaging characteristic of the conference background wall body is a red-green component value interval, a black-white component value interval and a yellow-blue component value interval of the conference background wall body in an LAB color space;
illustratively, each component value interval of the conference background wall body in the red-green component value interval, the black-white component value interval and the yellow-blue component value interval in the LAB color space is defined by an upper limit component threshold value and a lower limit component threshold value, the values of the upper limit component threshold value and the lower limit component threshold value are all between 0 and 255, and the upper limit component threshold value is larger than the lower limit component threshold value;
wherein identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall further includes: when the red-green component value, the black-white component value and the yellow-blue component value of a certain pixel point in the conference acquisition picture corresponding to the current conference moment fall into the red-green component value interval, the black-white component value interval and the yellow-blue component value interval respectively, judging that the pixel point is a constituent pixel point of the imaging area forming the conference background wall;
wherein identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall further includes: when the number of other constituent pixel points around a certain constituent pixel point is smaller than or equal to a set number threshold, judging that constituent pixel point to be a background edge pixel point, and otherwise judging it to be a non-background edge pixel point;
wherein identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall further includes: when the red-green component value of a certain pixel point in the conference acquisition picture corresponding to the current conference moment falls outside the red-green component value interval, judging that the pixel point is a constituent pixel point of an image area outside the imaging area of the conference background wall in that conference acquisition picture;
wherein identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall further includes: when the black-white component value of a certain pixel point in the conference acquisition picture corresponding to the current conference moment falls outside the black-white component value interval, judging that the pixel point is a constituent pixel point of an image area outside the imaging area of the conference background wall in that conference acquisition picture;
and identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall further includes: when the yellow-blue component value of a certain pixel point in the conference acquisition picture corresponding to the current conference moment falls outside the yellow-blue component value interval, judging that the pixel point is a constituent pixel point of an image area outside the imaging area of the conference background wall in that conference acquisition picture.
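Purely as an assumed illustration of the LAB-interval membership test and the neighbour-count edge rule described above, the Python sketch below uses OpenCV-style 0-255 LAB values; the interval bounds and the neighbour threshold are placeholder assumptions, not values from the disclosure.

```python
import numpy as np

# Assumed L/A/B intervals (0-255, OpenCV-style LAB) for a solid blue wall.
L_RANGE, A_RANGE, B_RANGE = (60, 200), (110, 150), (0, 110)

def wall_mask(lab_image):
    """True where a pixel's black-white (L), red-green (A) and yellow-blue (B)
    component values all fall inside the wall's component value intervals."""
    L, A, B = lab_image[..., 0], lab_image[..., 1], lab_image[..., 2]
    return ((L_RANGE[0] <= L) & (L <= L_RANGE[1]) &
            (A_RANGE[0] <= A) & (A <= A_RANGE[1]) &
            (B_RANGE[0] <= B) & (B <= B_RANGE[1]))

def background_edge_mask(mask, neighbor_threshold=7):
    """A wall (constituent) pixel counts as a background edge pixel when it has
    at most `neighbor_threshold` wall pixels among its 8 neighbours (assumed value)."""
    padded = np.pad(mask.astype(int), 1)
    neighbors = sum(padded[1 + dr:padded.shape[0] - 1 + dr,
                           1 + dc:padded.shape[1] - 1 + dc]
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                    if (dr, dc) != (0, 0))
    return mask & (neighbors <= neighbor_threshold)
```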
Embodiment 6
Embodiment 6 of the present invention shows a video conference background processing method including the steps of:
s701: acquiring, by a video acquisition device, each frame of conference acquisition picture corresponding to each conference moment in the video conference, wherein the conference moments are equally spaced, the time length between two adjacent conference moments is a multiple of the acquisition interval of the video acquisition device, and a conference host is positioned in front of a conference background wall during the video conference;
the video capturing device may be an ultra-clear camera mechanism, configured to be disposed in front of a conference scene with a background wall, perform ultra-clear frame capturing with a set frame rate on the conference scene, and may select, at uniform intervals, each frame of conference capturing frame corresponding to each conference moment from each frame of captured ultra-clear frame;
Specifically, the ultra-clear camera shooting mechanism comprises an image sensor, an optical filter, an ultra-clear imaging lens and a flexible circuit board, wherein the optical filter is arranged between the image sensor and the ultra-clear imaging lens, and the image sensor is arranged on the flexible circuit board;
further, the image sensor may be selected from a CCD sensor or a CMOS sensor;
s702: acquiring a conference acquisition picture corresponding to the current conference moment, and identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall;
specifically, by acquiring each background edge pixel point in a conference acquisition picture corresponding to the current conference moment, intelligent judgment of whether the pixel point at the set position in the conference acquisition picture corresponding to the subsequent future conference moment belongs to the background edge pixel point or not is provided with positioning data, wherein the positioning data is the positioning coordinate of a certain background edge pixel point;
s703: the following processing is executed for each background edge pixel point in the conference acquisition picture corresponding to the current conference time: acquiring a preset number of conference acquisition pictures before a current conference time as previous acquisition pictures of each frame, acquiring coordinate values of background edge pixel points in the conference acquisition pictures corresponding to the current conference time as reference coordinate values, and taking edge identifiers corresponding to pixel points covered in a pixel point window taking the reference coordinate values as the center in each previous acquisition picture of each frame as judging reference data corresponding to the previous acquisition picture of the frame so as to output judging reference data corresponding to the previous acquisition pictures of each frame;
For example, the operation screen selects each part of judgment reference data corresponding to each frame of past acquisition picture to be used as basic data of subsequent intelligent authentication;
s704: based on each judgment reference data corresponding to each frame of past acquisition picture, the number of pixel points covered by a pixel point window taking a reference coordinate value as a center, and the horizontal resolution and the vertical resolution of the video acquisition device acquisition picture, intelligent identification of the edge identification of the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference time at the current conference time by adopting an AI identification model;
for example, a MATLAB toolbox may be selected to implement a simulation operation of performing data processing on edge identifiers of pixel points at reference coordinate values in a conference acquisition picture corresponding to a next conference moment at a current conference moment by using an AI identification model based on each piece of judgment reference data corresponding to each frame of past acquisition picture, the number of pixel points covered by a pixel point window centered on the reference coordinate values, and horizontal resolution and vertical resolution of the video acquisition device acquisition picture;
s705: when the received edge mark is represented as a background edge pixel point, the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference time of the current conference time is configured as a background edge pixel point in advance at the time point between the current conference time and the next conference time of the current conference time, and when the received edge mark is represented as a non-background edge pixel point, the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference time of the current conference time is configured as a non-background edge pixel point in advance at the time point between the current conference time and the next conference time of the current conference time, and each background edge pixel point in the conference acquisition picture corresponding to the current conference time is traversed to complete the advanced configuration of the pixel point at each reference coordinate value corresponding to each background edge pixel point in the conference acquisition picture corresponding to the next conference time of the current conference time;
Specifically, the above traversal operation completes the extraction and judgment of whether each set position in the conference acquisition picture corresponding to the future conference moment belongs to the background edge pixel point or not through respective intelligent judgment of each positioning data corresponding to each background edge pixel point, and provides a solution for the advanced configuration of the pixel value of the subsequent pixel point;
s706: when the next conference moment of the current conference moment arrives, acquiring from the video acquisition device the conference acquisition picture corresponding to that moment as the actual acquisition picture, determining the edge identifier corresponding to each pixel point in the actual acquisition picture, judging, based on those edge identifiers, the error rate of the background edge pixel points configured in advance for the conference acquisition picture corresponding to the next conference moment of the current conference moment, and, when the error rate is lower than or equal to a preset ratio limit, displaying that conference acquisition picture based on the advance configuration result of the pixel points at each reference coordinate value;
the above-described replacement display operation is described as follows: the pixel values of partial pixel points of the conference acquisition picture at the future moment are acquired through the intelligent identification mode, and the advanced pixel values are configured, so that the number of the pixel points of which the pixel values need to be configured for the conference acquisition picture at the future moment is reduced, and meanwhile, the advanced pixel value configuration operation aims at the background edge pixel points, so that the phenomenon that the background wall edge flickers when the conference acquisition picture is displayed at the future moment can be avoided through the advanced configuration operation;
When the error rate is lower than or equal to the preset ratio limit, displaying the conference acquisition picture corresponding to the next conference moment of the current conference moment based on the advanced configuration result of the pixel points at each reference coordinate value comprises the following steps: when the conference acquisition picture corresponding to the next conference moment of the current conference moment is displayed, the pixel values of the pixel points at the positions other than the positions corresponding to the reference coordinate values are taken from the pixel points at the same positions in the actual acquisition picture;
the replacement display device is further used for directly displaying the actual acquisition picture as the conference acquisition picture corresponding to the next conference moment of the current conference moment when the error rate is higher than the preset ratio limit;
therefore, when the error rate is higher than the preset ratio limit, it indicates that the accuracy of the pixel values configured in advance is insufficient and that the advance configuration deviates too far from the actual data; in this case, the actually acquired conference acquisition picture is displayed directly and the pixel values configured in advance are not used, so that the data accuracy of the displayed picture is ensured;
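A minimal sketch of this error-rate check and fallback, assuming numpy frames and edge maps; the 5% limit, the BACKGROUND_EDGE constant applied here and the idea of writing a single edge pixel value are assumptions for illustration only:

```python
BACKGROUND_EDGE = 0b01      # assumed edge identifier for a background edge pixel point

def compose_display_frame(actual_frame, preconfigured, edge_pixel_value):
    """Pixels outside the pre-configured positions come from the actual acquisition picture."""
    out = actual_frame.copy()
    for (r, c), flag in preconfigured.items():
        if flag == BACKGROUND_EDGE:                   # keep the advance edge configuration
            out[r, c] = edge_pixel_value
    return out

def choose_display(preconfigured, actual_edge_map, actual_frame,
                   edge_pixel_value, ratio_limit=0.05):
    """Decide which picture to show for the next conference moment."""
    wrong = sum(1 for (r, c), flag in preconfigured.items()
                if flag != actual_edge_map[r, c])
    error_rate = wrong / max(len(preconfigured), 1)
    if error_rate <= ratio_limit:                     # advance configuration is trusted
        return compose_display_frame(actual_frame, preconfigured, edge_pixel_value)
    return actual_frame                               # fall back to the actual picture
```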
The AI identification model is a deep neural network that has been trained a plurality of times, and the number of training times is positively correlated with the horizontal resolution of the picture acquired by the video acquisition device and positively correlated with the vertical resolution of the picture acquired by the video acquisition device;
illustratively, the AI identification model being a deep neural network trained a plurality of times, with the number of training times positively correlated with the horizontal resolution and the vertical resolution of the picture acquired by the video acquisition device, comprises: the higher the horizontal resolution of the picture acquired by the video acquisition device, the more training times are selected, and the higher the vertical resolution of the picture acquired by the video acquisition device, the more training times are selected;
the horizontal resolution of the picture acquired by the video acquisition device is the total number of pixel columns of that picture, and the vertical resolution of the picture acquired by the video acquisition device is the total number of pixel rows of that picture;
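A toy illustration of this positive correlation; the base value and the scaling constant are assumptions chosen only to make the monotone relationship concrete:

```python
def training_epochs(h_res, v_res, base_epochs=10, pixels_per_extra_epoch=100_000):
    """More pixel columns or more pixel rows -> more training passes (monotone in both)."""
    return base_epochs + (h_res * v_res) // pixels_per_extra_epoch

# e.g. a 1920 x 1080 acquisition picture: 10 + 2_073_600 // 100_000 = 30 passes
```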
the method for outputting the judgment reference data corresponding to each frame of past acquisition picture, by taking the edge identifiers corresponding to the pixel points covered in the pixel point window centered on the reference coordinate value in each frame of past acquisition picture as the judgment reference data corresponding to that frame, comprises the following: when the edge identifier corresponding to a certain pixel point covered in the pixel point window centered on the reference coordinate value is 0B01, the certain pixel point is a background edge pixel point in the past acquisition picture in which it is located, and when the edge identifier corresponding to a certain pixel point covered in the pixel point window centered on the reference coordinate value is 0B00, the certain pixel point is a non-background edge pixel point in the past acquisition picture in which it is located;
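Restating that encoding as a small sketch (numpy assumed; the function name is illustrative):

```python
import numpy as np

BACKGROUND_EDGE = 0b01        # pixel is a background edge pixel point
NON_BACKGROUND_EDGE = 0b00    # pixel is a non-background edge pixel point

def encode_edge_map(background_edge_mask):
    """Turn a boolean background-edge mask into the two-bit edge identifiers above."""
    return np.where(background_edge_mask, BACKGROUND_EDGE, NON_BACKGROUND_EDGE)
```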
And acquiring each frame of conference acquisition picture corresponding to each conference moment in the video conference, wherein the conference moments are equally spaced and the time length between two adjacent conference moments is a multiple of the acquisition interval time length of the video acquisition device, comprises: the acquisition interval time length of the video acquisition device is the interval time length between two adjacent frames of pictures acquired by the video acquisition device.
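A small sketch of this timing relation; the 40 ms capture interval and the factor of 5 are illustrative values, not values fixed by the patent:

```python
def conference_moments(start_ms, capture_interval_ms, multiple, count):
    """Equally spaced conference moments whose spacing is a multiple of the capture interval."""
    spacing_ms = capture_interval_ms * multiple
    return [start_ms + i * spacing_ms for i in range(count)]

# e.g. capture every 40 ms (25 fps), one conference moment every 5th frame (200 ms apart)
moments = conference_moments(start_ms=0, capture_interval_ms=40, multiple=5, count=4)
# -> [0, 200, 400, 600]
```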
In addition, in the video conference background processing system and method according to the present invention, the substantial progress of the present invention may also be highlighted by:
acquiring each frame of conference acquisition picture corresponding to each conference moment in the video conference, wherein the conference moments are equally spaced, the time length between two adjacent conference moments is a multiple of the acquisition interval time length of the video acquisition device, and the conference host in the video conference is located in front of the conference background wall, comprises: the number of frames of conference acquisition pictures is positively correlated with the total number of pixel points of each frame of conference acquisition picture;
the intelligent identification, by adopting an AI identification model, of the edge identifier of the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference moment of the current conference moment, based on each piece of judgment reference data corresponding to each frame of past acquisition picture, the number of pixel points covered by the pixel point window taking the reference coordinate value as the center, and the horizontal resolution and the vertical resolution of the picture acquired by the video acquisition device, comprises the following step: taking each piece of judgment reference data corresponding to each frame of past acquisition picture, the number of pixel points covered by the pixel point window taking the reference coordinate value as the center, and the horizontal resolution and the vertical resolution of the picture acquired by the video acquisition device as the item-by-item input content of the AI identification model;
The intelligent identification, by adopting an AI identification model, of the edge identifier of the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference moment of the current conference moment, based on each piece of judgment reference data corresponding to each frame of past acquisition picture, the number of pixel points covered by the pixel point window taking the reference coordinate value as the center, and the horizontal resolution and the vertical resolution of the picture acquired by the video acquisition device, further comprises the following step: executing the AI identification model to obtain the edge identifier of the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference moment of the current conference moment, as output by the AI identification model.
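A minimal sketch of this item-by-item input assembly and model execution; the helper names are assumptions, and the model is only assumed to expose a scikit-learn style predict() method:

```python
import numpy as np

def build_model_input(judgment_reference_data, window_pixel_count, h_res, v_res):
    """Append every input item one after another into a single flat vector."""
    items = []
    for frame_flags in judgment_reference_data:   # one item per frame of past acquisition picture
        items.extend(frame_flags)                 # edge identifiers inside the window
    items.append(window_pixel_count)              # number of pixel points covered by the window
    items.extend([h_res, v_res])                  # pixel columns and pixel rows of the picture
    return np.asarray(items, dtype=np.float32)

def run_identification(model, model_input):
    """Execute the model and return the edge identifier it outputs for the future pixel."""
    return int(model.predict(model_input.reshape(1, -1))[0])
```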
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
From the above description of the embodiments, it will be clear to a person skilled in the art that the embodiments of the method described above may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware, although in many cases the former is the preferred implementation. Based on such an understanding, the technical solution of the present invention, essentially or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive; many forms may be made by those of ordinary skill in the art without departing from the spirit of the present invention and the scope of the claims, and all of them fall within the protection of the present invention.

Claims (10)

1. A video conferencing background processing system, the system comprising:
the video acquisition device is used for acquiring each frame of conference acquisition picture corresponding to each conference moment in the video conference, wherein the conference moments are equally spaced and the time length between two adjacent conference moments is a multiple of the acquisition interval time length of the video acquisition device, and a conference host is positioned in front of a conference background wall in the video conference;
the edge detection device is connected with the video acquisition device and is used for acquiring a conference acquisition picture corresponding to the current conference moment and identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall;
the information capturing device is connected with the edge detection device and is used for executing the following processing for each background edge pixel point in the conference acquisition picture corresponding to the current conference moment: acquiring a preset number of conference acquisition pictures before a current conference time as previous acquisition pictures of each frame, acquiring coordinate values of background edge pixel points in the conference acquisition pictures corresponding to the current conference time as reference coordinate values, and taking edge identifiers corresponding to pixel points covered in a pixel point window taking the reference coordinate values as the center in each previous acquisition picture of each frame as judging reference data corresponding to the previous acquisition picture of the frame so as to output judging reference data corresponding to the previous acquisition pictures of each frame;
The intelligent identification device is connected with the information capturing device and is used for intelligently identifying the edge identifier of the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference moment of the current conference moment by adopting an AI identification model based on each piece of judgment reference data corresponding to each frame of past acquisition picture, the number of the pixel points covered by the pixel point window taking the reference coordinate value as the center, and the horizontal resolution and the vertical resolution of the picture acquired by the video acquisition device;
the advanced configuration device is connected with the intelligent identification device and is used for configuring the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference moment of the current conference moment into a background edge pixel point in advance at the time point between the current conference moment and the next conference moment of the current conference moment when the received edge identifier is represented as a background edge pixel point, configuring the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference moment of the current conference moment into a non-background edge pixel point in advance at the time point between the current conference moment and the next conference moment of the current conference moment when the received edge identifier is represented as a non-background edge pixel point, and traversing each background edge pixel point in the conference acquisition picture corresponding to the current conference moment to finish the advanced configuration of the pixel point at each reference coordinate value respectively corresponding to each background edge pixel point in the conference acquisition picture corresponding to the next conference moment of the current conference moment;
And the replacement display device is respectively connected with the video acquisition device and the advanced configuration device and is used for, when the next conference moment of the current conference moment arrives, acquiring the conference acquisition picture corresponding to the next conference moment of the current conference moment from the video acquisition device as the actual acquisition picture, determining each edge identifier corresponding to each pixel point in the actual acquisition picture, judging, based on each edge identifier, the error rate of the background edge pixel points configured in advance for the conference acquisition picture corresponding to the next conference moment of the current conference moment, and, when the error rate is lower than or equal to a preset ratio limit, displaying the conference acquisition picture corresponding to the next conference moment of the current conference moment based on the advanced configuration result of the pixel points at each reference coordinate value.
2. The video conferencing background processing system of claim 1 wherein:
when the error rate is lower than or equal to a preset ratio limit, displaying the conference acquisition picture corresponding to the next conference time of the current conference time based on the advanced configuration result of the pixel points at each reference coordinate value comprises the following steps: when a conference acquisition picture corresponding to the next conference time of the current conference time is displayed, the pixel values of the pixel points of a plurality of positions except for each position corresponding to each reference coordinate value in the next conference time of the current conference time are obtained by adopting the pixel values of the pixel points of the plurality of positions in the actual acquisition picture;
The replacement display device is further used for directly displaying the actual acquisition picture as the conference acquisition picture corresponding to the next conference moment of the current conference moment when the error rate is higher than the preset ratio limit;
the AI identification model is a deep neural network after being trained a plurality of times, and the number of training times is positively correlated with the horizontal resolution of the picture acquired by the video acquisition device and positively correlated with the vertical resolution of the picture acquired by the video acquisition device;
the horizontal resolution of the video acquisition device acquisition picture is the total number of pixel columns of the video acquisition device acquisition picture, and the vertical resolution of the video acquisition device acquisition picture is the total number of pixel rows of the video acquisition device acquisition picture.
3. The video conferencing background processing system of claim 2 wherein:
the method for outputting the judgment reference data corresponding to each frame of past acquisition picture, by taking the edge identifiers corresponding to the pixel points covered in the pixel point window taking the reference coordinate value as the center in each frame of past acquisition picture as the judgment reference data corresponding to that frame, comprises the following steps: when the edge identifier corresponding to a certain pixel point covered in the pixel point window taking the reference coordinate value as the center is 0B01, the certain pixel point is a background edge pixel point in the past acquisition picture in which it is located, and when the edge identifier corresponding to a certain pixel point covered in the pixel point window taking the reference coordinate value as the center is 0B00, the certain pixel point is a non-background edge pixel point in the past acquisition picture in which it is located;
The method for acquiring the conference acquisition pictures of each frame corresponding to each conference time in the video conference, wherein each conference time is equal in interval and the time length between two adjacent conference times is a multiple of the acquisition interval time length of the video acquisition device comprises the following steps: the acquisition interval time of the video acquisition device is the interval time of two adjacent frames of pictures acquired by the video acquisition device.
4. The video conferencing background processing system of claim 3 wherein the system further comprises:
the grade judging device is connected with the edge detecting device and is used for acquiring each forming pixel point of an imaging area forming a conference background wall body in a conference acquisition picture corresponding to the current conference moment, and determining the inclination grade of the conference background wall body based on the standard deviation of each imaging depth value corresponding to each forming pixel point;
wherein, determining the inclination grade of the conference background wall based on the standard deviation of the imaging depth values corresponding to the constituent pixel points comprises: the larger the standard deviation of the imaging depth values corresponding to the constituent pixel points, the higher the determined inclination grade of the conference background wall.
5. The video conferencing background processing system of claim 3 wherein the system further comprises:
And the successive training device is connected with the intelligent identification device and is used for performing multiple training on the deep neural network to obtain the deep neural network after the multiple training is completed and sending the deep neural network to the intelligent identification device as the AI identification model for use.
6. The video conferencing background processing system of claim 5 wherein the system further comprises:
and the model storage device is connected with the successive training device and is used for storing the AI identification model by storing various model parameters of the AI identification model.
7. The video conferencing background processing system of claim 3 wherein the system further comprises:
and the timing service device is respectively connected with the edge detection device, the video acquisition device and the information capture device and is used for respectively providing respective required timing service for the edge detection device, the video acquisition device and the information capture device.
8. A video conferencing background processing system as claimed in any of claims 3-7, wherein:
the identification of each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall comprises the following steps: the color imaging characteristic of the conference background wall body is a red-green component value interval, a black-white component value interval and a yellow-blue component value interval of the conference background wall body in an LAB color space;
Wherein, identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall further comprises: when the red-green component value, the black-white component value and the yellow-blue component value of a certain pixel point in the conference acquisition picture corresponding to the current conference moment fall into the red-green component value interval, the black-white component value interval and the yellow-blue component value interval respectively, judging that the certain pixel point is a constituent pixel point of the imaging area constituting the conference background wall;
wherein, identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall further comprises: when the number of other constituent pixel points existing around a certain constituent pixel point is smaller than or equal to a set number threshold, judging that the certain constituent pixel point is a background edge pixel point, and otherwise judging that the certain constituent pixel point is a non-background edge pixel point.
9. The video conferencing background processing system of claim 8 wherein:
identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall further comprises: when the red-green component value of a certain pixel point in the conference acquisition picture corresponding to the current conference moment falls outside the red-green component value interval, judging that the certain pixel point is a constituent pixel point of an image area outside the imaging area of the conference background wall in the conference acquisition picture corresponding to the current conference moment;
Wherein, identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall further comprises: when the black-white component value of a certain pixel point in the conference acquisition picture corresponding to the current conference moment falls outside the black-white component value interval, judging that the certain pixel point is a constituent pixel point of an image area outside the imaging area of the conference background wall in the conference acquisition picture corresponding to the current conference moment;
wherein, identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall further comprises: when the yellow-blue component value of a certain pixel point in the conference acquisition picture corresponding to the current conference moment falls outside the yellow-blue component value interval, judging that the certain pixel point is a constituent pixel point of an image area outside the imaging area of the conference background wall in the conference acquisition picture corresponding to the current conference moment.
10. A method for video conference background processing, the method comprising:
acquiring each frame of conference acquisition picture corresponding to each conference moment in the video conference by using a video acquisition device, wherein the conference moments are equally spaced and the time length between two adjacent conference moments is a multiple of the acquisition interval time length of the video acquisition device, and a conference host is positioned in front of a conference background wall in the video conference;
Acquiring a conference acquisition picture corresponding to the current conference moment, and identifying each background edge pixel point in the conference acquisition picture corresponding to the current conference moment based on the color imaging characteristics of the conference background wall;
the following processing is executed for each background edge pixel point in the conference acquisition picture corresponding to the current conference time: acquiring a preset number of conference acquisition pictures before a current conference time as previous acquisition pictures of each frame, acquiring coordinate values of background edge pixel points in the conference acquisition pictures corresponding to the current conference time as reference coordinate values, and taking edge identifiers corresponding to pixel points covered in a pixel point window taking the reference coordinate values as the center in each previous acquisition picture of each frame as judging reference data corresponding to the previous acquisition picture of the frame so as to output judging reference data corresponding to the previous acquisition pictures of each frame;
based on each judgment reference data corresponding to each frame of past acquisition picture, the number of pixel points covered by a pixel point window taking a reference coordinate value as a center, and the horizontal resolution and the vertical resolution of the video acquisition device acquisition picture, intelligent identification of the edge identification of the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference time at the current conference time by adopting an AI identification model;
When the received edge mark is represented as a background edge pixel point, the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference time of the current conference time is configured as a background edge pixel point in advance at the time point between the current conference time and the next conference time of the current conference time, and when the received edge mark is represented as a non-background edge pixel point, the pixel point at the reference coordinate value in the conference acquisition picture corresponding to the next conference time of the current conference time is configured as a non-background edge pixel point in advance at the time point between the current conference time and the next conference time of the current conference time, and each background edge pixel point in the conference acquisition picture corresponding to the current conference time is traversed to complete the advanced configuration of the pixel point at each reference coordinate value corresponding to each background edge pixel point in the conference acquisition picture corresponding to the next conference time of the current conference time;
when the next conference moment of the current conference moment arrives, acquiring the conference acquisition picture corresponding to the next conference moment of the current conference moment from the video acquisition device as the actual acquisition picture, determining each edge identifier corresponding to each pixel point in the actual acquisition picture, judging, based on each edge identifier, the error rate of the background edge pixel points configured in advance for the conference acquisition picture corresponding to the next conference moment of the current conference moment, and, when the error rate is lower than or equal to a preset ratio limit, displaying the conference acquisition picture corresponding to the next conference moment of the current conference moment based on the advanced configuration result of the pixel points at each reference coordinate value.
CN202311076156.3A 2023-08-25 2023-08-25 Video conference background processing system and method Active CN116781857B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311076156.3A CN116781857B (en) 2023-08-25 2023-08-25 Video conference background processing system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311076156.3A CN116781857B (en) 2023-08-25 2023-08-25 Video conference background processing system and method

Publications (2)

Publication Number Publication Date
CN116781857A CN116781857A (en) 2023-09-19
CN116781857B true CN116781857B (en) 2023-10-20

Family

ID=87986389

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311076156.3A Active CN116781857B (en) 2023-08-25 2023-08-25 Video conference background processing system and method

Country Status (1)

Country Link
CN (1) CN116781857B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118247330A (en) * 2024-04-07 2024-06-25 淮安新运日建材有限公司 Seamless steel tube intelligent analysis system based on action coordination

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016066253A (en) * 2014-09-25 2016-04-28 キヤノンマーケティングジャパン株式会社 Information processing unit, information processing system, control method thereof, and program
CN112235601A (en) * 2020-10-13 2021-01-15 中国联合网络通信集团有限公司 Live broadcast behavior correction method, terminal, edge server and computer equipment
CN113965814A (en) * 2021-08-30 2022-01-21 国网山东省电力公司信息通信公司 Multi-meeting-place key frame extraction method and system based on video meeting scene
CN114073097A (en) * 2019-07-17 2022-02-18 皇家Kpn公司 Facilitating video streaming and processing by edge computation


Also Published As

Publication number Publication date
CN116781857A (en) 2023-09-19

Similar Documents

Publication Publication Date Title
CN108304793B (en) Online learning analysis system and method
CN116781857B (en) Video conference background processing system and method
CN108933935B (en) Detection method and device of video communication system, storage medium and computer equipment
CN108875619B (en) Video processing method and device, electronic equipment and computer readable storage medium
CN107274373B (en) Code printing method and device in live streaming
CN111985281B (en) Image generation model generation method and device and image generation method and device
Chen et al. Statistical and structural information backed full-reference quality measure of compressed sonar images
CN107146252A (en) A kind of big data image processing apparatus
CN109559362B (en) Image subject face replacing method and device
CN112464822B (en) Helmet wearing detection method and system based on feature enhancement
CN107820013A (en) A kind of photographic method and terminal
CN107547852A (en) A kind of big data storage system
CN106709438A (en) Method for collecting statistics of number of people based on video conference
WO2023056896A1 (en) Definition determination method and apparatus, and device
CN110991297A (en) Target positioning method and system based on scene monitoring
CN110866473B (en) Target object tracking detection method and device, storage medium and electronic device
CN109472230B (en) Automatic athlete shooting recommendation system and method based on pedestrian detection and Internet
CN110390316A (en) Data processing system and device based on recognition of face
CN116095363B (en) Mobile terminal short video highlight moment editing method based on key behavior recognition
WO2020168515A1 (en) Image processing method and apparatus, image capture processing system, and carrier
He et al. Real-time whiteboard capture and processing using a video camera for teleconferencing
CN114283356B (en) Acquisition and analysis system and method for moving image
CN114079777B (en) Video processing method and device
CN113052878A (en) Multi-path high-altitude parabolic detection method and system for edge equipment in security system
CN113343889A (en) Face recognition system based on silence live body detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant