CN117640877A - Picture reconstruction method for online conference and electronic equipment - Google Patents


Publication number: CN117640877A
Application number: CN202410100077.XA
Authority: CN (China)
Prior art keywords: angle, image, conference, candidate, score
Legal status: Granted; Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN117640877B
Inventors: 吕少卿, 沈亚军, 俞鸣园, 王克彦, 曹亚曦, 费敏健
Assignee (current and original): Zhejiang Huachuang Video Signal Technology Co Ltd (the listed assignee may be inaccurate; Google has not performed a legal analysis)
Events: application filed by Zhejiang Huachuang Video Signal Technology Co Ltd; priority to CN202410100077.XA; publication of CN117640877A; application granted; publication of CN117640877B


Abstract

The application discloses a picture reconstruction method for an online conference and an electronic device. The picture reconstruction method comprises the following steps: determining the participant objects that can be acquired at each candidate angle, to obtain a collectable object set corresponding to each candidate angle; calculating an angle score for each candidate angle using its corresponding collectable object set; selecting a candidate angle whose angle score satisfies a preset condition as a target angle, and acquiring an initial conference image captured by an image acquisition device at the target angle; segmenting the image content of the participant objects contained in the initial conference image to obtain a plurality of object images; and performing picture reconstruction on the plurality of object images to obtain a target conference image to be displayed. With this method, higher-quality pictures of the participant objects can be acquired at the selected target angle, which facilitates the subsequent segmentation and reconstruction of the participant objects and ensures that the reconstructed target conference image displays the image content of the participant objects more accurately and clearly.

Description

Picture reconstruction method for online conference and electronic equipment
Technical Field
The present disclosure relates to the field of video processing technologies, and in particular, to a picture reconstruction method for an online conference and an electronic device.
Background
A video conference system, also called a conference television system, is a system by which two or more individuals or groups in different conference locations carry out instant interactive communication, transmitting sound, images and file data to one another through transmission lines and video conference terminals. Through a video conference system, individuals or groups at different sites can see the participants at the other sites, converse with them and, most importantly, also see the other parties' expressions and actions, so that participants at different sites can communicate as conveniently as if they were in the same conference room.
Video conference systems in the prior art can effectively support small conferences, but are not well suited to large conferences and cannot meet their requirements.
Disclosure of Invention
To solve the above problems, the present application provides at least a picture reconstruction method for an online conference and an electronic device.
A first aspect of the present application provides a picture reconstruction method for an online conference, in which an image acquisition device is deployed in a conference scene, the image acquisition device has a plurality of candidate angles, and a plurality of participant objects exist in the conference scene. The method comprises: determining the participant objects that can be acquired at each candidate angle, to obtain a collectable object set corresponding to each candidate angle; calculating an angle score for each candidate angle using its corresponding collectable object set; selecting a candidate angle whose angle score satisfies a preset condition as a target angle, and acquiring an initial conference image captured by the image acquisition device at the target angle; segmenting the image content of the participant objects contained in the initial conference image to obtain a plurality of object images; and performing picture reconstruction on the plurality of object images to obtain a target conference image to be displayed.
In one embodiment, determining the participant objects that can be acquired at each candidate angle includes: acquiring the object position of each participant object in the conference scene, and acquiring the acquisition area of the candidate angle; and taking each participant object whose object position lies within the acquisition area of the candidate angle as a participant object that the candidate angle can acquire.
In one embodiment, calculating an angle score for each candidate angle using its corresponding collectable object set includes: counting the number of participant objects in the collectable object set corresponding to each candidate angle; and calculating the angle score of each candidate angle based on the number of objects corresponding to that candidate angle.
In one embodiment, each participant object corresponds to a priority score, and calculating the angle score of each candidate angle based on the number of objects corresponding to that candidate angle includes: obtaining an object quantity score for each candidate angle based on the number of objects corresponding to that candidate angle; and combining the priority scores and the object quantity scores to calculate the angle score corresponding to each candidate angle.
In one embodiment, selecting a candidate angle whose angle score satisfies a preset condition as the target angle includes: selecting the candidate angle with the highest angle score as the target angle.
In one embodiment, selecting a candidate angle whose angle score satisfies a preset condition as the target angle includes: counting, within a preset time period, the accumulated duration and/or continuous duration for which each candidate angle has the highest angle score; calculating a selection recommendation index for each candidate angle based on the accumulated duration and/or continuous duration; and selecting the candidate angle with the highest selection recommendation index as the target angle.
In one embodiment, segmenting the image content of the participant objects contained in the initial conference image to obtain a plurality of object images includes: acquiring the object position of each participant object in the initial conference image; clustering the participant objects in the initial conference image based on the object position of each participant object, to obtain object clustering results; and segmenting the image content of each object clustering result in the initial conference image to obtain the plurality of object images.
In one embodiment, each participant object corresponds to a priority score, and performing picture reconstruction on the plurality of object images to obtain a target conference image to be displayed includes: calculating a display score for each object image based on the priority scores of the participant objects it contains; obtaining a view layout template of the conference scene, the view layout template comprising a plurality of image playing areas, each image playing area corresponding to a highlighting level; determining the image playing area matched with each object image based on the display score of each object image and the highlighting level of each image playing area; and performing picture reconstruction on the plurality of object images according to the image playing areas matched with them, to obtain the target conference image to be displayed.
In one embodiment, obtaining a view layout template of the conference scene includes: performing scene classification on the conference scene to obtain the conference type of the conference scene; and querying the view layout template that matches the conference type.
A second aspect of the present application provides a picture reconstruction apparatus for an online conference, the apparatus including: a collectable object determining module, configured to determine the participant objects that can be acquired at each candidate angle, to obtain a collectable object set corresponding to each candidate angle; a score calculation module, configured to calculate an angle score for each candidate angle using its corresponding collectable object set; an angle selection module, configured to select a candidate angle whose angle score satisfies a preset condition as a target angle and acquire an initial conference image captured by the image acquisition device at the target angle; an image segmentation module, configured to segment the image content of the participant objects contained in the initial conference image to obtain a plurality of object images; and a picture reconstruction module, configured to perform picture reconstruction on the plurality of object images to obtain a target conference image to be displayed.
A third aspect of the present application provides an electronic device, including a memory and a processor, where the processor is configured to execute program instructions stored in the memory to implement the above picture reconstruction method for an online conference.
A fourth aspect of the present application provides a computer-readable storage medium having stored thereon program instructions that, when executed by a processor, implement a method for reconstructing a picture of an online conference as described above.
According to the above scheme, the participant objects that each candidate angle can acquire are determined, so as to obtain a collectable object set corresponding to each candidate angle; an angle score is calculated for each candidate angle using its corresponding collectable object set; a candidate angle whose angle score satisfies a preset condition is selected as the target angle, and an initial conference image captured by the image acquisition device at the target angle is acquired, so that higher-quality pictures of the participant objects are acquired at the selected target angle, facilitating their subsequent segmentation and reconstruction. Then, the image content of the participant objects contained in the initial conference image is segmented to obtain a plurality of object images, and picture reconstruction is performed on the plurality of object images to obtain a target conference image to be displayed. The reconstructed target conference image can display the image content of the participant objects more accurately and clearly, reduce the display of irrelevant content in the conference scene, and improve the online conference effect in large conference scenes.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and, together with the description, serve to explain the technical aspects of the application.
FIG. 1 is a schematic diagram of an implementation environment involved in a picture reconstruction method for an online conference according to an exemplary embodiment of the present application;
FIG. 2 is a flow chart of a picture reconstruction method for an online conference shown in an exemplary embodiment of the present application;
FIG. 3 is a schematic view of a conference scenario illustrated by an exemplary embodiment of the present application;
FIG. 4a is a schematic diagram of image acquisition at view 1 by an IPC as shown in an exemplary embodiment of the present application;
FIG. 4b is a schematic diagram of image acquisition at view 2 by an IPC as shown in an exemplary embodiment of the present application;
FIG. 5 is a schematic diagram of a picture reconstruction shown in an exemplary embodiment of the present application;
FIG. 6 is a schematic diagram illustrating a segmentation strategy determination according to an exemplary embodiment of the present application;
FIG. 7 is a block diagram of a picture reconstruction apparatus for an online conference shown in an exemplary embodiment of the present application;
FIG. 8 is a schematic diagram of an electronic device shown in an exemplary embodiment of the present application;
fig. 9 is a schematic structural view of a computer-readable storage medium shown in an exemplary embodiment of the present application.
Detailed Description
The following describes the embodiments of the present application in detail with reference to the drawings.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, interfaces, techniques, etc., in order to provide a thorough understanding of the present application.
The term "and/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship. Further, "a plurality" herein means two or more. The term "at least one" herein means any one of a plurality, or any combination of at least two of a plurality; for example, "including at least one of A, B and C" may mean including any one or more elements selected from the set consisting of A, B and C.
In the prior art, a video conference system generally adopts a fixed image acquisition angle and a fixed picture layout. A conference scene in a large conference may contain many participant objects; with a fixed image acquisition angle, it cannot be guaranteed that every participant object receives appropriate attention, especially when participants move frequently. In addition, a fixed picture layout limits the visibility of each participant object and affects the conference effect.
Therefore, the present application provides at least a picture reconstruction method for an online conference and an electronic device, so as to improve the applicability of a conference system to large conferences.
The following describes a picture reconstruction method for online conferences provided in the embodiments of the present application.
Referring to fig. 1, a schematic diagram of an implementation environment of an embodiment of the present application is shown. The implementation environment of the scheme may include an image acquisition device 110 and a server 120, where the image acquisition device 110 and the server 120 are communicatively connected to each other.
The number of image capture devices 110 may be one or more. The image capturing device 110 may be a device with an image capturing function, such as a video camera, a smart phone, a tablet computer, a notebook computer, or the like, but is not limited thereto.
The server 120 may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (Content Delivery Network, CDN), big data, and artificial intelligence platforms.
In one example, the server 120 may perform picture reconstruction processing on the initial conference image acquired by the image acquisition device 110 to obtain a target conference image to be displayed, and then store the target conference image locally, transmit it to other conference terminals, and so on.
In the picture reconstruction method for an online conference provided in the embodiments of the present application, the execution subject of each step may be the image capturing device 110, or the server 120, or the image capturing device 110 and the server 120 in interactive cooperation, that is, some steps of the method are executed by the image capturing device 110 and the others by the server 120.
It will be appreciated that the specific embodiments of the present application involve related data such as user information and user images; when the embodiments of the present application are applied to specific products or technologies, user permission or consent needs to be obtained, and the collection, use and processing of related data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
Referring to fig. 2, fig. 2 is a flowchart illustrating a picture reconstruction method for an online conference according to an exemplary embodiment of the present application. The method can be applied to the implementation environment shown in fig. 1 and is specifically performed by the server in that environment. It should be understood that the method may also be adapted to other exemplary implementation environments and be executed by devices in those environments; the present embodiment does not limit the implementation environments to which the method is adapted.
As shown in fig. 2, the picture reconstruction method for an online conference includes at least steps S210 to S250, described in detail as follows:
step S210: and determining the parameter objects which can be acquired by each candidate angle, and obtaining the acquired object set corresponding to each candidate angle.
It should be noted that, there are multiple conference objects in the conference scene, and one or more image acquisition devices are disposed in the conference scene, the image acquisition devices have multiple candidate angles corresponding to each other, and different conference scene images can be acquired by different candidate angles.
The reference object may be a person or an object, which is not limited in this application.
Referring to fig. 3, fig. 3 is a schematic view of a conference scenario shown in an exemplary embodiment of the present application, where, as shown in fig. 3, the conference scenario includes a plurality of participants (a 1 to a 6), and a network camera (Internet Protocol Camera, IPC) is disposed, and the IPC is configured with a pan-tilt, through which the IPC can be rotated, so as to implement image acquisition at a plurality of angles, and each angle is a candidate angle of the IPC.
Because conference scene pictures acquired by different candidate angles are different, the conference objects which can be acquired by different candidate angles are also different, and further, the different candidate angles correspond to different sets of the acquirable objects.
With the image capturing device as IPC, the candidate angles include view 1 and view 2 for illustration:
For example, referring to fig. 4a, fig. 4a is a schematic diagram illustrating image acquisition by the IPC at view angle 1 according to an exemplary embodiment of the present application. As shown in fig. 4a, the IPC can acquire images of participant objects a1, a2 and a6 at view angle 1; that is, the collectable object set corresponding to view angle 1 contains participant objects a1, a2 and a6.
For example, referring to fig. 4b, fig. 4b is a schematic diagram illustrating image acquisition by the IPC at view angle 2 according to an exemplary embodiment of the present application. As shown in fig. 4b, the IPC can acquire images of participant objects a2, a3, a4, a5 and a6 at view angle 2; that is, the collectable object set corresponding to view angle 2 contains participant objects a2, a3, a4, a5 and a6.
It will be appreciated that view angle 1 and view angle 2 above are merely exemplary; a practical application scenario may include more candidate angles, which is not limited in this application.
For example, the collectable object set corresponding to each candidate angle may be determined according to the relationship between the positions of the participant objects and the coverage area of the candidate angle: acquire the object position of each participant object in the conference scene and the acquisition area of the candidate angle, and take each participant object whose position lies within the acquisition area of the candidate angle as a participant object that the candidate angle can acquire.
For another example, the image acquisition device may capture an image at each candidate angle to obtain the current conference scene image at that angle, and then perform participant object recognition on the conference scene image to obtain the collectable object set corresponding to each candidate angle.
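The position-based determination above can be sketched as follows. This is a minimal illustration, assuming each candidate angle is modelled as a horizontal field-of-view sector around a pan direction and each participant has a known 2D position in the room plane; the function name, positions and angles are hypothetical, not taken from the patent.

```python
import math

def collectable_set(positions, pan_deg, fov_deg, cam=(0.0, 0.0)):
    """Return the IDs of participant objects whose position falls inside the
    horizontal field-of-view sector of one candidate angle.

    positions: dict mapping participant ID -> (x, y) position in the room plane
    pan_deg:   pan direction of the candidate angle, in degrees
    fov_deg:   horizontal field of view of the camera, in degrees
    cam:       camera position in the same plane
    """
    collected = set()
    for obj_id, (x, y) in positions.items():
        bearing = math.degrees(math.atan2(y - cam[1], x - cam[0]))
        # Smallest signed difference between the bearing and the pan direction.
        diff = (bearing - pan_deg + 180.0) % 360.0 - 180.0
        if abs(diff) <= fov_deg / 2.0:
            collected.add(obj_id)
    return collected

# Six participants a1..a6 at illustrative positions (not the layout of Fig. 3).
positions = {"a1": (1, 2), "a2": (2, 1), "a3": (3, 0.5),
             "a4": (3, -0.5), "a5": (2, -1), "a6": (1, -2)}
view1 = collectable_set(positions, pan_deg=45, fov_deg=90)   # one candidate angle
view2 = collectable_set(positions, pan_deg=-20, fov_deg=90)  # another candidate angle
```

Each candidate angle thus yields its own collectable object set, which feeds directly into the angle-score calculation of step S220.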
Step S220: and calculating the angle score of each candidate angle by using the collectable object set corresponding to each candidate angle.
And respectively calculating the angle score of each candidate angle according to the collectable object set corresponding to each candidate angle.
The angle scores of the candidate angles are used for representing the execution effect of the online conference when conference image acquisition is carried out under the candidate angles. The higher the angle score of the candidate angle is, the better the execution effect of the online meeting corresponding to the candidate angle is; the lower the angle score of the candidate angle is, the poorer the execution effect of the online conference corresponding to the candidate angle is.
For example, the angle score for each candidate angle may be obtained by counting the number of reference objects in the set of collectable objects, the number of reference objects being proportional to the angle score.
For another example, the angle score of each candidate angle may be obtained by analyzing the quality of the participant in the collectable object set, such as the activity level of the participant, the importance level of playing the role, and the like, where the activity level of the participant and the importance level of playing the role are proportional to the angle score.
Step S230: and selecting a candidate angle with the angle score meeting a preset condition as a target angle, and acquiring an initial conference image acquired by the image acquisition device under the target angle.
If the angle score meets the preset condition, the online conference effect corresponding to the conference scene image acquisition by adopting the candidate angle is better, the candidate angle is taken as the target angle, the image acquisition device acquires the conference scene images under the target angle, and the conference scene images are taken as the initial conference images.
Wherein, the angle score meeting the preset condition may be the highest angle score; or the angle score is larger than a preset angle threshold value and the highest; the angle score may be an angle score having a duration with the highest angle score in a preset time period greater than a preset time threshold, which is not limited in the present application.
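The first two preset conditions above (highest score, and highest score above a threshold) can be sketched in a few lines. This is an illustrative helper under assumed names; the duration-based condition would additionally require a history of per-angle scores over the preset time period.

```python
def select_target_angle(angle_scores, score_threshold=None):
    """Pick the candidate angle whose angle score satisfies the preset condition.

    angle_scores:    dict mapping candidate angle -> angle score
    score_threshold: if given, the best angle must also exceed this value
                     (the second preset condition in the text); otherwise the
                     highest-scoring angle wins unconditionally.
    Returns the target angle, or None if no angle satisfies the condition.
    """
    best = max(angle_scores, key=angle_scores.get)
    if score_threshold is not None and angle_scores[best] <= score_threshold:
        return None
    return best
```

Returning None when no angle clears the threshold lets the caller fall back to, for instance, keeping the current angle unchanged.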
Step S240: the image content of the reference object contained in the initial conference image is divided to obtain a plurality of object images.
It should be noted that one object image may contain one or more participant objects.
For example, when segmenting the image content of a participant object, the image area of the participant object together with its adjacent background area may be cropped to obtain an object image; such an object image contains both the image content of the participant object and background content.
For example, when segmenting the image content of a participant object, it is also possible to extract only the image region of the participant object (in this case, other objects having an interactive relationship with the participant object, such as a hand-held object or a notebook computer being looked at, should also be treated as belonging to the participant object's image region); that is, the participant object is separated from the background, and the resulting object image contains only the image content of the participant object.
Optionally, the segmentation mode for the image content of the participant objects can be flexibly selected according to the actual situation.
For example, the segmentation mode is determined according to the complexity of the background.
For example, the complexity of the background is calculated; if it is greater than a complexity threshold, only the image area of the participant object is cropped, ensuring that a highly complex background does not draw attention away and that the user's attention stays on the participant object; if the complexity is not greater than the threshold, the image area of the participant object and its adjacent background area may be cropped, reducing the computational resource requirement.
The complexity of the background can be determined by analyzing its texture, brightness, colour, frequency of change per unit time, and the like.
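One simple way to turn the complexity-based choice into code is sketched below. The complexity measure here (intensity variance plus a crude edge density) is only a stand-in for the texture/brightness/colour analysis described above, and the threshold value is an arbitrary assumption.

```python
def background_complexity(gray):
    """Rough complexity measure for a grayscale background patch.

    gray: 2-D list of pixel intensities in [0, 255].
    Combines intensity variance with a simple horizontal-gradient edge
    density; both stand in for the texture/brightness/colour analysis
    described in the text. Returns a value in [0, 1].
    """
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    edges = sum(abs(row[i + 1] - row[i]) > 32
                for row in gray for i in range(len(row) - 1))
    pairs = sum(len(row) - 1 for row in gray)
    edge_density = edges / pairs if pairs else 0.0
    # Normalise the variance against its maximum possible value (127.5 ** 2).
    return 0.5 * min(variance / 127.5 ** 2, 1.0) + 0.5 * edge_density

def choose_crop_mode(gray, threshold=0.25):
    """Tight crop on a busy background, looser crop on a plain one."""
    if background_complexity(gray) > threshold:
        return "object_only"           # crop only the participant object
    return "object_with_background"    # keep the adjacent background area
```

In practice a library routine such as an edge detector or a variance-of-Laplacian measure would replace the hand-rolled statistics here.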
For another example, the segmentation mode for the participant objects is determined according to the degree of association between the image content of the background and the conference subject.
For example, the degree of association between the background and the conference subject is calculated; if it is smaller than an association threshold, only the image area of the participant object is cropped, keeping the display focused on the conference subject; if the association degree is not smaller than the threshold, the image area of the participant object and its adjacent background area may be cropped, so that more conference content is retained while still reducing the computational resource requirement.
The background can be recognized, for example by text recognition or action recognition, and the degree of association between the recognition result and the conference theme calculated.
Step S250: and carrying out picture reconstruction on the plurality of object images to obtain a target conference image to be displayed.
Illustratively, a plurality of object images may be stitched together to obtain a target conference image to be displayed; and filling a plurality of object images into the view layout template according to the preset view layout template to obtain a target conference image to be displayed, which is not limited in the application.
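The template-filling variant can be sketched as a simple matching of object images to playing areas, following the display-score/highlighting-level matching described in the summary above. The identifiers and scores below are hypothetical.

```python
def assign_play_areas(object_scores, area_levels):
    """Match object images to the playing areas of a view layout template.

    object_scores: dict mapping object-image ID -> display score
    area_levels:   dict mapping playing-area ID -> highlighting level
                   (higher level = more prominent area)
    Higher-scoring images are placed in more highlighted areas; surplus
    images are left unassigned.
    """
    images = sorted(object_scores, key=object_scores.get, reverse=True)
    areas = sorted(area_levels, key=area_levels.get, reverse=True)
    return dict(zip(images, areas))

layout = assign_play_areas({"img_a": 0.9, "img_b": 0.2, "img_c": 0.5},
                           {"main": 3, "side1": 2, "side2": 1})
```

The returned mapping tells the renderer which region of the template each object image should be drawn into.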
Optionally, when performing picture reconstruction on the plurality of object images, the reconstruction mode of each object image can be determined according to the position of each participant object in the initial conference image, ensuring that the relative positions and sizes of the participant objects conform to the visual logic of the real scene and improving the conference picture display effect.
Referring to fig. 5, fig. 5 is a schematic diagram of picture reconstruction according to an exemplary embodiment of the present application. As shown in fig. 5, the initial conference image captured by the image acquisition device at the target angle is acquired, and the image content of the participant objects in the initial conference image is segmented to obtain a plurality of object images. Each object image is then adjusted, for example by zooming, rotation or flipping; colour correction, brightness adjustment and contrast adjustment may further be applied to obtain adjusted object images. Picture reconstruction is then performed on the adjusted object images to obtain the target conference image.
Picture reconstruction of the plurality of object images enables the image content of the participant objects to be displayed more accurately and clearly in the target conference image, reduces the display of irrelevant content in the conference scene, and improves the online conference effect in large conference scenes.
Some embodiments of the present application are described in detail below.
In some embodiments, calculating the angle score of each candidate angle in step S220 using its corresponding collectable object set includes: counting the number of participant objects in the collectable object set corresponding to each candidate angle; and calculating the angle score of each candidate angle based on the number of objects corresponding to that candidate angle.
The larger the number of participant objects in a collectable object set, the higher the angle score of the corresponding candidate angle; the smaller the number, the lower the score.
Specifically, after the number of participant objects in each collectable object set is obtained, the counted numbers are normalized, and the normalized results are used as the angle scores of the candidate angles.
For example, the angle score can be calculated using Equation 1:
Equation 1: S_i = N_i / Σ_j N_j
where S_i denotes the angle score of candidate angle i; N_i denotes the number of participant objects in the collectable object set corresponding to candidate angle i; and Σ_j N_j denotes the sum of the numbers of objects over all candidate angles.
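The normalization of Equation 1 can be computed directly; the sketch below scores the two view angles of the Figs. 4a/4b example ({a1, a2, a6} and {a2, a3, a4, a5, a6}), with the function name assumed for illustration.

```python
def angle_scores(collectable_sets):
    """Equation 1: score each candidate angle by its share of the total
    number of collectable participant objects.

    collectable_sets: dict mapping candidate angle -> set of participant IDs
    """
    counts = {angle: len(objs) for angle, objs in collectable_sets.items()}
    total = sum(counts.values())
    if total == 0:
        return {angle: 0.0 for angle in counts}
    return {angle: n / total for angle, n in counts.items()}

scores = angle_scores({"view1": {"a1", "a2", "a6"},
                       "view2": {"a2", "a3", "a4", "a5", "a6"}})
# view1 covers 3 of 8 counted objects, view2 covers 5 of 8.
```

Note that a participant visible from both angles (such as a2 and a6 here) is counted once per angle, so the scores sum to 1 over the per-angle counts rather than over distinct participants.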
Illustratively, each participant object corresponds to a priority score. Calculating the angle score of each candidate angle based on the number of objects corresponding to each candidate angle includes: obtaining an object number score for each candidate angle based on the number of objects corresponding to that candidate angle; calculating an object quality score for each candidate angle based on the priority scores of the participant objects in the collectable object set corresponding to that candidate angle; and combining the object quality score and the object number score to calculate the angle score corresponding to each candidate angle.
The calculation of the object number score can follow the above embodiment and Equation 1.
The priority score of a participant object characterizes its display importance: the higher the priority score of a participant object, the more important it is to display that participant object; the lower the priority score, the less important.
Illustratively, the step of calculating the priority score of a participant object includes:
Conference role recognition is performed on the participant object, where conference roles include a host, a speaker, and the like, and a first score of the participant object is determined according to the recognition result. Scoring parameters can be preset for different conference roles, and the first score of the participant object is obtained by querying the score corresponding to the recognized conference role; alternatively, scene classification can be performed on the conference scene to obtain the current conference type and/or conference progress of the conference scene, and the first score of the participant object is obtained by querying the score corresponding to the conference role under that conference type and/or conference progress, which is not limited in this application.
Liveness detection is performed on the participant object, and a second score of the participant object is determined according to the detection result. The liveness of a participant object can be determined by counting, within a preset time period, its speaking frequency, body language, and gaze focus.
Then, the priority score of the participant object is obtained from its first score and/or second score. For example, the first score or the second score can be taken directly as the priority score; alternatively, the first score and the second score can be weighted to obtain the priority score. The weighting parameters can be preset or determined according to the conference type and/or conference progress: if the conference has just started, more weight is allocated to the first score; if the conference is in mid-conference discussion, more weight is allocated to the second score.
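The weighting described above can be sketched as follows; the concrete weight values and the progress-based switch are illustrative assumptions, since the patent leaves them open:

```python
def priority_score(role_score, activity_score, progress="start"):
    """Combine a role-based first score and a liveness-based second score.

    The weights are hypothetical: at conference start the role score
    dominates; during mid-conference discussion the activity score dominates.
    """
    if progress == "start":
        w_role, w_activity = 0.7, 0.3
    else:  # e.g. mid-conference discussion
        w_role, w_activity = 0.3, 0.7
    return w_role * role_score + w_activity * activity_score
```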
Then, the object quality score corresponding to a candidate angle is calculated from the priority scores of the participant objects in its collectable object set. For example, the priority scores of the individual participant objects in the collectable object set may be summed to obtain the object quality score; the priority scores of all the participant objects in the set may be averaged to obtain the object quality score; or the highest priority score in the set may be taken as the object quality score, which is not limited in this application.
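The three aggregation options above (sum, mean, or maximum of the set's priority scores; function name illustrative) amount to:

```python
def object_quality_score(priority_scores, mode="mean"):
    """Aggregate the priority scores of one collectable object set."""
    if not priority_scores:
        return 0.0  # empty set: no participant object visible at this angle
    if mode == "sum":
        return sum(priority_scores)
    if mode == "mean":
        return sum(priority_scores) / len(priority_scores)
    if mode == "max":
        return max(priority_scores)
    raise ValueError(f"unknown aggregation mode: {mode}")
```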
After the object quality score and the object number score of each candidate angle are obtained, the two are combined to calculate the angle score of that candidate angle. For example, the higher (or lower) of the object quality score and the object number score may be selected as the angle score; alternatively, the object quality score and the object number score may be weighted to obtain the angle score. The weighting parameters can be preset or determined according to the conference type and/or conference progress: if the conference is a single-speaker lecture, more weight is allocated to the object quality score; if it is a group discussion, more weight is allocated to the object number score.
For example, the angle score can also be calculated with reference to Equation 2:

Equation 2: $S_i = w_c \cdot C_i + w_q \cdot Q_i$

where $S_i$ represents the angle score of candidate angle $i$; $C_i$ represents the object number score corresponding to candidate angle $i$; $w_c$ represents the weight of the object number score; $Q_i$ represents the object quality score corresponding to candidate angle $i$; and $w_q$ represents the weight of the object quality score.
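Equation 2's weighted combination, with the conference-type-dependent weights sketched as hypothetical values (the patent does not fix them):

```python
def angle_score(count_score, quality_score, conf_type="lecture"):
    """Weighted combination of the object number score and the object
    quality score (Equation 2). The weights are illustrative: a single
    lecture favors quality, a group discussion favors the head count."""
    if conf_type == "lecture":
        w_count, w_quality = 0.3, 0.7
    else:  # group discussion
        w_count, w_quality = 0.7, 0.3
    return w_count * count_score + w_quality * quality_score
```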
In some embodiments, selecting, in step S230, a candidate angle whose angle score satisfies a preset condition as the target angle includes: counting, within a preset time period, the cumulative duration and/or the continuous duration for which each candidate angle has the highest angle score; calculating a selection recommendation index of each candidate angle based on the cumulative duration and/or the continuous duration; and selecting the candidate angle with the highest selection recommendation index as the target angle.
The longer the cumulative duration and/or the continuous duration, the higher the selection recommendation index; the shorter the cumulative duration and/or the continuous duration, the lower the selection recommendation index.
Specifically, the angle score of each candidate angle is calculated in real time, and the candidate angle with the highest angle score at each instant is determined. Then, the cumulative duration and/or the continuous duration for which each candidate angle has the highest angle score within the preset time period is counted, and the selection recommendation index of each candidate angle is calculated from the cumulative duration and/or the continuous duration.
For example, suppose the candidate angles include view angle 1 and view angle 2 and the preset time period is 1 minute. If, within that minute, the cumulative duration of view angle 1 is 10 seconds and that of view angle 2 is 50 seconds, the selection recommendation index of view angle 1 is calculated as 10/60, i.e. about 0.17, and that of view angle 2 as 50/60, i.e. about 0.83.
As another example, with the same candidate angles and time period, if the continuous duration of view angle 1 is 6 seconds and that of view angle 2 is 30 seconds, the selection recommendation index of view angle 1 is calculated as 6/60, i.e. 0.1, and that of view angle 2 as 30/60, i.e. 0.5.
Of course, the selection recommendation index of each candidate angle may also be calculated by combining the cumulative duration and the continuous duration, which is not limited in this application.
After the selection recommendation index of each candidate angle is calculated, the candidate angle with the highest selection recommendation index is selected as the target angle.
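The cumulative-duration variant of the recommendation index can be sketched as follows (names illustrative; the input is assumed to be a list of per-instant winners with their durations, matching the 1-minute examples above):

```python
def selection_recommendation_index(best_angle_per_instant, window_seconds):
    """Per candidate angle, the fraction of the time window during which it
    had the highest angle score (cumulative-duration variant).

    best_angle_per_instant: list of (angle_id, duration_seconds) samples
    covering the preset time period of length window_seconds.
    """
    cumulative = {}
    for angle, seconds in best_angle_per_instant:
        cumulative[angle] = cumulative.get(angle, 0) + seconds
    return {angle: secs / window_seconds for angle, secs in cumulative.items()}

# Matches the example: within 60 s, angle 1 leads for 10 s, angle 2 for 50 s.
idx = selection_recommendation_index([("angle_1", 10), ("angle_2", 50)], 60)
# angle_1 -> 10/60, angle_2 -> 50/60; angle_2 becomes the target angle.
```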
Through this embodiment, not only the angle score of a candidate angle but also its average performance over time is considered. When the conference scene changes markedly (for example, when participant objects move frequently), this avoids the frequent target-angle switching that would result from relying on the angle score at a single instant, improving the accuracy of target-angle selection and the quality of the captured conference images.
In some embodiments, segmenting the image content of the participant objects contained in the initial conference image in step S240 to obtain a plurality of object images includes: acquiring the object position of each participant object in the initial conference image; clustering the participant objects in the initial conference image based on the object position of each participant object to obtain object clustering results; and segmenting the image content of each object clustering result in the initial conference image to obtain a plurality of object images.
Specifically, according to the object position of each participant object, participant objects whose pairwise distances are smaller than a preset distance threshold are determined and clustered together to obtain an object clustering result.
That is, participant objects that are relatively close to each other are treated as a whole (i.e., one object clustering result), and during image segmentation the image content of each object clustering result is segmented, obtaining a plurality of object images.
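Grouping participant objects whose pairwise distance is below the threshold is essentially single-link clustering; a minimal sketch assuming 2-D object positions (a union-find over all close pairs; names illustrative):

```python
def cluster_by_distance(positions, threshold):
    """Single-link clustering: participant objects closer than `threshold`
    end up in the same cluster. `positions` maps object id -> (x, y)."""
    parent = {obj: obj for obj in positions}

    def find(o):
        while parent[o] != o:
            parent[o] = parent[parent[o]]  # path compression
            o = parent[o]
        return o

    ids = list(positions)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            (xa, ya), (xb, yb) = positions[a], positions[b]
            if ((xa - xb) ** 2 + (ya - yb) ** 2) ** 0.5 < threshold:
                parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for obj in ids:
        clusters.setdefault(find(obj), []).append(obj)
    return list(clusters.values())
```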
Besides clustering the participant objects in the initial conference image by object position, they can also be clustered according to other parameters; for example, participant objects that have an association relationship with each other can be grouped into a whole according to those association relationships.
In some embodiments, the segmentation strategy for the participant objects may be determined on a case-by-case basis. For example, referring to fig. 6, which is a schematic diagram illustrating the determination of a segmentation strategy according to an exemplary embodiment of this application, the number of participant objects in the initial conference image is counted and compared with a preset number. If the number of objects is greater than the preset number, a cluster segmentation strategy is selected to segment the participant objects in the initial conference image; if the number of objects is not greater than the preset number, a single segmentation strategy is selected.
The cluster segmentation strategy is implemented as described in the above embodiments. In fig. 6, cluster segmentation of the participant objects in the initial conference image yields object images P-a1 and P-a2, where the two participant objects in P-a1 belong to one object clustering result.
The single segmentation strategy segments each participant object individually, i.e., each object image contains only one participant object. In fig. 6, single segmentation of the participant objects in the initial conference image yields the object images P-b1, P-b2, and P-b3, each containing only one participant object.
In some embodiments, each participant object corresponds to a priority score. In step S250, performing picture reconstruction on the plurality of object images to obtain a target conference image to be displayed includes: calculating a display score of each object image based on the priority scores of the participant objects it contains; obtaining a view layout template of the conference scene, where the view layout template includes a plurality of image play areas and each image play area corresponds to a highlighting level; determining the image play area matched with each object image based on the display score of each object image and the highlighting level of each image play area; and performing picture reconstruction on the plurality of object images according to the matched image play areas to obtain the target conference image to be displayed.
Illustratively, obtaining a view layout template of the conference scene includes: performing scene classification on the conference scene to obtain its conference type; and querying the view layout template matched with that conference type.
Besides being determined by conference type, the view layout template may also be determined according to the conference progress, the number of participant objects, and the like, which is not limited in this application.
For example, the view layout template may contain a grid layout with a plurality of image play grids, each grid playing one object image; or it may contain a multi-layer layout with a plurality of image layers, each layer displaying one object image. The view layout template can be set flexibly according to the actual situation, and this application is not limited thereto.
Illustratively, the view layout template contains a plurality of image play areas, each corresponding to a highlighting level that characterizes how prominently the corresponding area is displayed. The larger an image play area and the closer it is to the center of the view, the higher its highlighting level; conversely, the smaller the area and the farther from the center, the lower its highlighting level.
Further, the display score of each object image is calculated from the priority scores of the participant objects it contains: the higher those priority scores, the higher the display score of the object image. For example, the priority scores of the participant objects in an object image may be averaged, and the average taken as the display score of that object image.
Then, the image play area matched with each object image is determined from the display score of each object image and the highlighting level of each image play area. For example, the object images are sorted in descending order of display score and the image play areas in descending order of highlighting level, and the two sequences are matched position by position: the first-ranked object image is matched with the first-ranked image play area, the second with the second, and so on. Picture reconstruction is then performed on the plurality of object images according to their matched image play areas to obtain the target conference image to be displayed.
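The descending-sort matching just described can be sketched as follows (names illustrative; equal-length rankings are assumed for simplicity):

```python
def match_images_to_areas(display_scores, highlight_levels):
    """Pair each object image with an image play area: the highest display
    score gets the most highlighted area, and so on down both rankings.

    display_scores: {image_id: score}; highlight_levels: {area_id: level}.
    Returns {image_id: area_id} for as many pairs as both sides allow.
    """
    images = sorted(display_scores, key=display_scores.get, reverse=True)
    areas = sorted(highlight_levels, key=highlight_levels.get, reverse=True)
    return dict(zip(images, areas))
```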
In some embodiments, an interactive page may further be provided, offering the user a control entry for picture reconstruction. For example, the interactive page may provide a participant-object segmentation adjustment function and receive user operations to adjust the segmented object images, such as re-segmenting multiple participant objects within one object image, merging several object images into one, or manually marking or adjusting a segmentation boundary.
According to the picture reconstruction method for an online conference described above, the participant objects collectable from each candidate angle are determined to obtain the collectable object set corresponding to each candidate angle; the angle score of each candidate angle is calculated using that set; a candidate angle whose angle score satisfies a preset condition is selected as the target angle, and the initial conference image captured by the image acquisition device at the target angle is obtained, so that higher-quality pictures of the participant objects can be captured at the selected target angle, facilitating subsequent segmentation and reconstruction. Then, the image content of the participant objects contained in the initial conference image is segmented to obtain a plurality of object images, and picture reconstruction is performed on these object images to obtain the target conference image to be displayed. The reconstructed target conference image can display the image content of the participant objects more accurately and clearly, reduce the display of irrelevant content in the conference scene, and improve the online conferencing effect in large conference scenes.
Fig. 7 is a block diagram of a picture reconstruction apparatus for an online conference shown in an exemplary embodiment of this application. As shown in fig. 7, the exemplary picture reconstruction apparatus 700 for an online conference includes: a collectable object determination module 710, a score calculation module 720, an angle selection module 730, an image segmentation module 740, and a picture reconstruction module 750. Specifically:
The collectable object determination module 710 is configured to determine the participant objects collectable from each candidate angle, obtaining the collectable object set corresponding to each candidate angle;
a score calculating module 720, configured to calculate an angle score of each candidate angle by using the collectable object set corresponding to each candidate angle;
the angle selecting module 730 is configured to select a candidate angle whose angle score meets a preset condition as a target angle, and obtain an initial conference image acquired by the image acquisition device under the target angle;
the image segmentation module 740 is configured to segment the image content of the participant objects contained in the initial conference image to obtain a plurality of object images;
the picture reconstruction module 750 is configured to perform picture reconstruction on the plurality of object images to obtain a target conference image to be displayed.
It should be noted that the picture reconstruction apparatus for an online conference provided by the above embodiment and the picture reconstruction method for an online conference provided by the foregoing embodiments belong to the same concept; the specific manner in which each module and unit performs its operations has been described in detail in the method embodiments and is not repeated here. In practical applications, the apparatus provided by the above embodiment may allocate its functions to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above, which is not limited here.
Referring to fig. 8, fig. 8 is a schematic structural diagram of an embodiment of an electronic device of this application. The electronic device 800 includes a memory 801 and a processor 802, the processor 802 being configured to execute program instructions stored in the memory 801 to implement the steps in any of the above embodiments of the picture reconstruction method for an online conference. In one specific implementation scenario, the electronic device 800 may include, but is not limited to, mobile devices such as a notebook computer and a tablet computer, which is not limited here.
In particular, the processor 802 is used to control itself and the memory 801 to implement the steps in the picture reconstruction method embodiment of any of the online conferences described above. The processor 802 may also be referred to as a central processing unit (Central Processing Unit, CPU). The processor 802 may be an integrated circuit chip with signal processing capabilities. The processor 802 may also be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a Field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. In addition, the processor 802 may be commonly implemented by an integrated circuit chip.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an embodiment of a computer readable storage medium of the present application. The computer readable storage medium 900 stores program instructions 910 executable by a processor, the program instructions 910 for implementing the steps in the picture reconstruction method embodiment of any of the online conferences described above.
In some embodiments, functions or modules included in an apparatus provided by the embodiments of the present disclosure may be used to perform a method described in the foregoing method embodiments, and specific implementations thereof may refer to descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
The foregoing description of the embodiments emphasizes the differences between them; for parts that are the same or similar, the embodiments may be referred to one another, and the details are not repeated here for brevity.
In the several embodiments provided in the present application, it should be understood that the disclosed methods and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical, or other forms.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units. The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in part or all or part of the technical solution contributing to the prior art or in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor (processor) to perform all or part of the steps of the methods of the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.

Claims (10)

1. A picture reconstruction method for an online conference, wherein an image acquisition device is deployed in a conference scene, the image acquisition device corresponds to a plurality of candidate angles, and a plurality of participant objects exist in the conference scene, the method comprising:
determining the participant objects collectable from each candidate angle, obtaining a collectable object set corresponding to each candidate angle;
calculating an angle score of each candidate angle by using the collectable object set corresponding to each candidate angle;
selecting a candidate angle with the angle score meeting a preset condition as a target angle, and acquiring an initial conference image acquired by the image acquisition device under the target angle;
dividing the image content of the participant object contained in the initial conference image to obtain a plurality of object images;
and carrying out picture reconstruction on the plurality of object images to obtain a target conference image to be displayed.
2. The method of claim 1, wherein the determining the participant objects collectable from each candidate angle comprises:
acquiring an object position of each participant object in the conference scene, and acquiring an acquisition area of the candidate angle;
and taking a participant object whose object position is within the acquisition area of the candidate angle as a participant object collectable from the candidate angle.
3. The method of claim 1, wherein calculating an angle score for each candidate angle using the set of collectable objects for each candidate angle comprises:
counting the number of participant objects in the collectable object set corresponding to each candidate angle;
and calculating the angle score of each candidate angle based on the number of objects corresponding to each candidate angle.
4. The method according to claim 3, wherein each participant object corresponds to a priority score; and the calculating the angle score of each candidate angle based on the number of objects corresponding to each candidate angle comprises:
obtaining an object number score for each candidate angle based on the number of objects corresponding to each candidate angle; and calculating an object quality score for each candidate angle based on the priority scores of the participant objects in the collectable object set corresponding to each candidate angle;
And combining the object quality scores and the object quantity scores, and calculating to obtain angle scores corresponding to each candidate angle.
5. The method according to claim 1, wherein selecting the candidate angle whose angle score satisfies the preset condition as the target angle includes:
and selecting the candidate angle with the highest angle score as the target angle.
6. The method according to claim 1, wherein selecting the candidate angle whose angle score satisfies the preset condition as the target angle includes:
counting, within a preset time period, the cumulative duration and/or the continuous duration for which each candidate angle has the highest angle score;
calculating a selection recommendation index of each candidate angle based on the cumulative duration and/or the continuous duration;
and selecting the candidate angle with the highest selection recommendation index as the target angle.
7. The method of claim 1, wherein the segmenting the image content of the participant object contained in the initial conference image to obtain a plurality of object images comprises:
acquiring an object position of each participant in the initial conference image;
clustering the participant objects in the initial conference image based on the object position of each participant object to obtain an object clustering result;
And dividing the image content of each object clustering result in the initial conference image to obtain a plurality of object images.
8. The method of claim 1, wherein each participant corresponds to a priority score; the step of performing image reconstruction on the plurality of object images to obtain a target conference image to be displayed includes:
calculating a display score of each object image based on the priority scores of the participant objects in each object image; and obtaining a view layout template of the conference scene, the view layout template comprising a plurality of image play areas, each image play area corresponding to a highlighting level;
respectively confirming the image playing areas matched with each object image based on the display scores of each object image and the highlighting grades of each image playing area;
and carrying out picture reconstruction on the plurality of object images according to the image playing areas matched with each object image to obtain a target conference image to be displayed.
9. The method of claim 8, wherein the obtaining a view layout template of the conference scene comprises:
Performing scene classification on the conference scene to obtain the conference type of the conference scene;
querying a view layout template matched with the conference type.
10. An electronic device comprising a memory and a processor for executing program instructions stored in the memory to implement the steps of the method according to any of claims 1-9.
CN202410100077.XA 2024-01-24 2024-01-24 Picture reconstruction method for online conference and electronic equipment Active CN117640877B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410100077.XA CN117640877B (en) 2024-01-24 2024-01-24 Picture reconstruction method for online conference and electronic equipment


Publications (2)

Publication Number Publication Date
CN117640877A true CN117640877A (en) 2024-03-01
CN117640877B CN117640877B (en) 2024-03-29

Family

ID=90032362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410100077.XA Active CN117640877B (en) 2024-01-24 2024-01-24 Picture reconstruction method for online conference and electronic equipment

Country Status (1)

Country Link
CN (1) CN117640877B (en)

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090207233A1 (en) * 2008-02-14 2009-08-20 Mauchly J William Method and system for videoconference configuration
CN101610421A (en) * 2008-06-17 2009-12-23 深圳华为通信技术有限公司 Video communication method, Apparatus and system
EP2506568A1 (en) * 2011-03-30 2012-10-03 Alcatel Lucent A method, a system, a device, a computer program and a computer program product for conducting a video conference
CN103795961A (en) * 2012-10-30 2014-05-14 三亚中兴软件有限责任公司 Video conference telepresence system and image processing method thereof
CN104380720A (en) * 2013-04-27 2015-02-25 华为技术有限公司 Video conference processing method and device
CN108683874A (en) * 2018-05-16 2018-10-19 福州瑞芯微电子股份有限公司 The method and a kind of storage device of a kind of video conference attention focusing
CN110505399A (en) * 2019-08-13 2019-11-26 聚好看科技股份有限公司 Control method, device and the acquisition terminal of Image Acquisition
CN112312042A (en) * 2020-10-30 2021-02-02 维沃移动通信有限公司 Display control method, display control device, electronic equipment and storage medium
CN112672095A (en) * 2020-12-25 2021-04-16 联通在线信息科技有限公司 Teleconferencing system
CN113313818A (en) * 2021-06-07 2021-08-27 聚好看科技股份有限公司 Three-dimensional reconstruction method, device and system
CN114531564A (en) * 2022-03-01 2022-05-24 联想(北京)有限公司 Processing method and electronic equipment
US20220335620A1 (en) * 2021-04-14 2022-10-20 Logitech Europe S.A. Image enhancement system
CN115424156A (en) * 2021-05-13 2022-12-02 海信集团控股股份有限公司 Virtual video conference method and related device
WO2022262134A1 (en) * 2021-09-07 2022-12-22 深圳壹秘科技有限公司 Image display method, apparatus and device, and storage medium
CN115842960A (en) * 2022-09-29 2023-03-24 浙江宇视科技有限公司 Method for adjusting angle of device lens, terminal device and storage medium
WO2023093092A1 (en) * 2021-11-26 2023-06-01 华为技术有限公司 Minuting method, and terminal device and minuting system
CN116208733A (en) * 2021-11-30 2023-06-02 腾讯科技(深圳)有限公司 Video conference interaction method and device
CN117041608A (en) * 2023-04-24 2023-11-10 杭州数鲲科技有限公司 Data processing method and storage medium for linking on-line exhibition and off-line exhibition

Patent Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090207233A1 (en) * 2008-02-14 2009-08-20 Mauchly J William Method and system for videoconference configuration
CN101610421A (en) * 2008-06-17 2009-12-23 深圳华为通信技术有限公司 Video communication method, apparatus and system
EP2506568A1 (en) * 2011-03-30 2012-10-03 Alcatel Lucent A method, a system, a device, a computer program and a computer program product for conducting a video conference
CN103795961A (en) * 2012-10-30 2014-05-14 三亚中兴软件有限责任公司 Video conference telepresence system and image processing method thereof
CN104380720A (en) * 2013-04-27 2015-02-25 华为技术有限公司 Video conference processing method and device
CN108683874A (en) * 2018-05-16 2018-10-19 福州瑞芯微电子股份有限公司 Method and storage device for video conference attention focusing
CN110505399A (en) * 2019-08-13 2019-11-26 聚好看科技股份有限公司 Image acquisition control method, device, and acquisition terminal
CN112312042A (en) * 2020-10-30 2021-02-02 维沃移动通信有限公司 Display control method, display control device, electronic equipment and storage medium
CN112672095A (en) * 2020-12-25 2021-04-16 联通在线信息科技有限公司 Teleconferencing system
US20220335620A1 (en) * 2021-04-14 2022-10-20 Logitech Europe S.A. Image enhancement system
CN115424156A (en) * 2021-05-13 2022-12-02 海信集团控股股份有限公司 Virtual video conference method and related device
CN113313818A (en) * 2021-06-07 2021-08-27 聚好看科技股份有限公司 Three-dimensional reconstruction method, device and system
WO2022262134A1 (en) * 2021-09-07 2022-12-22 深圳壹秘科技有限公司 Image display method, apparatus and device, and storage medium
WO2023093092A1 (en) * 2021-11-26 2023-06-01 华为技术有限公司 Minuting method, and terminal device and minuting system
CN116208733A (en) * 2021-11-30 2023-06-02 腾讯科技(深圳)有限公司 Video conference interaction method and device
CN114531564A (en) * 2022-03-01 2022-05-24 联想(北京)有限公司 Processing method and electronic equipment
CN115842960A (en) * 2022-09-29 2023-03-24 浙江宇视科技有限公司 Method for adjusting angle of device lens, terminal device and storage medium
CN117041608A (en) * 2023-04-24 2023-11-10 杭州数鲲科技有限公司 Data processing method and storage medium for linking on-line exhibition and off-line exhibition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Feng Min; Luo Wang; Yu Lei; Hong Gongyi; Peng Qiwei; Zhang Tianbing; Zhang Yunxiang; Cao Lingling: "Research on a Videoconferencing Operation and Maintenance System Based on No-Reference Quality Assessment", Computer Technology and Development, no. 07, 21 June 2016 (2016-06-21) *

Also Published As

Publication number Publication date
CN117640877B (en) 2024-03-29

Similar Documents

Publication Publication Date Title
CN104618803B (en) Information-pushing method, device, terminal and server
CN108322788B (en) Advertisement display method and device in live video
KR101759453B1 (en) Automated image cropping and sharing
CN105684038B (en) Image cache for replacing portions of images
CN110189378A (en) Video processing method, device and electronic equipment
CN109756746A (en) Video reviewing method, device, server and storage medium
GB2544885A (en) Communication system and method
CN110288534B (en) Image processing method, device, electronic equipment and storage medium
CN111985281B (en) Image generation model generation method and device and image generation method and device
CN110866563B (en) Similar video detection and recommendation method, electronic device and storage medium
US8983188B1 (en) Edge-aware smoothing in images
CN110321845A (en) Method, apparatus and electronic equipment for extracting emoticons from video
WO2014186213A2 (en) Providing visual effects for images
CN110807759A (en) Method and device for evaluating photo quality, electronic equipment and readable storage medium
CN112383830A (en) Video cover determining method and device and storage medium
CN110489659A (en) Data matching method and device
CN110234015A (en) Live-broadcast control method, device, storage medium, terminal
CN112036209A (en) Portrait photo processing method and terminal
CN110266955B (en) Image processing method, image processing apparatus, electronic device, and storage medium
CN114845158A (en) Video cover generation method, video publishing method and related equipment
CN116308530A (en) Advertisement implantation method, advertisement implantation device, advertisement implantation equipment and readable storage medium
CN113573044B (en) Video data processing method and device, computer equipment and readable storage medium
US20110058057A1 (en) Image capture device and method, image processing device and method, and program
CN108985244B (en) Television program type identification method and device
CN117640877B (en) Picture reconstruction method for online conference and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant